Reviewing software
Posted by Cheng Soon Ong on October 13, 2008
The review process for the current NIPS workshop mloss08 is now underway. Two interesting issues came up while discussing this process with Soeren and Mikio, as well as with some of the program committee:
- Who should review a project?
- What are the review criteria?
Reviewer Choice
Unlike for a standard machine learning paper, a reviewer for a mloss project has to be comfortable with three different aspects of the system, namely:
- The machine learning problem (e.g. graphical models, kernel methods, or reinforcement learning)
- The programming language, or at least the paradigm (e.g. object-oriented programming)
- The operating environment (which may be a particular species of make on a version of Linux)
There are also projects about a particular application area of machine learning, such as brain-computer interfaces, which place an additional requirement on the understanding of the reviewer.
However, if one looks at the set of people who satisfy all those criteria for a particular project, one usually ends up with only a handful of potential researchers, most of whom would have a conflict of interest with the submitted project. So, often I would choose a reviewer who is an expert in one of the three areas and hope that he or she would be able to figure out the rest. Is there a better solution?
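To make the matching problem concrete, here is a minimal sketch (not part of the actual mloss08 process; the candidate names, expertise sets, and conflict flags are all invented) of ranking non-conflicted reviewers by how many of the three aspects they cover:

```python
# Hypothetical reviewer-matching sketch: in practice the pool is small and
# the best available match usually covers only some of the required aspects.

project_needs = {"kernel methods", "python", "linux"}

candidates = {
    "reviewer_a": {"expertise": {"kernel methods", "python", "linux"}, "conflict": True},
    "reviewer_b": {"expertise": {"kernel methods", "c++"}, "conflict": False},
    "reviewer_c": {"expertise": {"python", "linux"}, "conflict": False},
}

def rank_reviewers(needs, pool):
    """Order non-conflicted candidates by how many required aspects they cover."""
    eligible = [(name, len(needs & info["expertise"]))
                for name, info in pool.items() if not info["conflict"]]
    return sorted(eligible, key=lambda item: item[1], reverse=True)

print(rank_reviewers(project_needs, candidates))
# [('reviewer_c', 2), ('reviewer_b', 1)]
# The only candidate covering all three aspects is conflicted, so the
# top-ranked reviewer is an expert in just some of them -- exactly the
# compromise described above.
```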
Review Criteria
The JMLR review criteria are:
- The quality of the four page description.
- The novelty and breadth of the contribution.
- The clarity of design.
- The freedom of the code (lack of dependence on proprietary software).
- The breadth of platforms it can be used on (should include an open-source operating system).
- The quality of the user documentation (should enable new users to quickly apply the software to other problems, including a tutorial and several non-trivial examples of how the software can be used).
- The quality of the developer documentation (should enable easy modification and extension of the software, provide an API reference, provide unit testing routines).
- The quality of the comparison to related implementations (if any), with respect to run time, memory requirements, and features, in order to demonstrate that significant progress has been made.
This year's workshop has the theme of interoperability and cooperation, so that is also a review criterion. The important question is how to weight the different aspects, and the answer is not at all clear. There is a basic level of adherence that is necessary for each of the criteria, above which it is difficult to trade off the different aspects quantitatively. For example, does very good user documentation excuse very poor code design? Does being able to run on many different operating systems excuse very poor run-time memory and computational performance?
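As a toy illustration of this difficulty (not an actual mloss or JMLR scoring rule; the weights and scores below are invented), compare a purely quantitative weighted sum with a "basic level of adherence" threshold:

```python
# Invented weights and scores, purely for illustration.
criteria_weights = {"user_docs": 0.3, "code_design": 0.4, "platforms": 0.3}
submission = {"user_docs": 9, "code_design": 2, "platforms": 8}  # scores out of 10

# Linear weighting: excellent documentation can mask very poor design.
weighted_score = sum(criteria_weights[c] * submission[c] for c in criteria_weights)
print(weighted_score)  # roughly 5.9 -- looks acceptable despite code_design = 2

# Threshold rule: require a basic level of adherence on every criterion first.
minimum_acceptable = 4
passes_thresholds = all(score >= minimum_acceptable for score in submission.values())
print(passes_thresholds)  # False -- rejected before any weighting is applied
```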
Put your comments below or come to this year's workshop and discuss this!