<?xml version="1.0" encoding="utf-8"?>
<rss version="2.0" xmlns:wfw="http://wellformedweb.org/CommentAPI/"><channel><title>The mloss.org community blog</title><link>http://mloss.org/community</link><description>Some thoughts about machine learning open source software</description><language>en</language><lastBuildDate>Mon, 22 Dec 2008 17:02:57 -0000</lastBuildDate><item><title>Which programming language on mloss.org?</title><link>http://mloss.org/community/blog/2008/dec/22/which-programming-language/</link><description>&lt;p&gt;The winner is surprisingly (for me at least) C++. And here we all thought that machine learners could only program in matlab. Speaking of matlab, why are there so many projects with matlab sources, but so few with octave?
&lt;/p&gt;
&lt;p&gt;The runner up, R, is mostly automatically obtained from the CRAN servers, so assuming that this is somehow at steady state and assuming that all languages have equal numbers of supporters, we should expect that all the major languages have more or less 40 projects each. Rounding out the top are C, Python and Java.
&lt;/p&gt;
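&lt;p&gt;As a rough illustration, the per-language counts listed further down in this post can be tallied and ranked with a few lines of Python. This is only a sketch: the variable names are made up for the example, and the total counts language tags, which may differ from the number of projects if a project lists several languages.
&lt;/p&gt;
&lt;pre&gt;
```python
# Per-language counts copied from the list in this post (December 2008).
counts = {
    "C": 37, "C++": 49, "D": 1, "Erlang": 1, "Java": 22, "lisp": 2,
    "Lua": 1, "Matlab": 37, "Octave": 4, "Perl": 4, "Python": 34,
    "R": 40, "Ruby": 2,
}

# Sort by count, largest first; languages with equal counts keep
# the order in which they appear in the list above.
ranking = sorted(counts.items(), key=lambda item: item[1], reverse=True)

# Sum of all language tags (not necessarily the number of projects).
total = sum(counts.values())

for language, n in ranking:
    share = 100.0 * n / total
    print(f"{language:8s} {n:3d} {share:5.1f}%")
```
&lt;/pre&gt;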
&lt;p&gt;Here are the numbers for the site:
&lt;/p&gt;
&lt;ul&gt;
 &lt;li&gt;
     C:37
 &lt;/li&gt;

 &lt;li&gt;
     C++:49
 &lt;/li&gt;

 &lt;li&gt;
     D:1
 &lt;/li&gt;

 &lt;li&gt;
     Erlang:1
 &lt;/li&gt;

 &lt;li&gt;
     Java:22
 &lt;/li&gt;

 &lt;li&gt;
     lisp:2
 &lt;/li&gt;

 &lt;li&gt;
     Lua:1
 &lt;/li&gt;

 &lt;li&gt;
     Matlab:37
 &lt;/li&gt;

 &lt;li&gt;
     Octave:4
 &lt;/li&gt;

 &lt;li&gt;
     Perl:4
 &lt;/li&gt;

 &lt;li&gt;
     Python:34
 &lt;/li&gt;

 &lt;li&gt;
     R:40
 &lt;/li&gt;

 &lt;li&gt;
     Ruby:2
 &lt;/li&gt;
&lt;/ul&gt;</description><dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">Cheng Soon Ong</dc:creator><pubDate>Mon, 22 Dec 2008 17:02:57 -0000</pubDate><guid>http://mloss.org/community/blog/2008/dec/22/which-programming-language/</guid></item><item><title>MLOSS Workshop Videos online</title><link>http://mloss.org/community/blog/2008/dec/22/mloss-workshop-videos-online/</link><description>&lt;p&gt;The videos for our MLOSS workshop at Whistler are online at &lt;a href="http://videolectures.net/mloss08_whistler/"&gt;videolectures.net&lt;/a&gt;. Also, the discussions were recorded on video. At around 19:50 in the Video on Reproducibility, you can see me trying very hard to keep focused while not staring at the camera.
&lt;/p&gt;
&lt;p&gt;Happy Holidays, everyone!
&lt;/p&gt;</description><dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">Mikio Braun</dc:creator><pubDate>Mon, 22 Dec 2008 15:26:30 -0000</pubDate><guid>http://mloss.org/community/blog/2008/dec/22/mloss-workshop-videos-online/</guid></item><item><title>On our NIPS workshop</title><link>http://mloss.org/community/blog/2008/dec/19/on-our-nips-workshop/</link><description>&lt;p&gt;On December 12, our third workshop on machine learning open source
   software was held in Whistler, BC, Canada. It featured two invited
   speakers, a host of new and exciting software projects on machine
   learning, and two interesting discussions where we tried to initiate
   new developments.
&lt;/p&gt;
&lt;p&gt;We were very glad to get two speakers from very prestigious and
   successful projects: John W. Eaton from &lt;a href="http://www.octave.org/"&gt;octave&lt;/a&gt;, a matlab clone,
   and John D. Hunter from &lt;a href="http://matplotlib.sourceforge.net/"&gt;matplotlib&lt;/a&gt;, a plotting library for
   python modeled on matlab's plotting commands.
&lt;/p&gt;
&lt;p&gt;John W. Eaton gave valuable insights into his experiences in running
   an open source project. Octave started in 1989 as companion software to a
   book on chemical reactions, with the main intention of giving students
   something more accessible than Fortran. Only afterwards did people
   realize that octave was very close to matlab, and over the years
   they requested better and better compatibility with matlab. The last
   major release, version 3, has brought even fuller
   compatibility with the support of sparse matrices, and a complete
   overhaul of the plotting functionalities. Still, octave is searching
   for help, in particular in the areas of documentation, mailing list
   maintenance, and packaging. So if you're interested, drop John a line.
&lt;/p&gt;
&lt;p&gt;Matplotlib by John D. Hunter also started as a private project. John
   worked on epilepsy research in neurophysiology and initially wrote
   what would become matplotlib to display brain waves together with
   related data. At some point, matplotlib became so big
   that it practically required all of his time. By now, John is working
   in the finance industry but has an agreement with his employer to work
   on matplotlib a certain fraction of his time. We have also learned
   that matplotlib contains a full re-implementation of the TeX
   algorithms by Donald Knuth for rendering annotations in the plot.
&lt;/p&gt;
&lt;p&gt;Both speakers stressed the importance of being resilient and pointed
   out that they both had to go through some time (might even be years)
   before a project really takes off. Both also shared their insights on
   how difficult it can be to deal with users. On the one hand, you have
   to be reliable to build up trust in your project; on the other hand,
   there are always some users who expect full support basically for
   free and are unwilling to contribute.
&lt;/p&gt;
&lt;p&gt;Besides those two invited talks, we again had a number of interesting
   projects. The submissions this year could roughly be classified into
   full frameworks, projects which focus on a special type of application
   or algorithm, and infrastructure.
&lt;/p&gt;
&lt;p&gt;We had four different projects which are providing a full-blown
   environment for doing machine learning and statistical data analysis.
   The first talk was on &lt;a href="http://mloss.org/software/view/128/"&gt;Torch&lt;/a&gt;, a full-blown matlab replacement
   written in a combination of &lt;a href="http://www.lua.org/"&gt;Lua&lt;/a&gt; and C++. Torch is optimized for
   efficiency and large scale learning and comes with its own matrix
   classes (called tensors), and plotting routines. &lt;a href="http://mloss.org/software/view/70/"&gt;Shark&lt;/a&gt; is a
   similarly feature rich framework written in C++. For users of R, there
   is kernlab, which focuses on kernel methods. Finally, python was
   represented by &lt;a href="http://mloss.org/software/view/66/"&gt;mlpy&lt;/a&gt;, and &lt;a href="http://mloss.org/software/view/60/"&gt;mdp&lt;/a&gt;, which sported an
   innovative module architecture which allows users to plug together data
   processing modules. It was very interesting to see that there exist so
   many different projects which have such a broad scope. It was also
   quite interesting to learn that these projects weren't so much aware
   of one another.
&lt;/p&gt;
&lt;p&gt;Projects which were more focused on a smaller scale included
   &lt;a href="http://mloss.org/software/view/49/"&gt;Nieme&lt;/a&gt;, which contains algorithms for energy-based learning,
   &lt;a href="http://mloss.org/software/view/77/"&gt;libDAI&lt;/a&gt;, a library of inference algorithms for graphical models
   with discrete state spaces, and &lt;a href="http://mloss.org/software/view/137/"&gt;Model Monitor&lt;/a&gt;, a tool for
   assessing the amount of distribution shift in the data and sensitivity
   of algorithms under distribution shift. The &lt;a href="http://mloss.org/software/view/145/"&gt;BCPy&lt;/a&gt; project in turn
   provides a python layer over the &lt;a href="http://www.bci2000.org/"&gt;BCI2000&lt;/a&gt; system and allows one to
   work with the latter in a much more flexible manner.
&lt;/p&gt;
&lt;p&gt;Finally, we had projects which dealt with different aspects of
   infrastructure. The &lt;a href="http://mloss.org/software/view/151/"&gt;RL Glue&lt;/a&gt; project provides a general
   framework to connect environments and learners in a reinforcement
   learning framework. This project has been highly successful, and is
   the standard platform for a number of challenges in this
   area. &lt;a href="http://mloss.org/software/view/140/"&gt;Disco&lt;/a&gt; implements the map-reduce framework for distributed
   clustering in a particularly elegant manner for python users, based on
   a core written in &lt;a href="http://www.erlang.org/"&gt;Erlang&lt;/a&gt;. The &lt;a href="http://mloss.org/software/view/154/"&gt;Experiment Databases for
Machine learning&lt;/a&gt; and &lt;a href="http://mloss.org/software/view/146/"&gt;BenchMarking Via Weka&lt;/a&gt; projects
   address the issue of benchmarking machine learning algorithms in an
   automatic and reproducible way and providing a database to describe
   models and experimental results.
&lt;/p&gt;
&lt;p&gt;In summary, it seems that researchers are quite active in providing
   feature-rich high-quality open source software on machine
   learning. The large number of 23 submissions to this workshop also
   provides evidence for that. At the same time, it seems that most
   projects are still oblivious of each other. In particular, when it
   comes to interoperability, it seems that there is still a lot missing,
   making it hard to combine algorithms written in different languages,
   or code developed with respect to different frameworks.
&lt;/p&gt;
&lt;p&gt;Therefore, one of the discussions was focused on the question of
   interoperability. As a starting point, we proposed the ARFF file
   format as a common file format for exchanging data. Such a file format
   could serve as an important first step. Leaving more complex solutions
   like remote method invocation or CORBA aside, a common data format is
   really the simplest way to exchange data between two pieces of code
   which might be written in different languages or run on different
   platforms. As we expected, the discussion was quite lively, as the
   number of possible data formats is large, and the different features
   you could want are not always compatible. But I think what we achieved
   was to raise awareness for the need of interoperability. Hopefully,
   people will start to think about how their code could interact with
   other code, and standards will emerge over time.
&lt;/p&gt;
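&lt;p&gt;To make the idea of a common data format concrete, here is a minimal sketch of writing and reading a small ARFF file in plain Python. The helper names write_arff and read_arff are made up for this example, and the code only handles a simple subset of ARFF (numeric and nominal attributes, dense rows, no quoting, no missing values), so it should be read as an illustration rather than a full implementation of the format.
&lt;/p&gt;
&lt;pre&gt;
```python
import csv


def write_arff(relation, attributes, rows):
    """Return ARFF text for a relation.

    attributes is a list of (name, type) pairs, where type is either
    the string "numeric" or a list of nominal values.
    """
    lines = [f"@relation {relation}", ""]
    for name, kind in attributes:
        if isinstance(kind, list):
            # Nominal attributes are written as {value1,value2,...}.
            kind = "{" + ",".join(kind) + "}"
        lines.append(f"@attribute {name} {kind}")
    lines += ["", "@data"]
    for row in rows:
        lines.append(",".join(str(value) for value in row))
    return "\n".join(lines) + "\n"


def read_arff(text):
    """Parse the simple subset produced by write_arff.

    Returns (attributes, rows); all values are kept as strings.
    """
    attributes, rows, in_data = [], [], False
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("%"):  # blank line or comment
            continue
        lowered = line.lower()
        if in_data:
            rows.append(next(csv.reader([line])))
        elif lowered.startswith("@attribute"):
            _, name, kind = line.split(None, 2)
            attributes.append((name, kind))
        elif lowered.startswith("@data"):
            in_data = True
    return attributes, rows


# Round trip on a toy data set (the values are invented for the example).
text = write_arff(
    "toy",
    [("x1", "numeric"), ("x2", "numeric"), ("label", ["pos", "neg"])],
    [(0.5, 1.2, "pos"), (2.0, 0.1, "neg")],
)
attributes, rows = read_arff(text)
```
&lt;/pre&gt;
&lt;p&gt;Because the file is plain text, the same data could be read from any other language with a similarly small parser, which is the main appeal of agreeing on a file format first.
&lt;/p&gt;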
&lt;p&gt;The other discussion addressed an even more difficult question, namely
   that of reproducibility. How can we ensure that somebody else can
   reproduce the experimental results from a machine learning paper? An
   interesting suggestion was to require that the software producing the
   results is provided on a bootable live CD like an Ubuntu install CD to
   really make sure that the environment in which the experiments were
   done can be set up easily. The question was also whether you want to
   be able to reproduce the results at publication time, or even after
   ten years. Again, there is the problem of how to describe and store
   results in a database. Here too, we did not arrive at a conclusion,
   but hopefully we raised overall awareness.
&lt;/p&gt;
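&lt;p&gt;One possible starting point for describing and storing results, sketched here with Python's standard sqlite3 module, is to record each run together with a short description of the environment it ran in. The table layout, the function names, and the example values below are all assumptions made for illustration; nothing like this was agreed on at the workshop.
&lt;/p&gt;
&lt;pre&gt;
```python
import json
import platform
import sqlite3
import sys


def describe_environment():
    """Collect a few facts about the machine and the interpreter."""
    return {
        "python": sys.version.split()[0],
        "system": platform.system(),
        "machine": platform.machine(),
    }


def store_result(connection, experiment, params, metrics):
    """Insert one experimental run as a row of JSON-encoded fields."""
    connection.execute(
        "CREATE TABLE IF NOT EXISTS runs ("
        "experiment TEXT, environment TEXT, params TEXT, metrics TEXT)"
    )
    connection.execute(
        "INSERT INTO runs VALUES (?, ?, ?, ?)",
        (
            experiment,
            json.dumps(describe_environment(), sort_keys=True),
            json.dumps(params, sort_keys=True),
            json.dumps(metrics, sort_keys=True),
        ),
    )
    connection.commit()


# In-memory database for the example; pass a file path to persist results.
connection = sqlite3.connect(":memory:")

# The experiment name, parameters and accuracy are invented placeholders.
store_result(
    connection,
    "svm-toy",
    {"C": 1.0, "kernel": "rbf"},
    {"accuracy": 0.91},
)
rows = connection.execute("SELECT experiment, params FROM runs").fetchall()
```
&lt;/pre&gt;
&lt;p&gt;A record like this does not by itself make an experiment reproducible, but it keeps the parameters and the environment description next to the numbers they produced.
&lt;/p&gt;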
&lt;p&gt;Overall, I think the workshop was very successful and
   interesting. Room for improvement is always there, of course. For
   example, we should make sure not to forget to schedule coffee breaks
   next time. Also, I think we should put more emphasis on the community
   building aspect and less on individual projects. In 2006, the topic
   was so new that people didn't know what kinds of projects were out
   there, but now, also due to this website, the existence of open source
   software for machine learning is much better known. So giving projects a
   platform to advertise their software is certainly an important part,
   but thinking about what the next step is and talking about how to
   integrate what we already have is something I would put more emphasis
   on next time.
&lt;/p&gt;
&lt;p&gt;Again I (and Soeren and Cheng as well) would like to thank everybody
   who contributed to this workshop, and of course also the &lt;a href="http://pascallin2.ecs.soton.ac.uk/"&gt;Pascal2&lt;/a&gt;
   framework for their financial support. 
&lt;/p&gt;</description><dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">Mikio Braun</dc:creator><pubDate>Fri, 19 Dec 2008 17:35:31 -0000</pubDate><guid>http://mloss.org/community/blog/2008/dec/19/on-our-nips-workshop/</guid></item><item><title>Help needed</title><link>http://mloss.org/community/blog/2008/dec/08/help-needed/</link><description>&lt;p&gt;NIPS is a lot like Christmas. You travel to some place once a year, meet people (some you like, others you don't), and eat a lot.
&lt;/p&gt;
&lt;p&gt;To the point of this entry. If there are any budding Django programmers out there who would like to help out with the development of mloss.org, please come and talk to us. We will have a t-shirt handing out table again at NIPS, so please come by and have a chat with me, Soeren or Mikio.
&lt;/p&gt;</description><dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">Cheng Soon Ong</dc:creator><pubDate>Mon, 08 Dec 2008 18:27:34 -0000</pubDate><guid>http://mloss.org/community/blog/2008/dec/08/help-needed/</guid></item><item><title>MLOSS progress updates for November 2008</title><link>http://mloss.org/community/blog/2008/nov/27/mloss-progress-updates-for-november-2008/</link><description>&lt;p&gt;Two months have passed since the last statistics update, so let's see if we are progressing:
&lt;/p&gt;
&lt;p&gt;As of today &lt;a href="http://mloss.org"&gt;mloss.org&lt;/a&gt; has 
&lt;/p&gt;
&lt;ul&gt;
 &lt;li&gt;
     158 software projects based on
 &lt;/li&gt;

 &lt;li&gt;
     19 programming languages,
 &lt;/li&gt;

 &lt;li&gt;
     302 authors (including software co-authors),
 &lt;/li&gt;

 &lt;li&gt;
     284 registered users,
 &lt;/li&gt;

 &lt;li&gt;
     63 comments (including spam :),
 &lt;/li&gt;

 &lt;li&gt;
     109 forum posts,
 &lt;/li&gt;

 &lt;li&gt;
     28 blog entries,
 &lt;/li&gt;

 &lt;li&gt;
     51 software ratings,
 &lt;/li&gt;

 &lt;li&gt;
     31525 software statistics objects,
 &lt;/li&gt;

 &lt;li&gt;
     143 software subscriptions or bookmarks.
 &lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;And happy birthday mloss.org - the site has been live for 1 year and 1.5 months now, and since we recently became a target of spammers, it might show that mloss.org is not that unimportant anymore. This is also documented by traffic growth from around 300 visits per week (February 2008) to more than 1000 per week (November 2008).
&lt;/p&gt;
&lt;p&gt;And congratulations to &lt;a href="http://www.kyb.mpg.de/~pgehler"&gt;Peter Gehler&lt;/a&gt;, author of the most successful software project: &lt;a href="http://mloss.org/software/view/48/"&gt;MPIKmeans&lt;/a&gt; (accessed more than 6000 times).
&lt;/p&gt;
&lt;p&gt;Finally &lt;a href="http://jmlr.csail.mit.edu/mloss/"&gt;JMLR-MLOSS&lt;/a&gt; received
&lt;/p&gt;
&lt;ul&gt;
 &lt;li&gt;
     20 submissions until now,
 &lt;/li&gt;

 &lt;li&gt;
     5 resubmissions;
 &lt;/li&gt;

 &lt;li&gt;
     3 are already accepted and published,
 &lt;/li&gt;

 &lt;li&gt;
     1 is pending publication
 &lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;since its announcement in summer 2007.
&lt;/p&gt;
&lt;p&gt;One may conclude that there is visible progress. However, as already pointed out in several previous blog posts, we merely see several isolated mloss projects that don't inter-operate with each other at all. It is clear that this trend needs to be stopped, but how could we support the next steps? In case you have some bright ideas, either talk to us at NIPS*08 (possibly even attend the &lt;a href="http://mloss.org/workshop/nips08/"&gt;workshop&lt;/a&gt; and present your ideas in the discussion) or leave a comment...
&lt;/p&gt;</description><dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">Soeren Sonnenburg</dc:creator><pubDate>Thu, 27 Nov 2008 16:24:17 -0000</pubDate><guid>http://mloss.org/community/blog/2008/nov/27/mloss-progress-updates-for-november-2008/</guid></item><item><title>Open Source is not Interoperability</title><link>http://mloss.org/community/blog/2008/nov/26/open-source-is-not-interoperability/</link><description>&lt;p&gt;Trying to prepare some thoughts about interoperability to be discussed at the &lt;a href="http://mloss.org/workshop/nips08"&gt;NIPS workshop&lt;/a&gt;, I came across a bunch of websites roughly in the following order:
&lt;/p&gt;
&lt;ul&gt;
 &lt;li&gt;&lt;p&gt;a rather negative article about the state of open source software and how they interoperate &lt;a href="http://www.eweek.com/c/a/Linux-and-Open-Source/Interoperability-Still-Stumbling-Block-for-Open-Source-in-2008/"&gt;at eweek.com&lt;/a&gt;.
&lt;/p&gt;

 &lt;/li&gt;

 &lt;li&gt;&lt;p&gt;a very positive blog at &lt;a href="http://www.ugotrade.com/2008/09/09/open-source-and-interoperability-will-take-virtual-worlds-mainstream/"&gt;ugotrade&lt;/a&gt; which talks about &lt;a href="http://opensimulator.org/wiki/Main_Page"&gt;OpenSim&lt;/a&gt; and how it will be the next hot thing.
&lt;/p&gt;

 &lt;/li&gt;

 &lt;li&gt;&lt;p&gt;a &lt;a href="http://blogs.msdn.com/dotnetinterop/archive/2008/06/18/open-source-and-interoperability.aspx"&gt;post&lt;/a&gt; about why open source and interoperability are really two different things.
&lt;/p&gt;

 &lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Quoting the third author:
&lt;/p&gt;
&lt;ol&gt;
 &lt;li&gt;
     Interop is not open source.
 &lt;/li&gt;

 &lt;li&gt;
     Interop does not require open source implementations
 &lt;/li&gt;

 &lt;li&gt;
     Open source does not guarantee Interop 
 &lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;While one thinks that somehow it is natural for open source developers to make use of other bits of (open source) software, it usually doesn't happen. For me, interoperability can occur in two ways: the first is having a common set of protocols (as argued for by the third post above), and/or the second is integrating another software library or method. In some sense, the "integration" idea also requires a set of protocols or APIs. It may be that I'm just being pedantic about trying to semantically differentiate between protocols and APIs. But the main idea remains: &lt;strong&gt;We need software that talks to other bits of software&lt;/strong&gt;. 
&lt;/p&gt;
&lt;p&gt;However, if both pieces of software are open source, we can do more than just have software that talks to other bits of software (which is why OpenSim is raising so much interest). In the process of having to push together two software projects, we may be able to come up with &lt;a href="http://en.wikipedia.org/wiki/Software_componentry"&gt;better interfaces&lt;/a&gt; between them. This is especially true in the research area (which in some sense practices &lt;a href="http://www.swc.scipy.org/"&gt;carpentry&lt;/a&gt;) where it is not that clear from the start how programs should interact. For supervised machine learning, datasets are a good place to start. It seems "obvious" that this is one place where different machine learning algorithms can interface with each other. Even in this "simple" interface, there is a multitude of data formats and standards. Another quite fruitful area is in convex optimization, where there are several projects (even here on mloss.org) which easily link to different back ends, or several solvers which are used by various front ends. Interestingly, here the interfaces are actually dictated by the mathematics, and the software implementations are just mirroring these forms. I think it is within our reach to have these kinds of interoperability for many other areas of machine learning.
&lt;/p&gt;
&lt;p&gt;As for the long term goal of software systems being well integrated in the application specific fashion, I think we still have a way to go yet...
&lt;/p&gt;</description><dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">Cheng Soon Ong</dc:creator><pubDate>Wed, 26 Nov 2008 14:03:41 -0000</pubDate><guid>http://mloss.org/community/blog/2008/nov/26/open-source-is-not-interoperability/</guid></item><item><title>mloss08 Program</title><link>http://mloss.org/community/blog/2008/nov/06/mloss08-program/</link><description>&lt;p&gt;Just in case you haven't checked our &lt;a href="http://mloss.org/workshop/nips08/"&gt;workshop page&lt;/a&gt; recently, we have finalised our program. We had a surprisingly large number of submissions, ranging from quite mature projects to small radical ideas. In the end, we decided that we should try to squeeze in as many projects as possible, and at the same time try to keep some diversity in the program; i.e. we didn't want to have all slots taken up by large mature machine learning frameworks.
&lt;/p&gt;
&lt;p&gt;Our theme this year is "interoperability, interoperability, interoperability". The dream is to have some way for machine learning software to talk to each other. We are still a long way from being able to plug and play different tools for machine learning, and we hope to make a start by discussing this at the workshop. Of course, machine learning research is not only about software, but it is also about the data. Our afternoon discussion session will be about "UCI 2.0", and how we should go about it. There was a recent &lt;a href="http://www.nature.com/ncb/journal/v10/n10/full/ncb1008-1123.html"&gt;editorial&lt;/a&gt; in Nature Cell Biology about the need for standardizing bioinformatics data, and &lt;a href="http://peanutbutter.wordpress.com/2008/10/30/the-triumvirate-of-scientific-data/"&gt;this blog post&lt;/a&gt; highlights three properties of scientific data.
&lt;/p&gt;
&lt;p&gt;Hope to see you at NIPS!
&lt;/p&gt;</description><dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">Cheng Soon Ong</dc:creator><pubDate>Thu, 06 Nov 2008 15:01:13 -0000</pubDate><guid>http://mloss.org/community/blog/2008/nov/06/mloss08-program/</guid></item><item><title>Reviewing software</title><link>http://mloss.org/community/blog/2008/oct/13/reviewing-software/</link><description>&lt;p&gt;The review process for the current NIPS workshop &lt;a href="http://mloss.org/workshop/nips08/"&gt;mloss08&lt;/a&gt; is now underway. There are a couple of interesting thoughts that I had while discussing this process with Soeren and Mikio, as well as some of the program committee. The two issues are:
&lt;/p&gt;
&lt;ul&gt;
 &lt;li&gt;
     Who should review a project?
 &lt;/li&gt;

 &lt;li&gt;
     What are the review criteria?
 &lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;Reviewer Choice&lt;/h2&gt;
&lt;p&gt;Unlike for standard machine learning projects, a reviewer for an mloss project has to be comfortable with three different aspects of the system, namely:
&lt;/p&gt;
&lt;ul&gt;
 &lt;li&gt;
     The machine learning problem (e.g. Graphical models, kernel methods, or reinforcement learning)
 &lt;/li&gt;

 &lt;li&gt;
     The programming language, or at least the paradigm (e.g. object oriented programming)
 &lt;/li&gt;

 &lt;li&gt;
     The operating environment (which may be a particular species of make on a version of Linux)
 &lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;There are also projects about a particular application area of machine learning, such as brain-computer interfaces, which put an additional requirement on the understanding of the reviewer.
&lt;/p&gt;
&lt;p&gt;However, if one looks at the set of people who satisfy all those criteria for a particular project, one usually ends up with only a handful of potential reviewers, most of whom would have a conflict of interest with the submitted project. So, often I would choose a reviewer who is an expert in one of the three areas and hope that he or she would be able to figure out the rest. Is there a better solution?
&lt;/p&gt;

&lt;h2&gt;Review Criteria&lt;/h2&gt;
&lt;p&gt;The &lt;a href="http://jmlr.csail.mit.edu/mloss/mloss-info.html"&gt;JMLR review criteria&lt;/a&gt; are:
&lt;/p&gt;
&lt;ol&gt;
 &lt;li&gt;
     The quality of the four page description.
 &lt;/li&gt;

 &lt;li&gt;
     The novelty and breadth of the contribution.
 &lt;/li&gt;

 &lt;li&gt;
     The clarity of design.
 &lt;/li&gt;

 &lt;li&gt;
     The freedom of the code (lack of dependence on proprietary software).
 &lt;/li&gt;

 &lt;li&gt;
     The breadth of platforms it can be used on (should include an open-source operating system).
 &lt;/li&gt;

 &lt;li&gt;
     The quality of the user documentation (should enable new users to quickly apply the software to other problems, including a tutorial and several non-trivial examples of how the software can be used).
 &lt;/li&gt;

 &lt;li&gt;
     The quality of the developer documentation (should enable easy modification and extension of the software, provide an API reference, provide unit testing routines).
 &lt;/li&gt;

 &lt;li&gt;
     The quality of comparison to previous (if any) related implementations, w.r.t. run-time, memory requirements, features, to explain that significant progress has been made. 
 &lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;This year's workshop has the theme of interoperability and cooperation. Therefore it is also a review criterion. The important question is how to weight the different aspects. The answer is not at all clear. There is a basic level of adherence which is necessary for each of the criteria, above which it is difficult to trade off the different aspects quantitatively. For example, does very good user documentation excuse very poor code design? Does being able to run on many different operating systems excuse very poor run time memory and computational performance?
&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Put your comments below or come to this year's workshop and discuss this!&lt;/strong&gt;
&lt;/p&gt;</description><dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">Cheng Soon Ong</dc:creator><pubDate>Mon, 13 Oct 2008 09:40:59 -0000</pubDate><guid>http://mloss.org/community/blog/2008/oct/13/reviewing-software/</guid></item><item><title>GNU Octave on Free Software Foundations High Priority List</title><link>http://mloss.org/community/blog/2008/oct/06/gnu-octave-on-free-software-foundations-high-prior/</link><description>&lt;p&gt;The &lt;a href="http://www.fsf.org/"&gt;Free Software Foundation (FSF)&lt;/a&gt; maintains a high priority list of software projects, which can be found &lt;a href="http://www.fsf.org/campaigns/priority.html"&gt;here&lt;/a&gt;.
&lt;/p&gt;
&lt;p&gt;Quoting the FSF:
&lt;/p&gt;
&lt;blockquote&gt;&lt;p&gt;The FSF high-priority projects list serves to foster the development of projects that are important
   for   increasing the adoption and use of free software and free software operating systems.
   [...]
   Some of the most important projects on our list are replacement projects. These projects are
   important because they address areas where users are continually being seduced into using
   non-free software by the lack of an adequate free replacement.
&lt;/p&gt;
&lt;/blockquote&gt;&lt;p&gt;At &lt;a href="http://www.fsf.org/campaigns/priority.html#gnuoctave"&gt;rank eight&lt;/a&gt; among the top ten prioritized software projects is &lt;a href="http://www.octave.org"&gt;GNU Octave&lt;/a&gt;, a free software Matlab replacement. 
&lt;/p&gt;
&lt;p&gt;As this is very relevant to our community, which is strongly dominated by &lt;a href="http://www.mathworks.com/products/matlab"&gt;Matlab&lt;/a&gt;, I would like to encourage everyone to try out octave 3.0. If you tried octave 2.x or any earlier version at some point, it has really matured &lt;em&gt;a lot&lt;/em&gt; since then. It supports all the data types you know from matlab, like cell arrays and dense or sparse arrays, and yes, it has all these plotting functions like plot, surf3d etc. too. And if you ever tried to extend matlab using C code, support is really much better from the octave side, not to mention the killer feature: Octave is fully supported by &lt;a href="http://www.swig.org"&gt;swig&lt;/a&gt;! Still not convinced? We will have John W. Eaton to introduce octave to us at the &lt;a href="http://mloss.org/workshop/nips08"&gt;NIPS'08 MLOSS Workshop&lt;/a&gt;. So what are you waiting for? Give octave a try and see &lt;a href="http://www.gnu.org/software/octave/help-wanted.html"&gt;how you can help&lt;/a&gt;!
&lt;/p&gt;</description><dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">Soeren Sonnenburg</dc:creator><pubDate>Mon, 06 Oct 2008 16:46:04 -0000</pubDate><guid>http://mloss.org/community/blog/2008/oct/06/gnu-octave-on-free-software-foundations-high-prior/</guid></item><item><title>Differences between paid and volunteer FOSS contributors</title><link>http://mloss.org/community/blog/2008/oct/03/differences-between-paid-and-volunteer-foss-contri/</link><description>&lt;p&gt;I just stumbled across a very interesting article titled &lt;a href="https://fossbazaar.org/?q=content/differences-between-paid-and-volunteer-foss-contributors"&gt;Differences between paid and volunteer FOSS contributors&lt;/a&gt; that I am going to almost fully quote below. The original article was written by &lt;a href="http://www.cyrius.com/"&gt;Martin Michlmayr&lt;/a&gt; and can be found &lt;a href="https://fossbazaar.org/?q=content/differences-between-paid-and-volunteer-foss-contributors"&gt;here&lt;/a&gt;. Almost full quote follows:
&lt;/p&gt;
&lt;p&gt;There's a lot of debate these days about the impact of the increasing number of paid developers in FOSS communities that started as volunteer efforts and still have significant numbers of volunteers. &lt;a href="http://www.sussex.ac.uk/Users/eb32/"&gt;Evangelia Berdou's&lt;/a&gt; &lt;a href="http://opensource.mit.edu/papers/PhD_Berdou.pdf"&gt;PhD thesis&lt;/a&gt; "Managing the Bazaar: Commercialization and peripheral participation in mature, community-led Free/Open source software projects" contains a wealth of information and insights about this topic.
&lt;/p&gt;
&lt;p&gt;&lt;a href="http://www.sussex.ac.uk/Users/eb32/"&gt;Berdou&lt;/a&gt; conducted interviews with members of the &lt;a href="http://www.gnome.org"&gt;GNOME&lt;/a&gt; and &lt;a href="http://www.kde.org"&gt;KDE&lt;/a&gt; projects. She found that paid developers are often identified with the core developer group which is responsible for key infrastructure and often make a large number of commits. Furthermore, she suggested that the groups may have different priorities: "whereas [paid] developers focus on technical excellence, peripheral contributors are more interested in access and practical use".
&lt;/p&gt;
&lt;p&gt;Based on these interviews, she formulated the following hypotheses which she subsequently analyzed in more detail:
&lt;/p&gt;
&lt;ol&gt;
 &lt;li&gt;
     Paid developers are more likely to contribute to critical parts of the code base.
 &lt;/li&gt;

 &lt;li&gt;
     Paid developers are more likely to maintain critical parts of the code base.
 &lt;/li&gt;

 &lt;li&gt;
     Volunteer contributors are more likely to participate in aspects of the project that are geared towards the end-user.
 &lt;/li&gt;

 &lt;li&gt;
     Programmers and peripheral contributors are not likely to participate equally in major community events.
 &lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;Berdou found all hypotheses to be true for GNOME but only hypothesis two and four were confirmed for KDE.
&lt;/p&gt;
&lt;p&gt;In the case of GNOME, Berdou found that hired developers contribute to the most critical parts of the project, that they maintained most modules in core areas and that they maintained a larger number of modules than volunteers. Two important differences were found in KDE: paid developers attend more conferences and they maintain more modules.
&lt;/p&gt;
&lt;p&gt;Berdou's research contains a number of important insights:
&lt;/p&gt;
&lt;ul&gt;
 &lt;li&gt;
     Corporate contributions are important because paid developers contribute a lot of changes, and they maintain core modules and code.
 &lt;/li&gt;

 &lt;li&gt;
     While it's clear that the involvement of paid contributors is influenced by the strategy of their company, Berdou wonders whether another reason why they often contribute to core code is because they "develop their technical skills and their understanding of the code base to a greater extent than volunteers who usually contribute in their free time". It's therefore important that projects provide good documentation and other help so volunteers can get up to speed quickly.
 &lt;/li&gt;

 &lt;li&gt;
     Since many volunteers cannot afford to attend community events, projects should provide travel funds. This is something I see more and more: for example, Debian funds some developers to attend Debian conference and the Linux Foundation has a grant program to allow developers to attend events.
 &lt;/li&gt;

 &lt;li&gt;
     Paid developers often maintain modules they are not paid to directly contribute to. A reason for this is that they continue to maintain modules in their spare time when their company tells them to work on other parts of the code.
 &lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The rest of the article can be found &lt;a href="https://fossbazaar.org/?q=content/differences-between-paid-and-volunteer-foss-contributors"&gt;here&lt;/a&gt;. 
&lt;/p&gt;</description><dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">Soeren Sonnenburg</dc:creator><pubDate>Fri, 03 Oct 2008 08:53:25 -0000</pubDate><guid>http://mloss.org/community/blog/2008/oct/03/differences-between-paid-and-volunteer-foss-contri/</guid></item></channel></rss>