Open Thoughts

November 2010 archive

Free your code

November 26, 2010

"Not sharing your code basically adds an additional burden to others who may try to review and validate your work," as John Locke was quoted in a recent article in the Communications of the ACM. Of course, there is a flip side to this in our competitive academic environment. As Scott A. Hissam puts it, "... The academic community earns needed credentialing by producing original publications. Do you give up the software code immediately? Or do you wait until you've had a sufficient number of publications? If so, who determines what a sufficient number is?"

In a data-driven computational field like machine learning, many of our results depend on some sort of calculation. Yes, in principle, many methods could be implemented from scratch based on a set of equations, but in practice, most people do not have the time (or the capability) to code up all prior art from scratch. In some sense, good code (like a good waiter/waitress) remains in the background. My favourite example is the linear algebra software that is common in many programming environments. Most people don't even think about the numerical complexities of finding eigenvalues, since there is a "built-in" function for it. This would not have been possible without the BLAS and LAPACK open source projects. So, write code, and make it open source.

"But I don't write good code..."

Nick Barnes from the Climate Code Foundation argues that you should release it anyway. Nick, in a recent opinion piece, and other famous people, in a Nature News article, give many reasons why code should be open. In his blog piece, he gives more points. Among them:

  • publication on its own is not enough
  • software skills are important and must be funded
  • open development is important
  • the longest program starts with a single line of code