November 2009 archive
Matlab(tm) 7.3 file format is actually hdf5 and can be read from other languages like python
November 19, 2009
It looks like MATLAB version 7.3 and later can write objects in the so-called MATLAB 7.3 file format. While at first glance it looks like another proprietary format, it seems in fact to be the Hierarchical Data Format version 5, or HDF5 for short.
So you can do all sorts of neat things:
- Let's create a matrix in MATLAB first and save it:

    >> x=[[1,2,3];[4,5,6];[7,8,9]]

    x =

         1     2     3
         4     5     6
         7     8     9

    >> save -v7.3 x.mat x
- Let's investigate that file from the shell:

    $ h5ls x.mat
    x                        Dataset {3, 3}
    $ h5dump x.mat
    HDF5 "x.mat" {
    GROUP "/" {
       DATASET "x" {
          DATATYPE  H5T_IEEE_F64LE
          DATASPACE  SIMPLE { ( 3, 3 ) / ( 3, 3 ) }
          DATA {
          (0,0): 1, 4, 7,
          (1,0): 2, 5, 8,
          (2,0): 3, 6, 9
          }
          ATTRIBUTE "MATLAB_class" {
             DATATYPE  H5T_STRING {
                STRSIZE 6;
                STRPAD H5T_STR_NULLTERM;
                CSET H5T_CSET_ASCII;
                CTYPE H5T_C_S1;
             }
             DATASPACE  SCALAR
             DATA {
             (0): "double"
             }
          }
       }
    }
    }
- And load it from Python:

    >>> import h5py
    >>> import numpy
    >>> f = h5py.File('x.mat', 'r')
    >>> x = f["x"]
    >>> x
    <HDF5 dataset "x": shape (3, 3), type "<f8">
    >>> numpy.array(x)
    array([[ 1., 4., 7.],
           [ 2., 5., 8.],
           [ 3., 6., 9.]])
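So it actually seems to be a good idea to use MATLAB's 7.3 format for interoperability. Note that the array arrives transposed in Python: MATLAB stores matrices in column-major order, whereas HDF5 and NumPy use row-major order.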
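To get the matrix back in the orientation MATLAB displayed, the array read by h5py has to be transposed. The following is a minimal sketch of that step, not a complete loader. The helper names `transpose` and `load_mat73_variable` are made up for illustration, and the loader assumes that h5py and numpy are installed and that the variable is a plain numeric matrix (structs, cell arrays and strings are stored differently).

```python
def transpose(rows):
    """Swap rows and columns of a list-of-lists matrix."""
    return [list(column) for column in zip(*rows)]


def load_mat73_variable(path, name):
    """Read one numeric variable from a MATLAB 7.3 file in MATLAB's orientation.

    Hypothetical helper: assumes h5py and numpy are installed and that
    `name` refers to a plain numeric matrix.
    """
    import h5py
    import numpy

    with h5py.File(path, "r") as f:
        # Copy the data out before the file is closed, then transpose
        # to undo the row/column flip.
        return numpy.array(f[name]).T


# The values exactly as h5py returned them in the session above.
as_read = [[1.0, 4.0, 7.0], [2.0, 5.0, 8.0], [3.0, 6.0, 9.0]]
as_in_matlab = transpose(as_read)
print(as_in_matlab)
```

With the file from the session above, `load_mat73_variable('x.mat', 'x')` would be the equivalent call for a real file. The pure-Python `transpose` is only there to make the flip visible without any dependencies.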
How many NIPS papers have source code?
November 12, 2009
With NIPS coming up next month, I'm curious how many of the authors will distribute source code corresponding to their NIPS papers. Since the 2009 papers are not yet available, I wrote a small Python script to count the papers in the 2008 batch that contain http or ftp links. The result: the script reported 5 papers.
- NIPS2008_1027.pdf
- NIPS2008_0552.pdf
- NIPS2008_0117.pdf
- NIPS2008_0604.pdf
- NIPS2008_0401.pdf
The search was pretty basic, so I probably detected some false positives and missed others. Here's the Python script if you want to refine the search: mloss_detect.py. I obtained the papers from the electronic proceedings. Warning: this is a 130 MB download.
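For readers who want to try something similar without downloading the script, here is a rough sketch of the kind of search described above. It is not mloss_detect.py. The function names and the regular expression are assumptions. It also assumes that the text of each PDF has already been extracted by some other tool, since this sketch does not parse PDFs itself.

```python
import re

# Matches http://, https:// and ftp:// links. Deliberately simple, so it
# will produce false positives and misses.
LINK_RE = re.compile(r"\b(?:https?|ftp)://[^\s<>()\[\]]+", re.IGNORECASE)


def find_links(text):
    """Return all http/https/ftp links found in a block of text."""
    # Strip punctuation that commonly trails a URL at the end of a sentence.
    return [match.rstrip(".,;:") for match in LINK_RE.findall(text)]


def papers_with_links(texts):
    """Map paper name -> links, keeping only papers with at least one link.

    `texts` maps a file name to its already-extracted plain text.
    """
    hits = {}
    for name, text in sorted(texts.items()):
        links = find_links(text)
        if links:
            hits[name] = links
    return hits


# Made-up sample input, standing in for extracted paper text.
sample = {
    "a.pdf": "Code is at http://example.org/code.",
    "b.pdf": "No links here.",
    "c.pdf": "Data: ftp://ftp.example.org/data.tar.gz",
}
hits = papers_with_links(sample)
print(len(hits), "papers contain links")
```

A link in a paper does not necessarily point to source code, so each hit still needs a manual check.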
Online Petition
November 11, 2009
There is an ongoing petition that is trying to persuade the German parliament (Bundestag) to make all research publications that result from public funding freely available.
A press release from the Coalition for Action "Copyright for Education and Research" is available here. Unfortunately, it is in German.
Consider signing the online petition Wissenschaft und Forschung - Kostenloser Erwerb wissenschaftlicher Publikationen at the German parliament website. Note that the deadline for signatures is December 22, 2009.
MLOSS progress updates for November 2009
November 11, 2009
As of today mloss.org has
- 211 software projects with 357 revisions based on
- 23 programming languages,
- 370 authors (including software co-authors),
- 365 registered users,
- 572 comments (including spam :),
- 109 forum posts,
- 51 blog entries,
- 67 software ratings,
- 90839 software statistics objects,
- 143 software subscriptions or bookmarks.
And happy birthday mloss.org: the site has been live for 2 years and 1.5 months now and is steadily visited by 1200 users per week (November 2009).
And congratulations to Peter Gehler, author of the most successful software project: MPIKmeans (accessed more than 11837 times).
Finally, 10 papers have been accepted in JMLR-MLOSS since its announcement in summer 2007.
Yes, visible progress. Nevertheless, does anyone have suggestions on how we should or could improve mloss.org (or even want to help out)? I guess we should have another workshop next year? Maybe this time not at NIPS but at ICML?
The one thing I would like to see is blog contributions from you. Whenever you stumble across something related to open source and machine learning, write any of us an email and we will put your post on this blog.
We are waiting for your ideas: either talk to us at any of the conferences we are attending or leave a comment!