Matlab(tm) 7.3 file format is actually hdf5 and can be read from other languages like python
Posted by Soeren Sonnenburg on November 19, 2009
It looks like MATLAB version 7.3 and later can write out objects in the so-called MATLAB 7.3 file format. While at first glance it looks like another proprietary format, it in fact seems to be the Hierarchical Data Format version 5, or HDF5 for short.
So you can do all sorts of neat things:
- Let's create a matrix in MATLAB first and save it:
    >> x=[[1,2,3];[4,5,6];[7,8,9]]
    x =
         1     2     3
         4     5     6
         7     8     9
    >> save -v7.3 x.mat x
- Let's investigate that file from the shell:
    $ h5ls x.mat
    x                        Dataset {3, 3}
    $ h5dump x.mat
    HDF5 "x.mat" {
    GROUP "/" {
       DATASET "x" {
          DATATYPE H5T_IEEE_F64LE
          DATASPACE SIMPLE { ( 3, 3 ) / ( 3, 3 ) }
          DATA {
          (0,0): 1, 4, 7,
          (1,0): 2, 5, 8,
          (2,0): 3, 6, 9
          }
          ATTRIBUTE "MATLAB_class" {
             DATATYPE H5T_STRING {
                STRSIZE 6;
                STRPAD H5T_STR_NULLTERM;
                CSET H5T_CSET_ASCII;
                CTYPE H5T_C_S1;
             }
             DATASPACE SCALAR
             DATA {
             (0): "double"
             }
          }
       }
    }
    }
- And load it from Python:
    >>> import h5py
    >>> import numpy
    >>> f = h5py.File('x.mat', 'r')
    >>> x = f["x"]
    >>> x
    <HDF5 dataset "x": shape (3, 3), type "<f8">
    >>> numpy.array(x)
    array([[ 1., 4., 7.],
           [ 2., 5., 8.],
           [ 3., 6., 9.]])
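Note that the array comes back transposed relative to the x typed into MATLAB: MATLAB stores matrices column by column, while HDF5 and numpy read them row by row. A minimal sketch of a loader that undoes this and also picks up the MATLAB_class attribute seen in the h5dump output follows. The helper name load_v73_variable is made up for illustration, and the sketch assumes a plain numeric variable, not cell arrays or structs.

```python
import h5py
import numpy


def load_v73_variable(path, name):
    """Read one numeric variable from a MATLAB v7.3 file in MATLAB orientation.

    Hypothetical helper for illustration; assumes a plain numeric array.
    """
    with h5py.File(path, "r") as f:
        dataset = f[name]
        # MATLAB tags each variable with its class, e.g. "double".
        matlab_class = dataset.attrs.get("MATLAB_class")
        if isinstance(matlab_class, bytes):
            matlab_class = matlab_class.decode("ascii")
        # Copy the data out of the file, then reverse the axes to undo
        # the column-major vs. row-major mismatch.
        values = numpy.array(dataset).T
    return values, matlab_class
```

Calling load_v73_variable("x.mat", "x") on the file saved above should give the matrix in the orientation MATLAB displayed it, together with the class name "double".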
So it seems actually to be a good idea to use matlab's 7.3 format for interoperability.
Comments
- Yaroslav Halchenko (on November 22, 2009, 02:59:50)
although you might run into some other funny cases like: http://groups.google.com/group/h5py/browse_thread/thread/7f22dcac3165e04e :
Since scipy.loadmat puked that I should have used HDF reader, I've apt-get installed python-h5py and gave it a run. Unfortunately error message left me puzzled since TypeReferenceID is defined within h5py and I am ignorant about HDF thus not sure what H5T_REFERENCE reference is and why not to convert it to some Pythonic (or numpy-ic structure). Could it be worked around? ;)
The H5T_REFERENCE type is essentially a pointer inside the HDF5 file. I'm not exactly sure why Matlab is using it, but the bad news is that this type is currently unsupported by h5py. There is initial support in SVN, and I do plan to support it in the next minor version (1.3), but that may be a while as I am trying to finish my thesis. :)
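For readers arriving later: h5py releases made after this exchange can follow object references, so a dataset of H5T_REFERENCE values can be resolved by indexing the open file with each reference. The sketch below assumes such a release. The helper name resolve_references is made up for illustration, and it assumes a non-scalar reference dataset whose references each point at a plain dataset, which will not hold for every MATLAB type.

```python
import h5py
import numpy


def resolve_references(h5file, dataset):
    """Follow each object reference in `dataset` and return the targets as arrays.

    Hypothetical helper; assumes a non-scalar dataset of object references
    whose targets are plain datasets.
    """
    resolved = []
    # Reading a reference dataset yields an object array of h5py.Reference.
    for ref in dataset[()].flat:
        if isinstance(ref, h5py.Reference) and ref:
            # Indexing the file with a reference returns the object it points to.
            resolved.append(numpy.array(h5file[ref]))
        else:
            # Null references (or anything unexpected) are left unresolved.
            resolved.append(None)
    return resolved
```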
- Henrik Staun Poulsen (on March 2, 2012, 13:58:16)
So if I take a file that MatLab wrote (with the 7.3 format), will I then be able to use something like HDFView (from http://www.hdfgroup.org/hdf-java-html/hdfview/) to view the file?
<< So it seems actually to be a good idea to use matlab's 7.3 format for interoperability.
If other programs can read the files too, then Yes. But...
- Soeren Sonnenburg (on March 5, 2012, 16:53:20)
Yes you can use hdfview, h5dump, h5ls etc.
- Henrik Staun Poulsen (on March 6, 2012, 14:52:09)
hi Soeren,
Thank you very much for your clarification. I went back and checked my code, and I found that I had to add v7.3 in order to get a .hdf5 file.
So, yes, it is indeed a brilliant format for interoperability.
Thank you for sharing.
Best regards, Henrik
Unfortunately, v7.3 doesn't seem to support HDF5 compression yet, or at least so it seems. The .mat files stored in the 7.3 format are humongous compared to 7.0.
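One way to check this for a particular file is to ask h5py which compression filter and chunk layout each dataset uses. This is only a sketch: the helper name compression_report is made up for illustration, and it makes no claim about what MATLAB actually writes.

```python
import h5py


def compression_report(path):
    """Map each dataset name in an HDF5 file to (compression filter, chunk shape)."""
    report = {}

    def visit(name, obj):
        # visititems walks groups and datasets; only datasets carry filters.
        if isinstance(obj, h5py.Dataset):
            report[name] = (obj.compression, obj.chunks)
        # Returning None tells visititems to keep walking.

    with h5py.File(path, "r") as f:
        f.visititems(visit)
    return report
```

A dataset reported as (None, None) is stored uncompressed and contiguous, while something like ('gzip', (5, 5)) means it is chunked and deflate-compressed.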