Open Thoughts

Matlab(tm) 7.3 file format is actually hdf5 and can be read from other languages like python

Posted by Soeren Sonnenburg on November 19, 2009

It looks like matlab version 7.3 and later can write out objects in the so-called matlab 7.3 file format. While at first glance it looks like another proprietary format, it seems in fact to be the Hierarchical Data Format version 5, or hdf5 for short.

So you can do all sorts of neat things:

  1. Let's create some matrix in matlab first and save it:

    >> x=[[1,2,3];[4,5,6];[7,8,9]]
    x =
     1     2     3
     4     5     6
     7     8     9
    >> save -v7.3 x.mat x
  2. Let's investigate that file from the shell:

    $ h5ls x.mat 
    x                        Dataset {3, 3}
    $ h5dump x.mat 
    HDF5 "x.mat" {
     GROUP "/" {
      DATASET "x" {
        DATASPACE  SIMPLE { ( 3, 3 ) / ( 3, 3 ) }
        DATA {
        (0,0): 1, 4, 7,
        (1,0): 2, 5, 8,
        (2,0): 3, 6, 9
        ATTRIBUTE "MATLAB_class" {
           DATATYPE  H5T_STRING {
                 STRSIZE 6;
                 STRPAD H5T_STR_NULLTERM;
                 CSET H5T_CSET_ASCII;
                 CTYPE H5T_C_S1;
           DATA {
           (0): "double"
  3. And load it from python:

    >>> import h5py
    >>> import numpy
    >>> f = h5py.File('x.mat', 'r')
    >>> x=f["x"]
    >>> x
    <HDF5 dataset "x": shape (3, 3), type "<f8">
    >>> numpy.array(x)
    array([[ 1.,  4.,  7.],
       [ 2.,  5.,  8.],
       [ 3.,  6.,  9.]])
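
Note that the array read in step 3 is the transpose of what matlab displayed in step 1: matlab stores arrays column by column, while HDF5 and numpy read them row by row, so the dimensions come back reversed. A minimal sketch of undoing that follows. Both function names are made up for this illustration and are not part of h5py, and `load_v73_matrix` assumes a plain 2-D numeric variable such as `x` above.

```python
def to_matlab_orientation(rows):
    """Transpose a 2-D nested list from HDF5 (row-major) order to matlab's view."""
    return [list(column) for column in zip(*rows)]


def load_v73_matrix(path, name):
    """Read a 2-D numeric variable from a -v7.3 .mat file as matlab displayed it.

    Hypothetical helper, not part of h5py. h5py is imported inside the
    function so the pure-Python helper above works without it installed.
    """
    import h5py

    with h5py.File(path, "r") as f:
        # f[name][()] reads the whole dataset into a numpy array;
        # .T swaps the axes back to matlab's orientation.
        return f[name][()].T


if __name__ == "__main__":
    # The values h5py returned for x.mat in step 3.
    as_read = [[1.0, 4.0, 7.0], [2.0, 5.0, 8.0], [3.0, 6.0, 9.0]]
    print(to_matlab_orientation(as_read))
```

With the file from step 1, `load_v73_matrix('x.mat', 'x')` should give back the rows `1 2 3`, `4 5 6`, `7 8 9` in the order matlab printed them.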

So it seems actually to be a good idea to use matlab's 7.3 format for interoperability.


Anshul Kundaje (on November 21, 2009, 07:46:20)

Unfortunately, v7.3 doesn't seem to support HDF5 compression yet. The .mat files stored in the 7.3 format are humongous compared to 7.0.

Yaroslav Halchenko (on November 22, 2009, 02:59:50)

Although you might run into some other funny cases like:

Since scipy.loadmat puked that I should have used an HDF reader, I've apt-get installed python-h5py and gave it a run. Unfortunately the error message left me puzzled, since TypeReferenceID is defined within h5py and I am ignorant about HDF, thus not sure what an H5T_REFERENCE reference is and why not convert it to some Pythonic (or numpy-ic) structure. Could it be worked around? ;)

The H5T_REFERENCE type is essentially a pointer inside the HDF5 file. I'm not exactly sure why Matlab is using it, but the bad news is that this type is currently unsupported by h5py. There is initial support in SVN, and I do plan to support it in the next minor version (1.3), but that may be a while as I am trying to finish my thesis. :)

Henrik Staun Poulsen (on March 2, 2012, 13:58:16)

So if I take a file that MatLab wrote (with the 7.3 format), will I then be able to use something like HDFView to view the file?

<< So it seems actually to be a good idea to use matlab's
<< 7.3 format for interoperability.

If other programs can read the files too, then Yes. But...

Soeren Sonnenburg (on March 5, 2012, 16:53:20)

Yes you can use hdfview, h5dump, h5ls etc.

Henrik Staun Poulsen (on March 6, 2012, 14:52:09)

hi Soeren,

Thank you very much for your clarification. I went back and checked my code, and I found that I had to add -v7.3 in order to get an HDF5 file.

So, yes, it is indeed a brilliant format for interoperability.

Thank you for sharing.

Best regards, Henrik
