Dealing with .mat files

What is the best way to deal with .mat files in fastai? For example, this image depth segmentation dataset.

I’ve come across a lot of datasets that are .mat files but haven’t seen a way of importing them with the library in the docs. I’ve managed to import it using:

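import numpy as np
import h5py

# h5py reads .mat files that were saved in MATLAB's HDF5-based v7.3 format
with h5py.File('nyu_depth_v2_labeled.mat', 'r') as f:
    print(list(f.keys()))  # names of the variables stored in the file
    imageData = np.array(f.get('images'))
    print(imageData.shape)
    depthData = np.array(f.get('depths'))
    print(depthData.shape)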

As far as I know, .mat is a format built for MATLAB. If I remember correctly, you can put basically anything you want into it (functions, arrays, variables, all with arbitrary names), which means there is no single standard way to store, for example, images in it.
So I guess there is no best way to deal with them directly. The only thing you can do is read them, figure out what kind of data is in there, and save it in a format that is more convenient to use with fastai.
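
As a rough sketch of that "read it and figure out what is in there" step, something like the following lists the variables in a file with their shapes and types. Note that describe_mat is a hypothetical helper name, not part of fastai or SciPy. It assumes scipy.io.whosmat can read the file, and falls back to h5py for files saved in MATLAB's v7.3 (HDF5) format, which SciPy does not read.

```python
import scipy.io as sio


def describe_mat(path):
    """Return {variable name: (shape, type)} for the variables in a .mat file.

    Hypothetical helper for illustration only.
    """
    try:
        # whosmat lists the variables without loading their contents
        return {name: (shape, kind) for name, shape, kind in sio.whosmat(path)}
    except NotImplementedError:
        # Assumption: SciPy refuses MATLAB v7.3 files, which are HDF5 containers
        import h5py
        with h5py.File(path, 'r') as f:
            return {name: (obj.shape, str(obj.dtype))
                    for name, obj in f.items()
                    if isinstance(obj, h5py.Dataset)}
```

Calling describe_mat('nyu_depth_v2_labeled.mat') should then show which variables hold the images and which hold the labels, before you decide how to export them.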

I’m using data stored as .mat for image segmentation like this:

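import numpy as np
import scipy.io as sio
import torch
from fastai.vision import *  # fastai v1: Image, ImageList, SegmentationLabelList

def open_mat(fn, *args, **kwargs):
    # Each .mat file holds one sample, with its channels stored as separate variables
    data = sio.loadmat(fn)
    data = np.array([data[r] for r in ['band1', 'band2', 'band3']])  # stack to (3, H, W)
    data = torch.from_numpy(data).float()
    return Image(data)

def open_mask(fn, *args, **kwargs):
    # The mask is a 2D array; add a leading channel dimension to get (1, H, W)
    data = sio.loadmat(fn)['mask']
    data = torch.from_numpy(data).float()
    return Image(data.view(-1, data.size()[0], data.size()[1]))

class SegLabelListCustom(SegmentationLabelList):
    # Extra keyword arguments such as div are accepted by open_mask but ignored
    def open(self, fn): return open_mask(fn, div=True)

class SegItemListCustom(ImageList):
    _label_cls = SegLabelListCustom
    def open(self, fn): return open_mat(fn)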

I hope this helps :slight_smile:

Edit: In my dataset I have one sample per .mat file. If you have multiple images in the same file, the easiest way is probably to save them separately in another format, as suggested above.
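
A minimal sketch of that "save them separately" route, in case it is useful. The helper name split_mat_to_png is hypothetical, as is the layout it expects: a uint8 array of shape (N, H, W, 3) stored under one variable, in a file that scipy.io.loadmat can read. Your file may order the axes differently, so check the shape first and transpose if needed.

```python
from pathlib import Path

import numpy as np
import scipy.io as sio
from PIL import Image as PILImage  # aliased to avoid clashing with fastai's Image


def split_mat_to_png(mat_path, key, out_dir):
    """Write each image stored under `key` to its own PNG file.

    Hypothetical helper for illustration only.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    # Assumption: the variable is a uint8 array laid out as (N, H, W, 3)
    images = sio.loadmat(str(mat_path))[key]
    paths = []
    for i, img in enumerate(images):
        path = out_dir / f'{i:05d}.png'
        PILImage.fromarray(np.ascontiguousarray(img)).save(path)
        paths.append(path)
    return paths
```

Once the images are ordinary PNG files, the standard fastai image loaders should work without a custom open method.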

Is the Image class you’re using from fastai?

Yes, it is :slight_smile: I based open_mat on fastai’s open_image.