How to load MNIST data on Kaggle using fastai v2

I want to load MNIST data on Kaggle using the fastai DataBlock API. The data is in CSV format rather than the usual image files.

When I try to load it, I get the following error:

Could not do one pass in your dataloader, there is something wrong in it

Here is my code:

train_csv = pd.read_csv(path/'train.csv')
test_csv = pd.read_csv(path/'test.csv')
datablock = DataBlock(get_x = lambda r : r.iloc[:1:],
               get_y = lambda r : r['label'])
dls = datablock.dataloaders(train_csv)

Other similar questions on the forum use fastai v1, and I couldn’t find those functions in the documentation.

Also, most of the solutions involve first saving the rows as image files and then using the DataBlock API. I think this will be slow. Is there a way to use the CSV file directly?

Hi @voneone, I just wrote a notebook that might help you here: https://www.kaggle.com/pemtaira/digit-recognizer-fastai-v2-2020

Let me know if any part of it is unclear. The solution I wrote does not save the file as images, but it does convert the tensor to PILImages in memory, and then just works with those.

As far as I can tell from fastai’s code, if you want to use the vision library you have to somehow coerce your data into a PILImage.
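
For anyone reading along, here is a rough sketch of what that in-memory approach could look like. It is not copied from the notebook linked above. It assumes the CSV has a `label` column followed by 784 pixel columns (one 28x28 image per row), and the helper names `row_to_pixels`, `row_to_label`, and `build_dataloaders` are made up for illustration. The idea is that `get_x` returns a uint8 array, and `ImageBlock(cls=PILImageBW)` turns that array into a grayscale PIL image without writing anything to disk.

```python
import numpy as np
import pandas as pd


def row_to_pixels(row, label_col="label"):
    """Return the pixel values of one CSV row as a 28x28 uint8 array.

    Assumes every column except `label_col` is a pixel, in row-major order.
    """
    pixels = row.drop(labels=label_col).to_numpy(dtype=np.uint8)
    return pixels.reshape(28, 28)


def row_to_label(row, label_col="label"):
    """Return the digit label of one CSV row as a plain int."""
    return int(row[label_col])


def build_dataloaders(df, bs=64):
    """Build fastai DataLoaders straight from the DataFrame (hypothetical helper)."""
    # Imported here so the two helpers above can be tried without fastai installed.
    from fastai.vision.all import (
        DataBlock, ImageBlock, CategoryBlock, PILImageBW, RandomSplitter,
    )

    dblock = DataBlock(
        # PILImageBW.create accepts a NumPy array, so nothing is saved to disk.
        blocks=(ImageBlock(cls=PILImageBW), CategoryBlock),
        get_x=row_to_pixels,
        get_y=row_to_label,
        splitter=RandomSplitter(valid_pct=0.2, seed=42),
    )
    return dblock.dataloaders(df, bs=bs)
```

With the `train_csv` DataFrame from the question, this would be called as `dls = build_dataloaders(train_csv)`. Named functions are used instead of lambdas because lambdas tend to cause pickling problems if you later export the learner.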
