Creating ImageDataBunch from MNIST data

node · April 19, 2019, 8:24am

I am working on the Kaggle Digit Recognizer and I am trying to get the .csv files converted into an ImageDataBunch.

Each row of the csv holds one 28x28 image flattened into 784 pixel columns.

I saw another post that mentioned I could download the image files from another source, but I would much rather solve this as if those were not available (in case I come across this again later).

So far I have tried:
ImageDataBunch.from_df() -> tried this after importing as df

after which I receive the error ‘FileNotFoundError: [Errno 2] No such file or directory: ‘…/input/0’’

ImageDataBunch.from_csv() -> the example I found in the docs actually has links in the csv that point to .png files rather than having the pixel data in the actual columns.

mcclomitz · April 19, 2019, 9:43am

Hey man - yeah this is tricky, I had a similar issue. I’d be interested to hear if anyone has a simple workaround. Basically each row is a flattened version of the image, so we need to reshape it back to 28x28. This Kaggle kernel is a nice demonstration of how to do that: https://www.kaggle.com/goodbyte/minst-fastai
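To give a rough idea of the reshape step, here is a minimal sketch using only NumPy. It is not taken from the linked kernel. It assumes the csv has a header row, with the label in the first column and the 784 pixel values after it, and it builds a tiny made-up two-row csv in memory rather than reading the real train.csv.

```python
import io
import numpy as np

# Build a tiny stand-in for train.csv (assumed layout): a header row, then
# one row per image with the label first and 784 pixel values after it.
header = "label," + ",".join(f"pixel{i}" for i in range(784))
rows = []
for label in (3, 7):
    pixels = [(label * 10 + i) % 256 for i in range(784)]  # made-up values
    rows.append(",".join(str(v) for v in [label] + pixels))
csv_text = "\n".join([header] + rows)

# Load the numeric part of the file, skipping the header line.
data = np.loadtxt(io.StringIO(csv_text), delimiter=",", skiprows=1, dtype=np.int64)

labels = data[:, 0]                   # first column: digit label
flat = data[:, 1:].astype(np.uint8)   # remaining 784 columns: pixel values
images = flat.reshape(-1, 28, 28)     # one 28x28 array per csv row

print(images.shape, labels)
```

With the real file you would pass the path to train.csv instead of the in-memory text. From there the 28x28 arrays can be saved as image files or wrapped in a custom dataset, whichever fits your setup.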

As an experiment I used the train.csv and ran it as if it were a tabular data set to see how it performs. It’s not great, but not bad, and it’s fun practice. I could post a kernel about it if you or anyone else is interested.

Also fastai has a full MNIST data set in .png format - so if you aren’t bothered about the competition and you want to get stuck in, that’s a bit easier to play with.

Good luck with it bud, let us know how you get on.