Getting Data With Image Classifier from CSV


#1

Hello Everyone,

This is my first time posting so please let me know how I can improve on communicating the error.

I am looking at the Fashion-MNIST dataset from Kaggle. When I call learn.fit I get the following error: No such file or directory: data/fashionmnist/train/7. On another run I got a similar error: No such file or directory: data/fashionmnist/train/9. I am not sure why it is looking for a number inside the train folder, when the only thing in there is a CSV called fashion-mnist_train.csv.

The step where I call ImageClassifierData is probably where the error is occurring.






(Martin) #2

Try looking at what is in data.
Try:
x, y = next(iter(data.val_dl))
And then look at their shapes.
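
In case that line looks cryptic: iter(...) makes an iterator over the loader and next(...) pulls the first batch out of it. Here is a minimal sketch with a stand-in loader built from plain Python lists. It is not a real fastai or PyTorch DataLoader; toy_loader and the 2x2 "images" are made up for illustration.

```python
def toy_loader(images, labels, batch_size):
    """Yield (x, y) batches, mimicking how a data loader is iterated."""
    for start in range(0, len(images), batch_size):
        stop = start + batch_size
        yield images[start:stop], labels[start:stop]

# Five fake 2x2 "images" with integer labels (hypothetical data).
images = [[[0, 0], [0, 0]] for _ in range(5)]
labels = [7, 9, 0, 3, 1]

# Same pattern as above: grab only the first batch.
x, y = next(iter(toy_loader(images, labels, batch_size=4)))

# With plain lists, the "shape" is just the nested lengths.
x_shape = (len(x), len(x[0]), len(x[0][0]))
y_shape = (len(y),)
print(x_shape, y_shape)
```

With real tensors you would look at x.shape and y.shape instead of computing the lengths by hand.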


#3

How do I do that?


(Martin) #4

Do you have the images in a folder?


#5

All the image data is saved inside a folder called train.

The file structure looks like:

data/fashionmnist/train/fashion-mnist_train.csv

I guess there are no images in the folder, just the CSV file.

The data inside the CSV looks like

Also, I tried running the code you posted earlier and I am getting the same kind of error: OSError: No such file or directory: data/fashionmnist/train/0

Do you think this might be because I used the ImageClassifierData method incorrectly? I don’t have a great understanding of how it works.


(Martin) #6

I think it is because of the ImageClassifierData.from_csv call.
Try looking at its docstring. It will tell you how to use it.
You can do that by typing ImageClassifierData.from_csv and pressing SHIFT+TAB+TAB (hold Shift and press Tab twice).
It should bring up how to use the function and its arguments.
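
If the shortcut does not work, or you are outside a notebook, Python's built-in introspection shows the same information. A small sketch, where load_from_csv is a hypothetical stand-in and not the actual fastai function:

```python
import inspect

def load_from_csv(path, folder, csv_fname, bs=64):
    """Hypothetical loader: read file names and labels from csv_fname."""
    return path, folder, csv_fname, bs

# The signature lists the arguments and their defaults.
signature = inspect.signature(load_from_csv)
# getdoc returns the cleaned-up docstring.
docstring = inspect.getdoc(load_from_csv)

print(signature)
print(docstring)
```

In a notebook, putting a ? after the name (for example ImageClassifierData.from_csv?) shows the docstring as well.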


(Martin) #8

I looked at the structure of the data and I think the pixel values are in the csv file.
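
A quick way to check this yourself is to read only the header and the first row. This is a rough sketch on a tiny stand-in for the real file; the column names (label, pixel1, ...) and the 4-pixel width are assumptions for illustration.

```python
import csv
import io

# Tiny stand-in for the real file: a label column followed by pixel columns.
sample = io.StringIO(
    "label,pixel1,pixel2,pixel3,pixel4\n"
    "7,0,128,255,0\n"
    "9,12,0,0,200\n"
)

reader = csv.reader(sample)
header = next(reader)     # column names
first_row = next(reader)  # first data row, values are still strings

# Count how many columns hold pixel values.
pixel_columns = [name for name in header if name.startswith("pixel")]
print(header[0], len(pixel_columns))
print(first_row[0], first_row[1:])
```

On the real file you would pass open(csv_path, newline="") instead of sample. If each row stores one 28x28 image, you should see 784 pixel columns.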


#9

I’m not sure if I’m interpreting this correctly. Are you saying to create another folder called imgs, and to put actual pictures like .jpgs inside that folder? If that is the case, how would I change the ImageClassifierData.from_csv call?


(Martin) #10

No, I was wrong; the images are stored in the CSV file.
That is why ImageClassifierData.from_csv doesn’t read it in correctly.
Now the question is how to load the Fashion-MNIST CSV to create a DataLoader. Unfortunately, I don’t know the answer yet…
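
One plausible explanation for the train/7, train/9 and train/0 paths, assuming the loader treats the first column of each row as an image file name inside the folder (this is a guess about the mechanism, not something checked against the fastai source): the first column here holds the class label, so the label ends up in the path. A sketch of that behaviour with a hypothetical helper:

```python
import csv
import io

# Assumed layout: first column is the class label, the rest are pixels.
sample = io.StringIO(
    "label,pixel1,pixel2\n"
    "7,0,255\n"
    "9,34,0\n"
)

def paths_from_csv(fileobj, folder):
    """Treat column 0 of each data row as a file name under `folder`."""
    reader = csv.reader(fileobj)
    next(reader)  # skip the header row
    return [f"{folder}/{row[0]}" for row in reader]

paths = paths_from_csv(sample, "data/fashionmnist/train")
print(paths)
```

Those are exactly the kind of paths showing up in the error messages, and none of them exist on disk.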


(Martin) #13

I found this: https://medium.com/ml2vec/intro-to-pytorch-with-image-classification-on-a-fashion-clothes-dataset-e589682df0c5 It has a FashionMNISTDataset class. That should be close to what you want.
From that Dataset, perhaps the easiest route is to create an ImageClassifierData with ImageClassifierData(path, datasets, bs, num_workers, classes).

For you:

fashionMNISTDataset = FashionMNISTDataset(f'{PATH}/train/fashion-mnist_train.csv')
data = ImageClassifierData(PATH, fashionMNISTDataset, 32, -1, None)

Then again, try looking at x and y after x, y = next(iter(data.val_dl)).
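
For reference, a map-style dataset in PyTorch only needs __len__ and __getitem__. The sketch below shows that shape in plain Python, without PyTorch or fastai. CsvImageDataset is a hypothetical name, not the class from the linked article, and the label-first column layout is an assumption.

```python
import csv
import io

class CsvImageDataset:
    """Rows of 'label,pixel1,...' exposed as (image, label) pairs."""

    def __init__(self, fileobj, height, width):
        reader = csv.reader(fileobj)
        next(reader)  # skip the header row
        self.height, self.width = height, width
        self.rows = [[int(value) for value in row] for row in reader]

    def __len__(self):
        return len(self.rows)

    def __getitem__(self, index):
        label, *pixels = self.rows[index]
        # Reshape the flat pixel list into `height` rows of `width` values.
        image = [pixels[r * self.width:(r + 1) * self.width]
                 for r in range(self.height)]
        return image, label

# Two fake 2x2 images; the real file would use height=28, width=28.
sample = io.StringIO(
    "label,pixel1,pixel2,pixel3,pixel4\n"
    "7,0,1,2,3\n"
    "9,4,5,6,7\n"
)
dataset = CsvImageDataset(sample, height=2, width=2)
image, label = dataset[1]
print(len(dataset), label, image)
```

A real version would subclass torch.utils.data.Dataset and return tensors instead of nested lists.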