Getting Data With Image Classifier from CSV

Hello Everyone,

This is my first time posting so please let me know how I can improve on communicating the error.

I am looking at the Fashion-MNIST dataset from Kaggle. When I call learn.fit I get the following error: No such file or directory: data/fashionmnist/train/7. On another run I got a similar error: No such file or directory: data/fashionmnist/train/9. I am not sure why it is looking for a number inside the train folder, when the only thing in there is a CSV called fashion-mnist_train.csv.

The step where I call ImageClassifierData is probably where the error is occurring.

Try looking at what is in data:
x, y = next(iter(data.val_dl))
Then look at the shapes of x and y.

How do I do that?

Do you have the images in a folder?

All the image data is saved inside a folder called train

The file structure looks like

data/fashionmnist/train/fashion-mnist_train.csv

I guess there are no images in the folder, just the CSV file

the data inside the csv looks like

Also I tried running the code you posted earlier and I am getting the same kind of error. OSError: No such file or directory: data/fashionmnist/train/0

Do you think this might be because I used the ImageClassifierData method incorrectly? I don’t have a great understanding of how it works.

I think it is because of the ImageClassifierData.from_csv
Try looking at its docstring. It will tell you how to use it.
You can do that by typing ImageClassifierData.from_csv in a notebook cell and pressing SHIFT+TAB+TAB (hold Shift and press Tab twice).
It should bring up how to use the function and its arguments.
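Outside a notebook, the same information is available from Python's standard inspect module (or simply help()). A minimal sketch follows; the from_csv function below is a hypothetical stand-in with made-up arguments, not the real fastai signature:

```python
import inspect


def from_csv(path, folder, csv_fname, bs=64):
    """Hypothetical stand-in: read file names and labels from a CSV."""
    return path, folder, csv_fname, bs


# The call signature, including default values.
signature = inspect.signature(from_csv)
print(signature)

# The cleaned-up docstring, the same text help() would display.
doc = inspect.getdoc(from_csv)
print(doc)
```

In a notebook, typing the function name followed by ? in a cell also shows the docstring.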

I looked at the structure of the data and I think the pixel values are in the csv file.

I’m not sure if I’m interpreting this correctly. Are you saying to create another folder called imgs, and to put actual pictures like .jpgs inside that folder? If that is the case, how would I change the ImageClassifierData.from_csv call?

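No, I was wrong: the images are stored in the CSV file itself.
That is why ImageClassifierData.from_csv doesn’t read it correctly. It expects a CSV that lists image file names and their labels, so it ends up treating the first value in each row as a file inside train.
Now the question is how to load the Fashion-MNIST CSV to create a DataLoader. Unfortunately, I don’t know the answer yet…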
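To illustrate where a path like data/fashionmnist/train/7 could come from, here is a rough sketch. It assumes the CSV starts with a label column followed by pixel columns, and that the reader joins the first column of each row onto the folder path. The helper name and the sample rows are made up for illustration; this is not the fastai code itself.

```python
import csv
import io
import os

# Two made-up rows: a label followed by pixel values
# (real rows have 784 pixel columns; three are used here for brevity).
SAMPLE_CSV = "label,pixel1,pixel2,pixel3\n7,0,12,255\n9,3,0,41\n"


def paths_from_first_column(csv_text, root, folder):
    """Build the paths a file-name/label CSV reader would try to open."""
    reader = csv.reader(io.StringIO(csv_text))
    next(reader)  # skip the header row
    return [os.path.join(root, folder, row[0]) for row in reader]


paths = paths_from_first_column(SAMPLE_CSV, "data/fashionmnist", "train")
print(paths)
```

Each resulting path ends in a label value (7 and 9 here) rather than an image file name, which matches the errors reported earlier in the thread.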

I found this: https://medium.com/ml2vec/intro-to-pytorch-with-image-classification-on-a-fashion-clothes-dataset-e589682df0c5 (it has a FashionMNISTDataset class). That should be close to what you want.
From that Dataset perhaps it would be easiest to create an ImageClassifierData with ImageClassifierData(path, datasets, bs, num_workers, classes)

For you:

fashionMNISTDataset = FashionMNISTDataset(f'{PATH}/train/fashion-mnist_train.csv')
data = ImageClassifierData(PATH, fashionMNISTDataset, 32, -1, None)

Then again, try looking at x and y after running x, y = next(iter(data.val_dl)).
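For reference, a dataset of this kind only needs __len__ and __getitem__. The sketch below is not the class from the linked article; it is a dependency-free illustration with a hypothetical name (CsvImageDataset) and a tiny made-up 2x2 sample. It assumes each row is a label followed by the flattened pixel values. A real version would likely use pandas or numpy, a side of 28, and return tensors.

```python
import csv
import io

IMAGE_SIDE = 2  # the real images are 28x28; 2x2 keeps the sample readable

# Made-up sample: header row, then a label followed by side*side pixel values.
SAMPLE_CSV = "label,pixel1,pixel2,pixel3,pixel4\n7,0,64,128,255\n0,255,128,64,0\n"


class CsvImageDataset:
    """Map-style dataset: each CSV row holds a label and one flattened image."""

    def __init__(self, csv_file, side):
        reader = csv.reader(csv_file)
        next(reader)  # skip the header row
        self.side = side
        self.rows = [[int(value) for value in row] for row in reader]

    def __len__(self):
        return len(self.rows)

    def __getitem__(self, index):
        label, *pixels = self.rows[index]
        if len(pixels) != self.side * self.side:
            raise ValueError("row does not hold side*side pixel values")
        # Scale to [0, 1] and fold the flat list back into image rows.
        scaled = [p / 255 for p in pixels]
        image = [scaled[r * self.side:(r + 1) * self.side] for r in range(self.side)]
        return image, label


dataset = CsvImageDataset(io.StringIO(SAMPLE_CSV), IMAGE_SIDE)
image, label = dataset[0]
print(len(dataset), label, image)
```

With the real file, the io.StringIO object would be replaced by an open file handle for the training CSV.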

Hi Hadus,
When I run x,y = next(iter(data.val_dl)) on my own dataset, it returns
OSError: No such file or directory.
I just ran the planet competition notebook and it works fine. I tried to find a solution but didn’t find anything. Can you suggest how to solve this issue?
thank you

Make sure you have the data downloaded and in the right place.

OSError: No such file or directory means that it is trying to open a file that doesn’t exist, probably because you either don’t have the data or it is in the wrong place.
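One way to check this is to compare the file names listed in the label CSV with what is actually on disk. A minimal sketch, assuming the first CSV column holds the file name (optionally without its extension) and there is a header row. The function name, file names and folder layout below are made up for the demonstration:

```python
import csv
import os
import tempfile


def missing_files(csv_path, image_dir, suffix=""):
    """Return the file names listed in the CSV that are absent from image_dir."""
    with open(csv_path, newline="") as handle:
        reader = csv.reader(handle)
        next(reader)  # skip the header row
        names = [row[0] + suffix for row in reader if row]
    return [name for name in names
            if not os.path.exists(os.path.join(image_dir, name))]


# Self-contained demonstration in a temporary directory.
with tempfile.TemporaryDirectory() as root:
    image_dir = os.path.join(root, "train")
    os.makedirs(image_dir)
    open(os.path.join(image_dir, "img_0.jpg"), "w").close()  # only one image exists

    csv_path = os.path.join(root, "labels.csv")
    with open(csv_path, "w", newline="") as handle:
        csv.writer(handle).writerows(
            [["id", "label"], ["img_0", "cat"], ["img_1", "dog"]]
        )

    missing = missing_files(csv_path, image_dir, suffix=".jpg")
    print(missing)
```

An empty list means every file named in the CSV was found. Anything listed points to a missing file or a mismatch in folder or suffix.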

Hi Hadus,
thank you for your reply.
I have all my images in the train folder. However, when I run x,y = next(iter(data.val_dl)), the OSError: No such file or directory names a file that is not in my label_csv file.
I made a post in the forum about this issue with more details: X,y = next(iter(data.val_dl)) returns no such file error
I would appreciate any suggestions.