Hey all, new user here. I am trying to load a dataset from the 2015 APTOS challenge from Kaggle.
Another user has provided the dataset. The image labels in the .csv do not exactly match the file names, but they are ordered. I have been rummaging around the data_block API but I think something must be going over my head.
Images are in /images. Labels are in labels/trainlabels.csv.
I would like to load the images from the folder and apply the labels from the csv by index. Is this possible? Is it a bad idea? Should I structure the data some other way?
I don’t understand why you would want to label by index. What do a typical image filename and a typical label in the .csv look like? I am sure there is a simple way to match the id in your label file to the image id.
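To see how the two sets of names actually differ, it can help to compare them directly before deciding on a fix. Here is a minimal sketch using only the standard library. The column name `image` is an assumption (check the header of your csv), and `compare_ids` is just a helper name made up for this post.

```python
import csv
from pathlib import Path


def compare_ids(image_dir, csv_path, id_column="image"):
    """Return (ids found only in the csv, file stems found only on disk)."""
    # File names without their extension, e.g. "abc.png" -> "abc"
    stems = {p.stem for p in Path(image_dir).iterdir() if p.is_file()}
    # Ids as written in the label file; `id_column` is an assumed header name
    with open(csv_path, newline="") as f:
        ids = {row[id_column] for row in csv.DictReader(f)}
    return sorted(ids - stems), sorted(stems - ids)
```

Calling it on your images folder and labels/trainlabels.csv and printing the first few entries of each list side by side should make the naming pattern obvious, if there is one.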
Yes, in your place I would probably save a new csv with the right filenames; that would probably be the easiest. Is it always 100 preceding the filename, or is there at least some logic you could automate within a dataframe?
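If the rule does turn out to be that regular, rewriting the csv could look roughly like the sketch below. It uses the standard csv module rather than a dataframe, only to stay dependency-free. The column name `image`, the helper name `write_fixed_csv`, and the "prepend 100" rule are all assumptions for illustration, so swap in whatever pattern your files actually follow.

```python
import csv
from pathlib import Path


def write_fixed_csv(src_csv, dst_csv, image_dir, rename, id_column="image"):
    """Copy src_csv to dst_csv, replacing each id with the real file name.

    `rename` maps an id from the csv to the expected file stem.
    Rows with no matching file are skipped and their ids returned.
    """
    # Map stem -> full file name, e.g. "100abc" -> "100abc.png"
    files = {p.stem: p.name for p in Path(image_dir).iterdir() if p.is_file()}
    unmatched = []
    with open(src_csv, newline="") as src, open(dst_csv, "w", newline="") as dst:
        reader = csv.DictReader(src)
        writer = csv.DictWriter(dst, fieldnames=reader.fieldnames)
        writer.writeheader()
        for row in reader:
            stem = rename(row[id_column])
            if stem in files:
                row[id_column] = files[stem]
                writer.writerow(row)
            else:
                unmatched.append(row[id_column])
    return unmatched


# Hypothetical usage, assuming every file is the csv id with "100" in front:
# leftover = write_fixed_csv("labels/trainlabels.csv", "labels/fixed.csv",
#                            "images", lambda s: "100" + s)
```

Anything that comes back in the returned list did not match a file, which is a quick check that the rule really holds for the whole dataset. The new csv then contains real file names, so labels are tied to files by name rather than by position.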