Labels from folder, validation from csv?

Hi, I have a bunch of images that I just want to do basic classification on. The images are in separate folders according to label (as in Lesson 1 of the course). Those folders are contained in a parent folder. However, now I’d like to tell the model which images to use as a validation set. I have a csv file (also in the parent folder) with the names of all the images that I wish to use as a validation set (the list contains a bunch of file names from each of the categories). I can’t find a way to get the labels from the folders and the validation split from a csv (or txt or whatever). The documentation doesn’t say much about the .split_from_whatever() methods (even though they are used in Lesson 3 of the course). The data block API thing is completely incomprehensible to me and I don’t understand what goes with what on that webpage…

Here is what I’m trying:

data = (ItemList.from_folder(path)
        .split_by_fname_file(path + 'valid.csv')
        .label_from_folder()
        .databunch(bs=bs)
        .normalize(imagenet_stats))

I really need to do this for work. Please use layman language, I’m a complete noob.

When all else fails, just create a csv file listing all the filenames, with a column that is set to True for your validation images. I suggest using the split_by_idxs and from_csv methods.

Thank you for your answer. Can I please just clarify: how do I then indicate the labels? Should I still split my images into folders by label or do I need to say what labels they are in the csv file somehow? And if the latter, how do I store these labels? Thanks for your time!

The label would be the second column of the csv table. Look at the docs. No, you don’t need to split your images into folders. If the third column is True for an image, that image is included in the validation set; otherwise it is not.
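If the images are already sorted into label folders, the csv described above doesn't have to be written by hand. Here is a rough sketch, using only the Python standard library, that builds it from the folder layout plus the existing validation list. The function name build_labels_csv, the output name labels.csv, the column names and the list of image extensions are all made up for illustration. It also assumes valid.csv has one file name per line and that file names are unique across the label folders.

```python
import csv
from pathlib import Path

# Assumption: which file extensions count as images.
IMAGE_EXTS = {".jpg", ".jpeg", ".png"}


def build_labels_csv(parent, valid_list="valid.csv", out_name="labels.csv"):
    """Write a csv with columns name, label, is_valid and return its rows.

    Hypothetical helper: the label is taken from the folder name, and
    is_valid is True when the file name appears in the validation list.
    """
    parent = Path(parent)

    # Read the validation list; keep only the base file name of each entry.
    with open(parent / valid_list, newline="") as f:
        valid_names = {
            Path(row[0].strip()).name
            for row in csv.reader(f)
            if row and row[0].strip()
        }

    rows = []
    # Images live exactly one level down: parent/<label>/<image>.
    for img in sorted(parent.glob("*/*")):
        if img.is_file() and img.suffix.lower() in IMAGE_EXTS:
            rel_name = img.relative_to(parent).as_posix()
            rows.append((rel_name, img.parent.name, img.name in valid_names))

    with open(parent / out_name, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["name", "label", "is_valid"])
        writer.writerows(rows)

    return rows
```

Calling build_labels_csv(path) once should leave a labels.csv in the parent folder with the file name in the first column, the label in the second and the validation flag in the third. In fastai v1, I believe a file like this can then be loaded with something along the lines of ImageList.from_csv(path, 'labels.csv').split_from_df(col='is_valid').label_from_df(cols='label'), but please check the data block docs for your version before relying on that.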