I am trying to make my own model based on the one in Lesson 1, with all the training and validation folders & file structure.
What I’m confused about is: do I have to label the folders in the train directory?
For example, if I have images of bears and penguins, do I create bear and penguin folders and store the relevant images in them? Or do I that just for my valid folder?
Hi, I have a follow-up question:
in ImageClassifierData.from_csv, I can fill in the name of training folder and test folder. what about validation folder? does val_idxs mean the validation images needs to be in the PATH folder?
thank you!
Hi Anna,
So with from _csv the train and validation folder are the same (the folder variable as you can see below), and val_idxs controls how this is split.
def from_csv(cls, path, folder, csv_fname, bs=64, tfms=(None,None),
val_idxs=None, suffix=’’, test_name=None, continuous=False, skip_header=True, num_workers=8):
So you don’t even have to worry about the validation set, as this split is taken for you. Does that help?
When I download data from kaggle sometimes they provide a separate folder for validation set, if I understand you correctly, I would have to put validation dataset into the training data folder so from_csv can read it throw val_idxs?
Hey Anna,
You actually don’t have to put validation data in the training data folder. You can have a separate folder for your validation set, like we did in dogs-vs-cats. Or you can use a number of other methods used in the course notebooks, if your data doesn’t have a separate validation set. I hope that answers your question.