I’m working on the dog breed Kaggle competition, which has a folder for test, a folder for train, and a CSV that lists the training set images and their actual dog breeds. I ran the code below with an empty validation folder and still got a validation loss. How was this validation loss calculated? Thanks
ImageClassifierData.from_csv does not require a valid folder; it takes the validation set as a split of the rows in the CSV. You can control how much with the get_cv_idxs function. By default, val_idxs is set to 20% of the rows of the CSV, and those rows are used as the validation set.
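To make the idea concrete, here is a minimal sketch of how a 20% validation split can be taken from the CSV rows by index. It is an illustration only, not fastai's actual implementation: `sample_val_idxs`, `split_rows`, the seed, and the fake `rows` list are all hypothetical names and values made up for this example.

```python
import random

def sample_val_idxs(n_rows, val_pct=0.2, seed=42):
    """Pick a reproducible random subset of row indices for validation.

    Hypothetical helper: it mimics the idea of holding out a fixed
    percentage of rows, not fastai's own code.
    """
    rng = random.Random(seed)       # fixed seed so the split is repeatable
    idxs = list(range(n_rows))
    rng.shuffle(idxs)               # shuffle so the held-out rows are random
    n_val = int(val_pct * n_rows)   # e.g. 20% of the rows
    return idxs[:n_val]

def split_rows(rows, val_idxs):
    """Split labelled rows into (train, valid) using the chosen indices."""
    val_set = set(val_idxs)
    train = [r for i, r in enumerate(rows) if i not in val_set]
    valid = [r for i, r in enumerate(rows) if i in val_set]
    return train, valid

# Made-up stand-in for the rows of the labels CSV: (image id, breed)
rows = [(f"img_{i}", f"breed_{i % 5}") for i in range(100)]

val_idxs = sample_val_idxs(len(rows))
train_rows, valid_rows = split_rows(rows, val_idxs)
print(len(train_rows), len(valid_rows))  # 80 20
```

Every held-out row still has its label from the CSV, which is why a validation loss can be computed even when the validation folder is empty.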
Just to clarify what @sjdlloyd said a bit more: I think you are mixing up the validation set and the test set. The validation set is data that you already have labels for, and it is taken from your train folder. Which rows go into the validation set is defined by val_idxs.
The test set is the data you want to apply your model to (to submit to Kaggle, for example). It doesn’t have labels, which is why you don’t see a .csv file for the test set. So it doesn’t matter how you move the data from your training folder to the test folder; I think you will get almost the same val_loss.