I am building a classifier on the Food-101 dataset to distinguish between different foods. The dataset contains 101,000 images: 75,750 form an uncleaned, labelled training set and 25,250 form a clean, labelled test set. I am currently using the test set for validation, as another user recommended, since it is already labelled. What I want instead is to train my model and validate it on 20% of the training set, save the trained model, and then load it and evaluate it against the test set.
Currently, this is how I’m using my data:
np.random.seed(42)
data = (ImageList.from_folder(path)
        .split_by_folder(train='train', valid='test')
        .label_from_re(pat=file_parse)
        .transform(size=224)
        .databunch())
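To make the split I have in mind concrete, here is a minimal sketch in plain Python. It is not fastai code: `split_train_valid` and the file names are hypothetical and only stand in for the 75,750 training images. The idea is to shuffle the training items with a fixed seed, hold out 20% for validation, and leave the test folder untouched until the final evaluation.

```python
import random

def split_train_valid(items, valid_pct=0.2, seed=42):
    """Shuffle a copy of items with a fixed seed and hold out valid_pct for validation."""
    rng = random.Random(seed)      # local RNG, so the split is reproducible
    shuffled = list(items)         # copy, so the caller's list is not modified
    rng.shuffle(shuffled)
    n_valid = round(len(shuffled) * valid_pct)
    return shuffled[n_valid:], shuffled[:n_valid]

# Hypothetical file names standing in for the real training images.
train_files = [f"train/img_{i}.jpg" for i in range(75750)]
train_part, valid_part = split_train_valid(train_files)
print(len(train_part), len(valid_part))  # 60600 15150
```

With this kind of split, the 25,250 test images would only be used once, after the model has been trained and saved.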