I have trained a neural network, and when I look at the validation data (top losses) I can find images that don't have any similar counterparts in the train set. So what I want to do is dump a number of images into my train folders (similar, but not the same as, the validation images). But when I create the new DataBunch, I want all the old images to be split the same way as before.
Pictures: A (train set), B (validation set).
Add pictures C and D.
New DataBunch: A (still in train set), B (still in validation set), C (randomly assigned to the validation set), D (randomly assigned to the train set).
Also, if I delete a few bad images, it should not affect how the remaining images are placed when creating the DataBunch. For example:
A (train), B (valid), C (train), D (valid)
A (still in train), B (still in valid), C (deleted), D (still in valid).
The reason for this topic is that when I restart my notebook, I want to avoid cross-contamination: images from the test set ending up in the validation set when I train more.
I'm thinking of creating a known-split CSV file or similar, something like the following. Does fast.ai have this type of feature?
Filename A: Valid
Filename B: Test
Filename C: Train
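One way to get the behavior above without relying on any particular fast.ai feature is to keep the split in a CSV yourself: existing images keep their recorded split, newly added images get a random assignment, and deleted images are dropped from the file. Here is a minimal sketch using only the standard library; `update_split` and the file name `split.csv` are my own inventions, not fast.ai API:

```python
import csv
import os
import random

def update_split(image_names, split_file, valid_pct=0.2, seed=None):
    """Maintain a persistent train/valid assignment in a CSV file.

    - Images already listed in the CSV keep their old split.
    - New images are randomly assigned ('valid' with probability valid_pct).
    - Images that no longer exist are dropped from the CSV.
    Returns the mapping {filename: 'train' | 'valid'}.
    """
    rng = random.Random(seed)
    split = {}
    if os.path.exists(split_file):
        with open(split_file, newline="") as f:
            split = {row[0]: row[1] for row in csv.reader(f) if row}
    current = set(image_names)
    # drop entries for deleted images
    split = {name: s for name, s in split.items() if name in current}
    # assign any new images at random
    for name in image_names:
        if name not in split:
            split[name] = "valid" if rng.random() < valid_pct else "train"
    with open(split_file, "w", newline="") as f:
        csv.writer(f).writerows(sorted(split.items()))
    return split
```

You could then feed the recorded split into fast.ai. In the v1 data block API there is (if I remember correctly) a `split_by_valid_func`, so something along these lines should work, though I haven't verified this exact usage:

```python
# valid_names = {n for n, s in split.items() if s == 'valid'}
# data = (ImageList.from_folder(path)
#         .split_by_valid_func(lambda p: p.name in valid_names)
#         .label_from_folder()
#         .databunch())
```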