I’ve created my validation data by using `valid_pct=0.2` in `ImageDataBunch.from_folder`.
After using `DatasetFormatter` / `ImageCleaner` as explained in the docs, I noticed that my dataset is now missing the validation set. Overall, my dataset has shrunk by 20 percent, and the model now reaches lower accuracy (on Kaggle’s MNIST competition).
I have read the documentation and source code and still can’t figure out how to retain the validation data or add it back after cleaning.
I could try to add it manually but I have a hunch I’m doing something wrong.
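For reference, this is roughly what adding it back manually might look like: a minimal sketch that re-splits the cleaned file list into training and validation sets. It assumes the cleaned data is available as a CSV with `name` and `label` columns; that layout and the helper names `read_cleaned_csv` and `split_cleaned` are my assumptions for illustration, not fastai API.

```python
import csv
import random


def read_cleaned_csv(csv_path):
    """Read (name, label) pairs from a CSV with 'name' and 'label' columns (assumed layout)."""
    with open(csv_path, newline="") as f:
        return [(row["name"], row["label"]) for row in csv.DictReader(f)]


def split_cleaned(rows, valid_pct=0.2, seed=42):
    """Randomly split rows into (train, valid), holding out valid_pct for validation."""
    rows = list(rows)              # copy so the caller's list is not reordered
    rng = random.Random(seed)      # fixed seed keeps the split reproducible
    rng.shuffle(rows)
    n_valid = int(len(rows) * valid_pct)
    return rows[n_valid:], rows[:n_valid]
```

Usage would be something like `train, valid = split_cleaned(read_cleaned_csv("cleaned.csv"))`, with the two lists then fed back into whatever builds the data bunch.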