What do I do after data cleaning?

IR1DO · November 5, 2023, 2:54pm

I’m done cleaning up my data using cleaner = ImageClassifierCleaner(learn). The book then says “Once we’ve cleaned up our data, we can retrain our model.” Where do I start retraining? Do I need to recreate the DataLoaders, or do I just need to run the following?

learn = vision_learner(dls, resnet101, metrics=accuracy)
learn.fine_tune(7)

vbakshi · November 5, 2023, 9:18pm

My understanding is that the cleaning step deletes or moves image files within the path folder, so once cleaning is complete, you create a new DataLoaders from that (now cleaned and updated) path and then retrain the model. Following the steps in 02_production.ipynb:


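# imports used by the steps below
from fastai.vision.all import *
from fastai.vision.widgets import ImageClassifierCleaner
import shutil

# load the cleaner and select images to delete/move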
cleaner = ImageClassifierCleaner(learn)
cleaner

# delete or move images as selected in the `cleaner`
# (the widget only records your choices; these loops apply them on disk)
for idx in cleaner.delete(): cleaner.fns[idx].unlink()
for idx,cat in cleaner.change(): shutil.move(str(cleaner.fns[idx]), path/cat)

# create a new `DataLoaders` object using the updated `path` and train the model
dls = bears.dataloaders(path)
learn = vision_learner(dls, resnet18, metrics=error_rate)
learn.fine_tune(4)
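
To see why the DataLoaders needs rebuilding rather than reusing the old dls, here is a minimal sketch that uses only the standard library as a stand-in for fastai. The folder names, file names, and the list_images helper are all made up for illustration. It applies the same two operations as the cleaning loops above (unlink and shutil.move) and shows that a file listing taken before cleaning no longer matches what is on disk afterwards. As far as I understand, an already-built dls is in the same position as the "before" listing.

```python
import shutil
import tempfile
from pathlib import Path

def list_images(root):
    """Map each image file name to its label (the parent folder name), like a fresh folder scan."""
    return {p.name: p.parent.name for p in sorted(Path(root).rglob("*.jpg"))}

# Build a throwaway dataset: two class folders with empty placeholder files (hypothetical names).
root = Path(tempfile.mkdtemp())
for cat, names in {"grizzly": ["a.jpg", "b.jpg"], "teddy": ["c.jpg"]}.items():
    (root / cat).mkdir()
    for name in names:
        (root / cat / name).touch()

# Listing taken before cleaning: this is what an already-built DataLoaders would correspond to.
before = list_images(root)

# Same two operations as the cleaning loops: delete one file, relabel another by moving it.
(root / "grizzly" / "a.jpg").unlink()
shutil.move(str(root / "grizzly" / "b.jpg"), str(root / "teddy"))

# Fresh scan after cleaning: analogous to calling bears.dataloaders(path) again.
after = list_images(root)

print("before:", before)
print("after: ", after)

# Remove the temporary folder.
shutil.rmtree(root)
```

The "before" listing still contains a.jpg and still labels b.jpg as grizzly, while the "after" listing reflects the deletion and the relabel. That is the reason for running dls = bears.dataloaders(path) again before creating the new learner.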