Image cleaner doesn't actually delete the junk images?

I had to retrain my model because the export.pkl file stopped working for some reason.

When I went back, the image library I had spent time cleaning had the junk images back in it, and the model trained on it was much worse.

Is this normal and is there an image cleaner that will actually permanently delete the photos when I tell it to delete them?

Mine will delete them, but it won’t reclassify images like the fastai one and it doesn’t work with top losses. You just have to look through them all and delete the stuff you don’t want.

If that works for you:

pip install jmd_imagescraper

from jmd_imagescraper.imagecleaner import *

display_image_cleaner(path)

And at some point I really must add that functionality to mine.

Btw, I just took a quick look at the fast.ai one.

Their image cleaner doesn’t delete the images; it returns a list of the images the user has flagged. You need to actually do something with them (i.e. delete them) yourself. I guess the rationale is that you may want to move them elsewhere or do something else with them.

If you didn’t realise this then the good news is that your model would have been better all along since you never actually deleted anything.
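If you would rather set the flagged images aside than delete them outright, something along these lines might do. It is only a sketch: the function name quarantine and the "rejected" folder are made up for illustration and are not part of fastai. It keeps each file's original class folder name so you can tell where it came from.

```python
import shutil
from pathlib import Path

def quarantine(files, quarantine_dir):
    """Move each file into quarantine_dir/<original parent folder name>/.

    Hypothetical helper, not a fastai function. Assumes no file with the
    same name is already sitting in the quarantine folder.
    """
    quarantine_dir = Path(quarantine_dir)
    moved = []
    for f in map(Path, files):
        # Keep the class folder name (e.g. "black") so the origin is visible.
        target_dir = quarantine_dir / f.parent.name
        target_dir.mkdir(parents=True, exist_ok=True)
        target = target_dir / f.name
        shutil.move(str(f), str(target))
        moved.append(target)
    return moved
```

With the fastai cleaner this would presumably be called as quarantine([cleaner.fns[i] for i in cleaner.delete()], path.parent/'rejected'). Putting the folder outside path is a deliberate choice here, so that the quarantined images are not picked up again when the data is reloaded.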

Hi all,

May I ask how you retrain after running the image cleaner?

Am I right to say that after I select the images to delete or change, I just rerun the following code?

dls = alcohols.dataloaders(path)
learn = cnn_learner(dls, resnet34, metrics=error_rate)
learn.fine_tune(4)

Thank you

To delete or move the selected images, uncomment and run one of the following lines (from the course notebook):

# for idx in cleaner.delete(): cleaner.fns[idx].unlink()
# for idx,cat in cleaner.change(): shutil.move(str(cleaner.fns[idx]), path/cat)

After that, you can reload the data and retrain or finetune.
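One thing to watch: if the delete line is run a second time with the same selections, unlink() raises FileNotFoundError because the files are already gone. A minimal sketch of a more forgiving version follows. The name delete_flagged is hypothetical; it just takes a list of paths and a list of indices, which matches how cleaner.fns and cleaner.delete() are used in the lines above.

```python
from pathlib import Path

def delete_flagged(fns, idxs):
    """Delete fns[i] for every i in idxs and return the paths actually removed.

    Hypothetical helper, not a fastai function. Files that are already
    missing are skipped instead of raising FileNotFoundError.
    """
    removed = []
    for i in idxs:
        f = Path(fns[i])
        if f.exists():
            f.unlink()
            removed.append(f)
    return removed
```

In the notebook this would presumably be delete_flagged(cleaner.fns, cleaner.delete()). The returned list makes it easy to check how many files were really deleted before retraining.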

For me,
for idx,cat in cleaner.change(): shutil.move(str(cleaner.fns[idx]), path/cat)
causes an error: it is trying to move the file 00000135.jpg from the black folder into the grizzly folder, but a file with that name already exists there. Has anyone else had this problem?

yes me too, not sure how to solve it though

So I guess it lacks error checking for moving image categories that were already moved previously.
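For what it's worth, shutil.move raises shutil.Error when the destination is a folder that already contains a file with the same name, which is what happens when both class folders have a 00000135.jpg. One possible workaround, as a rough sketch, is to pick a free file name before moving. The helper name move_without_overwrite and the _1, _2 suffix scheme are my own assumptions, not anything fastai provides.

```python
import shutil
from pathlib import Path

def move_without_overwrite(src, dest_dir):
    """Move src into dest_dir, renaming it if that file name is already taken.

    Hypothetical helper, not a fastai function. A clash on "00000135.jpg"
    becomes "00000135_1.jpg", then "00000135_2.jpg", and so on.
    """
    src, dest_dir = Path(src), Path(dest_dir)
    dest_dir.mkdir(parents=True, exist_ok=True)
    # Nothing to do if the file is already in the target folder.
    if src.parent.resolve() == dest_dir.resolve():
        return src
    target = dest_dir / src.name
    n = 1
    while target.exists():
        target = dest_dir / f"{src.stem}_{n}{src.suffix}"
        n += 1
    shutil.move(str(src), str(target))
    return target
```

The loop from the course notebook would then presumably become: for idx,cat in cleaner.change(): move_without_overwrite(cleaner.fns[idx], path/cat). Renaming rather than overwriting means neither image is lost, at the cost of file names that no longer match the original download numbering.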
I also found this in another forum post.
After running the cleaner, the files under ‘path’ have changed, so rerun this code to build a new learner:

dls = bears.dataloaders(path)
learn = vision_learner(dls, resnet18, metrics=error_rate)
learn.fine_tune(4)

From there, maybe go back and do more cleaning if you want to. I was getting a lot of issues with deleted images in my cleaning step, so presumably rerunning this code after some cleaning will tidy things up.