Several unwanted behaviors from ImageClassifierCleaner

Hi everyone,

I’d like to report two unwanted behaviors from ImageClassifierCleaner:

  • If you move an image (0022.jpg) to another category (from “man” to “woman” in my case), but that category already contains a file named 0022.jpg, the cleaner doesn’t pick a new filename; it raises an error instead.

  • Once you’ve cleaned, say, “man” - “train”, then click on “man” - “valid” and afterwards come back to “man” - “train”, an error is shown if you’ve deleted files. A better behavior would probably be to show only the pictures that haven’t been validated yet?

I’ll take a look at the source code and see if I can improve this, but it might take me a bit of time so in the meantime I wanted to alert you :wink:


Hello @bdubreu,
For the first problem, my approach is to make my file-change code look like this:

import shutil
from random import randint

for idx,cat in cleaner.change():
    try:
        shutil.move(str(cleaner.fns[idx]), path/cat)
    except Exception as e:
        # path/cat already has a file with this name: rename that one first
        originalfn = str(cleaner.fns[idx]).split('/')[-1]
        ext = originalfn.split('.')[-1]
        newFn = originalfn[:-len(ext)-1]
        shutil.move(str(path/cat/originalfn), str(path/cat/f'{newFn}_{randint(0,10000)}.{ext}'))
        # the original name is free now, so the move can be retried
        shutil.move(str(cleaner.fns[idx]), path/cat)
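
A slightly tidier variant of the same idea, as a rough sketch only: instead of renaming the file that is already in the target folder with a random number, it gives the incoming file a numeric suffix until the name is free. The helper name move_without_overwrite is made up for this post and is not part of fastai.

```python
import shutil
from pathlib import Path

def move_without_overwrite(src, dest_dir):
    """Move src into dest_dir, adding a numeric suffix if the name is taken.

    Hypothetical helper, not a fastai function.
    """
    src, dest_dir = Path(src), Path(dest_dir)
    target = dest_dir / src.name
    counter = 1
    # Try 0022_1.jpg, 0022_2.jpg, ... until an unused name is found
    while target.exists():
        target = dest_dir / f"{src.stem}_{counter}{src.suffix}"
        counter += 1
    shutil.move(str(src), str(target))
    return target

# Intended use with the cleaner (assumes `cleaner` and `path` exist as above):
# for idx, cat in cleaner.change():
#     move_without_overwrite(cleaner.fns[idx], path/cat)
```

Unlike the random suffix, this cannot collide with an existing name, and the file that was already in the category keeps its original name.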

Hi!

Thank you for your answer. Yep, I’ve also come up with similar workarounds.

For the second problem, here is a workaround:

for idx in cleaner.delete():
    # figure out if the idx is in train (0) or valid (1) dataset
    involved_ds = 0 if Path(cleaner.fns[idx]) in cleaner.iwis[0].itemgot(0) else 1
    print('dataset:', ['train', 'valid'][involved_ds])
    # keep all other images
    mask = array(cleaner.iwis[involved_ds].itemgot(0)) != Path(cleaner.fns[idx])
    updated_ds = cleaner.iwis[involved_ds][mask]
    # change cleaner.iwis: needs to be a tuple with (updated train, valid) or (train, updated valid)
    cleaner.iwis = (updated_ds, cleaner.iwis[1]) if involved_ds == 0 else (cleaner.iwis[0], updated_ds)
    # update the learner's dataset:
    learn.dls[involved_ds].dataset = updated_ds
    # remove the pic both from the folder and from the fns
for idx,cat in cleaner.change(): shutil.move(str(cleaner.fns[idx]), path/cat)

Another problem is that when you switch from cleaning one category (in my case black bears) to another (grizzly bears), the cleaner instance seems to lose all the changes you’d flagged for the first category.
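
One possible way around this, as a sketch only: copy the flags out of the cleaner into plain lists before switching category, and apply everything at the end. This assumes cleaner.delete() and cleaner.change() still return the current flags as long as you run the recording step before touching the dropdown. The names record_flags and apply_flags are made up for this post.

```python
import shutil
from pathlib import Path

def record_flags(cleaner, path, deletes, moves):
    """Copy the cleaner's current flags into plain lists (hypothetical helper)."""
    deletes.extend(Path(cleaner.fns[idx]) for idx in cleaner.delete())
    moves.extend((Path(cleaner.fns[idx]), Path(path) / cat)
                 for idx, cat in cleaner.change())

def apply_flags(deletes, moves):
    """Delete and move everything recorded so far, then empty the lists."""
    for fn in deletes:
        fn.unlink()
    for fn, dest_dir in moves:
        shutil.move(str(fn), str(dest_dir))
    deletes.clear()
    moves.clear()

# Intended use (assumes `cleaner` and `path` exist as in the posts above):
# deletes, moves = [], []
# record_flags(cleaner, path, deletes, moves)   # run before switching category
# ...switch category, flag more images, record again...
# apply_flags(deletes, moves)                    # run once at the end
```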


Yeah, I am having the same error. Did you solve that issue?

I didn’t solve it. I just did one category at a time.



OK, thank you.

I had the same issue when moving images that were in the wrong category, and I solved it in a very silly way. I just created two empty folders, one inside each category folder, and named them “moved”. Then I changed the code provided for moving images from:

for idx,cat in cleaner.change(): shutil.move(str(cleaner.fns[idx]), path/cat)

to:

for idx,cat in cleaner.change(): shutil.move(str(cleaner.fns[idx]), path/cat/"moved")

Then I just had to manually move the images from path/cat/“moved” to path/cat/.
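
If you’d rather not do that last step by hand, something like the following might work. It is only a sketch: the function name empty_moved_folder is made up, and it assumes the staging folder is called “moved” as above. Files whose names clash with an existing file are left where they are and returned, so you can rename those yourself.

```python
import shutil
from pathlib import Path

def empty_moved_folder(category_dir, staging_name="moved"):
    """Move files from category_dir/moved up into category_dir.

    Hypothetical helper. Returns the files left behind because a file
    with the same name already exists in category_dir.
    """
    category_dir = Path(category_dir)
    left_behind = []
    for f in sorted((category_dir / staging_name).iterdir()):
        target = category_dir / f.name
        if target.exists():
            left_behind.append(f)   # name clash: do not overwrite
        else:
            shutil.move(str(f), str(target))
    return left_behind

# Intended use (assumes `path` is the dataset root as above):
# for cat_dir in path.iterdir():
#     if (cat_dir / "moved").is_dir():
#         print(empty_moved_folder(cat_dir))
```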

Hope this could be helpful.