Unsupervised Clustering of images of the same class

Hi all,

In Part 1, Lesson 2, we were shown how to download images from Google Images to create our own datasets.

I’ve created a dataset for classifying bird species in Africa, and there are literally 1004 classes. The downloaded images are largely accurate, but 1000-odd classes are too many to clean up manually, especially since you need to look for specific features of each bird to tell whether an image belongs.

So I thought about writing code to clean up each folder individually, either by comparing image similarity or by clustering the images, whichever works. What I want is a way to let the program know what the bird looks like, so that it can delete the images that aren’t supposed to be there (incorrectly downloaded ones).
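To make the idea concrete, here is a rough sketch of the kind of thing I have in mind, not a finished solution. It assumes each image in a folder has already been turned into a feature vector (for example, activations from a pretrained CNN; that step is not shown). The names `flag_outliers` and `min_similarity` are made up for illustration, and the 0.5 threshold is a guess that would need tuning.

```python
import numpy as np

def flag_outliers(features, min_similarity=0.5):
    """Return indices of rows whose cosine similarity to the folder's
    'typical' feature vector is below min_similarity (hypothetical helper)."""
    feats = np.asarray(features, dtype=float)
    # L2-normalise each row so that dot products are cosine similarities
    norms = np.linalg.norm(feats, axis=1, keepdims=True)
    feats = feats / np.clip(norms, 1e-12, None)
    # Element-wise median as the "typical" image; it is less affected
    # by a few wrong images than the mean would be
    centre = np.median(feats, axis=0)
    centre = centre / max(np.linalg.norm(centre), 1e-12)
    sims = feats @ centre
    return np.where(sims < min_similarity)[0]

# Toy example: five similar vectors and one that points somewhere else
feats = [
    [1.0, 0.1, 0.0],
    [1.0, 0.0, 0.1],
    [1.0, -0.1, 0.0],
    [1.0, 0.0, -0.1],
    [0.9, 0.05, 0.05],
    [0.0, 1.0, 0.0],   # the odd one out
]
suspects = flag_outliers(feats, min_similarity=0.5)
print(suspects.tolist())
```

This only works if the correct images are the majority in each folder, and I would probably want to review the flagged images rather than delete them automatically.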

This could probably even be added to @jeremy’s fast.ai library, for use once you’ve already downloaded the images.

A program like this, or references to relevant sites, would be really helpful. Is there code that I can write (or even keywords to search for)? Please help :stuck_out_tongue:
Thanks!