Lesson 2 - How to clean up data more efficiently


Lesson 2 presents an image cleaner: a user interface for viewing images and checking whether they are misclassified or simply wrong.

During my project, I created a large dataset of around 33K images covering around 1100 birds from India.
What is the best way to clean up this data? Manually going through all the images may not be a great idea.



I’m just starting out myself, currently between lessons 1 and 2, and I found the results from DuckDuckGo search too noisy to use for training without checking them first; there are too many wrong results. I plan to write a crawler and inspect the results manually.
I think a good image viewer that allows quick deletion of images is the way to go for getting rid of unwanted training data. If manual inspection is not an option, you’ll have to use another, existing classifier, I guess.



Thanks, Seb.
Yes, Google search provides better-quality results. There is a Python library for it as well, though I haven’t tried it.
As you suggest, perhaps training another classifier (bird / not a bird) would be a solution.
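
A rough sketch of how such a bird / not-a-bird filter could cut down the manual work, in case it helps. It is only an illustration: `triage_images` and `bird_probability` are hypothetical names, `bird_probability` stands in for whatever classifier you end up with (it just has to return a number between 0 and 1 for an image path), and the 0.9 / 0.1 thresholds are arbitrary assumptions that would need tuning on a small hand-checked sample.

```python
from pathlib import Path
from typing import Callable, Dict, Iterable, List


def triage_images(
    paths: Iterable[Path],
    bird_probability: Callable[[Path], float],
    keep_above: float = 0.9,   # assumed threshold, tune on your own data
    drop_below: float = 0.1,   # assumed threshold, tune on your own data
) -> Dict[str, List[Path]]:
    """Split images into keep / review / drop buckets.

    bird_probability is a hypothetical callable wrapping any
    bird / not-a-bird classifier; it returns a score in [0, 1].
    """
    if not 0.0 <= drop_below <= keep_above <= 1.0:
        raise ValueError("need 0 <= drop_below <= keep_above <= 1")

    buckets: Dict[str, List[Path]] = {"keep": [], "review": [], "drop": []}
    for path in paths:
        score = bird_probability(path)
        if score >= keep_above:
            buckets["keep"].append(path)      # confidently a bird
        elif score < drop_below:
            buckets["drop"].append(path)      # confidently not a bird
        else:
            buckets["review"].append(path)    # uncertain: look at it by hand
    return buckets


if __name__ == "__main__":
    # Made-up scores, only to show the call; a real run would use a model.
    fake_scores = {
        Path("a.jpg"): 0.97,
        Path("b.jpg"): 0.50,
        Path("c.jpg"): 0.03,
    }
    result = triage_images(fake_scores, fake_scores.__getitem__)
    for name, items in result.items():
        print(name, [p.name for p in items])
```

With this split, only the "review" bucket needs manual inspection, so the image cleaner from lesson 2 would be used on a fraction of the 33K images rather than all of them. Nothing is deleted by the sketch itself; moving the "drop" files to a separate folder first seems safer than erasing them outright.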