Challenges while creating your own dataset

You could try the 100-million-image Flickr dataset (YFCC100M): http://yfcc100m.appspot.com/. Alternatively, search Google Images over different date ranges to get different image sets, pausing between searches to avoid getting throttled.
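If it helps, here is a rough sketch of the "different date ranges, with pauses" idea. It only covers the windowing and the sleep between calls. `search_fn` is a hypothetical placeholder for whatever you actually use to fetch results (it is not a real Google API), and the 5-second default pause is just a guess, not a known safe rate.

```python
import time
from datetime import date, timedelta


def date_ranges(start, end, days):
    """Split [start, end) into consecutive windows of at most `days` days."""
    ranges = []
    step = timedelta(days=days)
    current = start
    while current < end:
        upper = min(current + step, end)
        ranges.append((current, upper))
        current = upper
    return ranges


def search_with_pauses(query, ranges, search_fn, pause_seconds=5.0):
    """Call search_fn once per date window, sleeping between calls.

    search_fn(query, start_date, end_date) is a hypothetical callable that
    returns a list of image URLs; plug in your own implementation.
    """
    results = []
    for i, (start, end) in enumerate(ranges):
        if i > 0:
            time.sleep(pause_seconds)  # pause to reduce the chance of throttling
        results.extend(search_fn(query, start, end))
    return results
```

You would call it as something like `search_with_pauses("red panda", date_ranges(date(2015, 1, 1), date(2016, 1, 1), 30), my_search)`, where `my_search` is your own function.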

Perhaps run a nearest-neighbor search on the penultimate-layer activations?
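A minimal sketch of what that could look like, assuming you have already extracted one penultimate-layer feature vector per image as a plain list of floats. The function names and the choice of cosine similarity are my own assumptions, and it is a brute-force search in pure Python to keep it dependency-free. For a large collection you would probably want NumPy or a proper nearest-neighbor library instead.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def nearest_neighbors(query, features, k=3):
    """Return indices of the k feature vectors most similar to `query`.

    `features` is assumed to hold one penultimate-layer activation vector
    per image; how you extract them depends on your model and framework.
    """
    scored = [(cosine_similarity(query, f), i) for i, f in enumerate(features)]
    scored.sort(key=lambda pair: pair[0], reverse=True)  # most similar first
    return [index for _, index in scored[:k]]


# Toy example with 2-D "activations" standing in for real features.
features = [[1.0, 0.0], [0.0, 1.0], [0.9, 0.1]]
print(nearest_neighbors([1.0, 0.0], features, k=2))
```

The indices that come back point into your image list, so you can eyeball the closest matches for each query image.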
