Lesson 2: further discussion ✅

Q: I want the model to generalize well, even to the extent that the ‘test’ images look ‘different’ in some ways from the images the model was trained on. The ‘test’ images are not available; they are just expected to look ‘different’ in some ways. Are the ‘standard’ overfitting techniques good enough, or do I need to do something extra, like higher dropout, more aggressive transformations in data augmentation, fewer cycles/epochs, or a higher learning rate? Even if it means sacrificing the validation score?
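To make the question concrete, these are the kinds of knobs I mean (just a sketch with fastai v1; the path and the exact values are illustrative, not recommendations):

from fastai.vision import *

# More aggressive augmentation than the defaults (values are illustrative)
tfms = get_transforms(max_rotate=20., max_zoom=1.3, max_lighting=0.4, max_warp=0.3)
data = ImageDataBunch.from_folder(path, ds_tfms=tfms, size=224).normalize(imagenet_stats)

# Higher dropout on the head via ps (default is 0.5), fewer epochs, etc.
learn = create_cnn(data, models.resnet34, metrics=error_rate, ps=0.6)
learn.fit_one_cycle(4)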

Q: Sometimes (especially after training the head layers and unfreezing) the learning rate finder doesn’t show a characteristic downslope:
[learning rate finder plot without a clear downslope, taken from the lesson 1 nb: https://github.com/fastai/course-v3/blob/master/nbs/dl1/lesson1-pets.ipynb]
From last year’s course we are used to looking for a figure like:
[learning rate finder plot with a clear downslope, taken from https://github.com/fastai/course-v3/blob/master/nbs/dl1/lesson2-planet.ipynb]

My question: How to interpret the learning rate finder results if it doesn’t have a downslope?

27 Likes

So normally in DL there’s a rather vague understanding of what we mean by generalization. Sure, we would want the model to be invariant to all kinds of transforms and even some distortions, but there are limitations to this. One of the most crucial assumptions when we develop DL models is that the train, val and test sets come from the same distribution, which simply means the same dataset.

Although what you suggested might make the model more robust to those variations, I wonder if it would lead to better generalization.

One other thing to mention is that most real-world examples are noisy in nature; what we often don’t realize about popular datasets is that they have been created through careful curation. So to make the model robust to noise as well, adversarial training can be done, as sketched below.
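A minimal sketch of one common form of adversarial training (FGSM-style perturbations); the function name is made up, and it assumes inputs scaled to [0, 1]:

import torch
import torch.nn.functional as F

def fgsm_adversarial(model, x, y, epsilon=0.01):
    """Perturb the batch x in the direction that increases the loss (FGSM).
    Mixing such examples into the training batches is one simple form of
    adversarial training."""
    x = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x), y)
    loss.backward()
    # assumes pixel values scaled to [0, 1]
    return (x + epsilon * x.grad.sign()).clamp(0, 1).detach()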

1 Like

What if we wanted, for example, to take a model trained on images from the ‘south’ and apply it to the ‘north’, or from the ‘east’ to the ‘west’? We could expect some differences in the images. How do we train a model that works on both image sets without having access to both? We can only train on, say, the ‘south’ dataset but must make inference on the ‘north’ dataset. Are there techniques for that?

1 Like

I didn’t understand the inference part. Why are we creating a new data bunch instead of using the previous one?

Can’t we just use the older data bunch and input our image?

empty_data = ImageDataBunch.single_from_classes(path, data.classes, tfms=get_transforms()).normalize(imagenet_stats)

I have a question for after class regarding the “delete photos from dataset” concept (new widget) introduced:

In which cases does it make sense to delete images that “don’t belong”?
In which cases is it better to create a new “other” category in order for the network to be able to discern between the actual classes and random bullshit (“none of the above”)?

Especially in “real-world” multiclass settings involving real people, you will always get those (people uploading hotdog photos to the cat/dog classifier app, etc.).

I have wondered about this, e.g. with the Google Quickdraw dataset. There is no “none”/“other”/“random” category, although clearly a lot of the time people just doodle random stuff not belonging to any of the 340/345 categories. Would it not be helpful to distinguish this instead of predicting one of the existing known classes? Or would this hinder the network from learning the actual classes?

Is it better to train only on the correct categories and then have a mechanism that, based on very low probabilities across all categories, says “none of the above”? (Isn’t this difficult when using softmax, because it will still give you some “winner” category most of the time?)
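Something like this is the kind of mechanism I have in mind (a rough sketch; the threshold is arbitrary and would need tuning):

import torch.nn.functional as F

def predict_with_reject(logits, classes, threshold=0.8):
    """Pick the softmax winner, but answer 'none of the above' when even the
    winner's probability is below a confidence threshold."""
    probs = F.softmax(logits, dim=-1)
    conf, idx = probs.max(dim=-1)
    return classes[idx.item()] if conf.item() >= threshold else 'none of the above'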

4 Likes

You can, while you are still in the same notebook and the same session and have everything initialized anyway.

What this method refers to is the situation where you have trained a model, that “phase” of the project is finished, and now you just want to run the model as part of an app (most likely not within a notebook). You don’t want to load any training or validation data then; you just want to reload your trained model and weights and do inference, meaning making predictions with the learned model.
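For example, roughly along the lines of the lesson 2 workflow (just a sketch; the class list, the weight file name ‘stage-2’, the size, and the image path are assumptions you’d replace with your own):

from fastai.vision import *

# Recreate an "empty" DataBunch: no train/valid data, just the classes plus
# the same transforms, size and normalization used during training
classes = ['black', 'grizzly', 'teddys']   # whatever your classes were
data2 = ImageDataBunch.single_from_classes(
    path, classes, tfms=get_transforms(), size=224).normalize(imagenet_stats)

learn = create_cnn(data2, models.resnet34)
learn.load('stage-2')                      # weights saved after training

img = open_image(path/'some_image.jpg')    # hypothetical image path
pred_class, pred_idx, outputs = learn.predict(img)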

2 Likes

I want to add a note about downloading images. This is the process I did:

  • Download some images from Google to my laptop using this tool.
  • Then clean out bad images by hand.
  • Resize them and create a tarball.
  • Upload it into a GitHub release.
  • Then use that dataset in the notebook as usual.

Here’s the whole process with some more details.

Unfortunately, I couldn’t use untar_data due to an issue. So, I had to come up with a replacement function.
But I’ll try to fix it and do a PR this week.
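For anyone hitting the same issue, a simple workaround (a rough sketch, not my actual code; fetch_dataset is just a made-up name) is to download and extract the tarball manually:

import tarfile
import urllib.request
from pathlib import Path

def fetch_dataset(url, dest='data'):
    """Download a .tgz archive (e.g. a GitHub release asset) and extract it."""
    dest = Path(dest)
    dest.mkdir(parents=True, exist_ok=True)
    archive = dest / Path(url).name
    if not archive.exists():
        urllib.request.urlretrieve(url, archive)
    with tarfile.open(archive) as tar:
        tar.extractall(dest)
    return dest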

5 Likes

Data curation. Deleting images that don’t belong is a part of that. In our case, in the download-images example, we downloaded the data from the website and had no filter to check whether it actually belonged to the classes we wanted.
So when we “deleted” the images which didn’t belong, we were simply curating the data. Another reason we delete such images is that ultimately every image in the dataset gets assigned to some class, and if we keep the wrong ones, two things can happen: the network might learn a wrong representation of that class, and it might misclassify. None of this usually matters much since the number of such examples is very low, but I think it’s more of a “better safe than sorry” practice.

I think having an “other” category is more of a choice than a necessity, in the sense that if you are sure you’ll only input images belonging to the classes you have, it would make little sense to add extra classes. On the flip side, not breaking the model when you input images that don’t belong to any of the classes is a big reason to have an “other” category.

In the case of the Quickdraw dataset, I wonder if random doodles were actually included. I think the dataset was preprocessed and these outliers were removed before being released, but I’m not entirely sure.

1 Like

If I have several instances of the fastai.vision.image.Image class, what is the best way to display them in a grid?

For example:

x1 = open_image('tmp_027.jpg')
x2 = open_image('tmp_029.jpg')
x1.show(title='027',figsize=(5,5))
x2.show(title='029',figsize=(5,5))

will place the two images vertically, but I’d like to put them side by side. If I have more images, then I’ll want to put them in a grid.

Although the above example uses open_image to create the image, in general, the images I want to display are calculated rather than read from disk.

Can plt’s subplot be used? An example would help. Thanks.

1 Like

Which data loader did you use for the Quickdraw dataset?

Question: After yesterday’s class I’m excited to deploy a previous model I wrote and create a web app around it. I wrote my data preparation code in Apache Spark. What is the recommended way to prepare the data during inference for realtime predictions?

I’m not sure Spark is the right choice, since at inference we won’t have huge batches of data but single data points to transform for prediction. But at the same time, if we choose some other data processing engine, I have to rewrite the data processing code for it.

I might be completely wrong here, but in Leslie Smith’s paper it was said that the loss reaches a minimum and then shoots up; so, following back from the shoot-up, the minimum loss should be the lowest point. Here, for the first figure, that is at 1e-4 and the most recent bulge is at 1e-5, so the slice should be slice(1e-5, 1e-4), but I don’t know why the nb has it as slice(1e-6, 1e-4). If the initial part before the shoot-up is completely flat, it would be better if we could zoom in. Can we do that in a Jupyter notebook?
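Maybe something like this works for zooming in, assuming the recorder keeps the recorded learning rates and losses after lr_find (I haven’t verified the attribute names):

import matplotlib.pyplot as plt

# Re-plot the LR finder curve and zoom into the range of interest
lrs, losses = learn.recorder.lrs, learn.recorder.losses
plt.plot(lrs, losses)
plt.xscale('log')
plt.xlim(1e-6, 1e-2)        # zoom window, adjust as needed
plt.xlabel('learning rate')
plt.ylabel('loss')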

1 Like

Take a look at the source code for show_batch and tell us what you can figure out from that. Let us know if you get stuck!
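For example, something along these lines should work, assuming Image.show accepts a matplotlib axis the way show_batch uses it (a sketch, not tested):

import matplotlib.pyplot as plt

imgs   = [x1, x2]                    # any list of fastai Image objects
titles = ['027', '029']

fig, axes = plt.subplots(1, len(imgs), figsize=(10, 5))
for img, title, ax in zip(imgs, titles, axes):
    img.show(ax=ax, title=title)     # draw each Image onto its own axis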

I’m currently participating in the Human Protein Atlas Image Classification challenge on Kaggle and trying to use FastAI V1 for it.

The challenge is that instead of a single 3-channel RGB image, you have 4 grayscale images of the same subcellular structure under different filters (i.e. different chemicals). Each image highlights a different part of the cell, shown below: the protein (green), microtubules (red), nucleus (blue), and endoplasmic reticulum (yellow).

The green image is the one that needs to be classified, and the rest are for reference (but surely useful!). It’s a multilabel classification problem with 28 classes (like “Cytosol” and “Plasma membrane” above).

I have 2 questions:

  1. How to load the 4 images together into a single 4-channel image using FastAI’s ImageDataBunch?
  2. How can we do transfer learning using Resnet34, since the backbone expects a 3-channel RGB image, but here there are 4?
11 Likes

Hi.

I have a question about deployment. Would it be a good idea to use a minimalist Python codebase at inference time?

I wonder if I should export the trained fastai network and use pytorch-cpu (without the fastai library) in my web app. I guess the overhead of fastai would be minimal, but it’d be one less package to install/worry about…

Does this make sense?
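One option I’m considering (just a sketch; I’m assuming the fastai model can be traced with TorchScript, and the preprocessing such as resize/normalize would still have to be replicated in the app):

import torch

# Trace the trained network so the app only needs plain (CPU) PyTorch to run it
model = learn.model.eval().cpu()
example = torch.rand(1, 3, 224, 224)          # dummy input with the training image size
traced = torch.jit.trace(model, example)
traced.save('model_traced.pt')

# In the web app (no fastai import required):
# model = torch.jit.load('model_traced.pt')
# probs = torch.softmax(model(x), dim=1)      # x preprocessed the same way as in training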

AFAIK there is no pretrained architecture for 4-channel images currently available. So the easiest way to make use of fastai is to first generate RGB images and save them to disk. After that you can use the standard approach learned in class. Have a look at the kernels; there are different methods of doing that, the simplest being to drop the yellow channel entirely, or to use cv2 to merge/blend the 4 images into 3-channel images.

The other approach would be to modify an architecture to load in 4 channels in the first layer instead of 3, but then you have to do a lot of stuff manually. There is actually also a kernel using fastai 0.7 showing that.
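A rough sketch of that first-layer modification with plain torchvision (not the actual code from the fastai 0.7 kernel) could look like this:

import torch
import torch.nn as nn
from torchvision.models import resnet34

model = resnet34(pretrained=True)
old = model.conv1                     # Conv2d(3, 64, kernel_size=7, stride=2, padding=3)
new = nn.Conv2d(4, 64, kernel_size=7, stride=2, padding=3, bias=False)
with torch.no_grad():
    new.weight[:, :3] = old.weight                # reuse the pretrained RGB filters
    new.weight[:, 3]  = old.weight.mean(dim=1)    # init the 4th channel from their mean
model.conv1 = new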

I think there was nothing discussed here that isn’t on Kaggle already, but just as a reminder, questions regarding ongoing competitions should be asked on Kaggle.

6 Likes

Thanks for the detailed explanation @marcmuc !

I’m not really looking to discuss any strategy; I simply wanted to figure out a good way to load the data and train using FastAI v1. The idea of separately generating RGB images seems like a good starting point. But it would be nice to have the ability to specify custom image loading logic in ImageDataBunch.

I checked out the fastai 0.7 kernel you mentioned. From what I understand, it rewrites some internal classes from scratch. It’s probably going to be a bit harder to do with v1, especially since it’s under active development. But I’ll eventually give it a shot.

Cheers!

1 Like

Interesting question.
Do we need to clean our data of photos with text or watermarks?
Nearly half of all the photos include text or a watermark.
So what should we do?

1 Like

Jeremy said that the training loss being lower than the validation loss does not always mean we are overfitting. Honestly, this conflicts with what I have learned before. Is this specific to deep learning, due to things I don’t know yet?

I always thought that the training loss being lower than the validation loss meant the model was memorizing the specific pixel values that make up a certain structure rather than learning the general structure. Do traditional machine learning algorithms differ from deep learning in this regard?

1 Like