All the datasets and competitions I have seen so far have definite categories: cats vs. dogs, 120 dog breeds, etc. But for real-life applications there is a good chance that in production you have to classify images where none of your categories applies. For instance, in @jeremy 's dogs vs. cats example, a picture of a fence was classified as a dog.
This is an unsatisfying situation. So my question is: how do you handle these non-matching situations? I can think of two ways:
1.) Based on the model's probability, return "not sure" if the top probability is below some limit (e.g. 0.9).
2.) Train a specific category that is orthogonal to your real categories.
In the latter case, what would such an "anything else" category look like?
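For what it's worth, option 1 can be sketched in a few lines. This is a minimal example assuming you already have a probability vector from your model's softmax output; the 0.9 threshold is just the example value from above and would need tuning on a validation set that contains out-of-category images:

```python
import numpy as np

def predict_with_reject(probs, threshold=0.9):
    """Return the predicted class index, or None ("not sure")
    when the top probability falls below the threshold."""
    top = int(np.argmax(probs))
    return top if probs[top] >= threshold else None

# A confident prediction vs. an ambiguous one
print(predict_with_reject(np.array([0.05, 0.93, 0.02])))  # 1
print(predict_with_reject(np.array([0.40, 0.35, 0.25])))  # None
```

One caveat: networks are often confidently wrong on out-of-distribution inputs (the fence/dog example above may well have had a high probability), so thresholding alone is not a complete solution.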
Check out this post: https://medium.com/@timanglade/how-hbos-silicon-valley-built-not-hotdog-with-mobile-tensorflow-keras-react-native-ef03260747f3
“The final composition of our dataset was 150k images, of which only 3k were hotdogs: there are only so many hotdogs you can look at, but there are many not hotdogs to look at. The 49:1 imbalance was dealt with by setting a Keras class weight of 49:1 in favor of hotdogs. Of the remaining 147k images, most were of food, with just 3k photos of non-food items, to help the network generalize a bit more and not get tricked into seeing a hotdog if presented with an image of a human in a red outfit.”
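To make the 49:1 weighting concrete, here is a small sketch of how you could derive such a class-weight dict from the counts in the quote (the class indices 0 = "not hotdog" and 1 = "hotdog" are my assumption, not from the article):

```python
# Image counts per class, taken from the quoted post:
# 147k "not hotdog" vs. 3k "hotdog".
counts = {0: 147_000, 1: 3_000}

# Weight each class inversely to its frequency so the rare class
# contributes as much to the loss as the common one.
max_count = max(counts.values())
class_weight = {cls: max_count / n for cls, n in counts.items()}
print(class_weight)  # {0: 1.0, 1: 49.0}
```

In Keras this dict would then be passed to training via `model.fit(..., class_weight=class_weight)`.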
Thanks a lot. This is a really cool and helpful write-up from a practical perspective.
In an extreme case, you could create an "Other" category that comprises all 1000 categories of Imagenet (or 999 if what you are trying to classify is already a class in Imagenet).
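The relabeling itself is trivial. A hypothetical sketch, assuming you keep your real categories in a set and fold every other label (Imagenet or otherwise) into "other" when building the training labels:

```python
# The real categories you care about (assumption for illustration).
targets = {"cat", "dog"}

def relabel(label):
    """Collapse any label outside the target set into 'other'."""
    return label if label in targets else "other"

print(relabel("dog"))       # dog
print(relabel("goldfish"))  # other  (an Imagenet class folded into "other")
```

The practical concern is balance: 998+ Imagenet classes dumped into one bucket will dwarf your real categories, so you would combine this with class weighting as in the hotdog example above.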