One of the issues I had when I put my model into production (a simple web app) was handling images that are very different from anything in the training/validation set.
I ran into this accidentally when I took my Teddy Bear Detector and fed it images from the pet breeds database, and cats were being classified as teddy bears.
So I tried to fix this by putting a threshold on the probability (e.g. if the probability of the predicted class was less than 85%, give up and say "I don't know").
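In case it helps to make the idea concrete, here is a minimal sketch of that thresholding approach, assuming the model returns softmax probabilities as a NumPy array (the class names and probabilities below are made up for illustration):

```python
import numpy as np

def predict_with_threshold(probs, classes, threshold=0.85):
    """Return the top class, or "I don't know" when the top
    softmax probability falls below the threshold."""
    top = int(np.argmax(probs))
    if probs[top] < threshold:
        return "I don't know"
    return classes[top]

# Hypothetical softmax output for a cat image fed to a bear classifier:
classes = ["teddy bear", "grizzly bear", "black bear"]
probs = np.array([0.55, 0.30, 0.15])
print(predict_with_threshold(probs, classes))  # prints "I don't know"
```

The catch, of course, is that softmax probabilities can still be high on out-of-distribution inputs, which is why I'm asking about better-founded approaches.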
I was wondering if there are any well-known approaches for estimating how confident the model is in its prediction and deciding when to say "I'm not sure" or "I don't know".
I would love to get some pointers to deeper studies regarding this issue.