Proper project setup for multi-class classification

So I decided to create a computer vision app inspired by both the transfer-learning approach taught in the first lecture and HBO’s show Silicon Valley.

The idea is to clone the Seefood app (classify an image as hotdog or not hotdog) and extend it to recognize other types of food, e.g. taco and lobster.

I have two plans for the model, and I am not sure which is the better approach.

Approach 1:

  • The model is trained on the N classes of food to be recognized, plus an unknown category made up of random images of objects that do not belong to any of the N classes.

  • The last softmax layer should output:

  • [P(hotdog), P(taco), P(lobster), … a bunch of food …, P(unknown)]

  • The model returns the category with the highest probability.
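
To make Approach 1 concrete, here is a minimal sketch in plain Python. The label list, the example logits, and the helper names (`softmax`, `predict_with_unknown_class`) are all made up for illustration; in a real app the logits would come from the trained network's last layer.

```python
import math

# Hypothetical label set: N food classes plus an explicit "unknown" class.
LABELS = ["hotdog", "taco", "lobster", "unknown"]

def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def predict_with_unknown_class(logits, labels=LABELS):
    """Approach 1: "unknown" is just another class, so take the argmax."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return labels[best], probs[best]

# Made-up logits standing in for the network output on two images.
taco_label, taco_prob = predict_with_unknown_class([0.2, 3.1, 0.4, 1.0])
other_label, other_prob = predict_with_unknown_class([0.1, 0.3, 0.2, 2.5])
print(taco_label, round(taco_prob, 3))
print(other_label, round(other_prob, 3))
```

With this setup there is no extra decision rule at inference time: whether an image is rejected depends entirely on what the unknown class saw during training.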

Approach 2:

  • The model is trained only on the N classes of food to be classified.

  • The last softmax layer outputs:

  • [P(hotdog), P(taco), P(lobster), … a bunch of food …]

  • We set a threshold probability.

  • If no class reaches a probability above the threshold, the model returns the unknown label.

  • If one or more classes reach a probability above the threshold, the model returns the class with the highest probability.
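
Approach 2 can be sketched the same way. Again, the labels, the logits, and the function names are hypothetical, and the threshold of 0.7 is an arbitrary placeholder rather than a recommended value. Since only the top class is ever returned, checking whether the highest probability clears the threshold is enough.

```python
import math

# Hypothetical label set: only the N food classes, no "unknown" output.
FOOD_LABELS = ["hotdog", "taco", "lobster"]

def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def predict_with_threshold(logits, threshold=0.7, labels=FOOD_LABELS):
    """Approach 2: return "unknown" unless the top class beats the threshold."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    if probs[best] > threshold:
        return labels[best], probs[best]
    # No class is above the threshold, so fall back to the unknown label.
    return "unknown", probs[best]

# Made-up logits: one confident prediction and one near-uniform one.
confident = predict_with_threshold([0.1, 4.0, 0.2])
uncertain = predict_with_threshold([1.0, 1.1, 0.9])
print(confident[0], round(confident[1], 3))
print(uncertain[0], round(uncertain[1], 3))
```

Here the rejection behaviour is controlled by a single number that can be changed after training, without retraining the model.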

Questions:

  1. Which is the better approach?
  2. If the first approach is better, how do I pick the ideal data distribution for the unknown category’s training data? (Do I just grab non-food pictures from random categories and dump them into the unknown training data?)
  3. If the second approach is better, how should I decide on the proper threshold probability?

All input is appreciated. Thanks!