Accuracy metric

In mnist_basics, we choose 0 and 0.5 as thresholds in the accuracy metric. What is the motivation behind that? Is it arbitrary? Could we use any value as a threshold? In the sigmoid case I can see that even if the threshold were arbitrary, it would still have to lie between 0 and 1.
Thank you for clarifying. :pray:

@ishan it's not clear to me what you mean by “we make a choice to use 0 and 0.5 as a threshold in the accuracy metric”; if there is a specific function in this chapter, please share it, as that will make it easier to help. Intuitively I guess you are talking about the threshold applied to the sigmoid output to pick one of the classes (if > 0.5 then 1, else 0), with accuracy then calculated from those predictions (the batch_accuracy function in this chapter). The sigmoid output is generally interpreted as a probability (although that interpretation is not entirely straightforward), so e.g. if, for a given MNIST image, the probability of it being a 3 is > 0.5, we assume it is a 3. We choose the label with the most likely prediction!
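For reference, batch_accuracy looks roughly like this (a sketch from memory of the chapter's function; the mini-batch of activations and labels below is made up just to show the thresholding):

```python
import torch

def batch_accuracy(xb, yb):
    # xb are the raw model outputs; sigmoid squashes them into (0, 1)
    preds = xb.sigmoid()
    # threshold at 0.5: predictions above 0.5 count as class 1 ("is a 3"), otherwise 0
    correct = (preds > 0.5) == yb
    return correct.float().mean()

# Illustrative activations and labels (not from the book)
xb = torch.tensor([2.0, -1.0, 0.3, -0.2])
yb = torch.tensor([1., 0., 0., 0.])
print(batch_accuracy(xb, yb))  # tensor(0.7500) -> 3 of 4 predictions match the labels
```

Note that a threshold of 0.5 on the sigmoid output is the same decision as a threshold of 0 on the raw activations, since sigmoid(0) = 0.5, which is why the chapter can use 0 before applying sigmoid and 0.5 after.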

By default the threshold is 0.5. In real-world applications you may move it depending on the context of the problem you are solving, e.g. one of the classes you want to predict carries a greater risk for your company, and you are willing to sacrifice the model's overall performance in exchange for better sensitivity to that specific class. You may want to google the precision vs. recall tradeoff and the ROC curve (not strictly DL related!)
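As a rough sketch of what moving the threshold does (the probabilities, labels, and threshold values here are invented for illustration): lowering the threshold catches more of the risky positive class (higher recall) at the cost of more false alarms (lower precision).

```python
import torch

probs  = torch.tensor([0.95, 0.60, 0.45, 0.40, 0.10])  # sigmoid outputs
labels = torch.tensor([1, 1, 0, 1, 0])                  # ground truth

for threshold in (0.5, 0.35):  # lower threshold -> more examples predicted positive
    preds = (probs > threshold).long()
    tp = ((preds == 1) & (labels == 1)).sum().item()
    fp = ((preds == 1) & (labels == 0)).sum().item()
    fn = ((preds == 0) & (labels == 1)).sum().item()
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall    = tp / (tp + fn) if (tp + fn) else 0.0
    print(f"threshold={threshold}: precision={precision:.2f}, recall={recall:.2f}")

# threshold=0.5:  precision=1.00, recall=0.67
# threshold=0.35: precision=0.75, recall=1.00
```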

Thank you so much for your response, Michal! I really appreciate it.
Your response makes total sense. The idea of 50% being the default in the binary case, and of moving the threshold based on risk tolerance, made things even clearer.
Thanks for the suggestion on precision vs. recall and the ROC curve. :pray: