Validation Loss vs Accuracy

Empirically, accuracy seems like quite a limited measure of the quality of predictions. To predict whether an example belongs to some class, our model outputs a number between 0 and 1 (whatever comes out of the sigmoid or softmax).

To calculate accuracy, we pick some arbitrary threshold (0.5 by default): every prediction above it means the example belongs to the class, and every prediction below it means it doesn't. This threshold of 0.5 gets dicey really fast if we don't have perfectly balanced classes (50% positive and 50% negative examples) or if we have multiple classes.

What happens when we have 90% negative examples and 10% positive examples? Is 91% accuracy good or bad? A model that blindly predicts the negative class for everything already scores 90%, so 91% is barely better than ignoring the input entirely.
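A quick sketch of this failure mode (the dataset and scores here are made up for illustration): a "model" that outputs the same low score for every example still hits 90% accuracy at the default 0.5 threshold, purely because of the class imbalance.

```python
import numpy as np

# Hypothetical imbalanced dataset: 90 negatives, 10 positives.
y_true = np.array([0] * 90 + [1] * 10)

# A useless "model" that outputs the same low score for everything.
y_score = np.full(100, 0.1)

# Thresholding at 0.5 predicts the negative class for every example.
y_pred = (y_score >= 0.5).astype(int)

accuracy = (y_pred == y_true).mean()
print(accuracy)  # 0.9 -- 90% accuracy without learning anything
```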

The best interpretation of 'accuracy goes up and loss goes up' imho is: our model is getting better at landing on the right side of whatever threshold we set, even as its raw probability estimates drift further from the true labels.

There are other metrics that take the classifier's performance at different thresholds into account, for example the area under the ROC curve (AUC) or mean average precision.
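With scikit-learn (assuming it's available; the toy data is the same made-up 90/10 split as above), these threshold-free metrics immediately expose the degenerate model that thresholded accuracy flattered:

```python
import numpy as np
from sklearn.metrics import roc_auc_score, average_precision_score

# Same hypothetical imbalanced dataset: 90 negatives, 10 positives.
y_true = np.array([0] * 90 + [1] * 10)

# The constant-score "model" ranks positives no better than negatives.
y_score = np.full(100, 0.1)

auc = roc_auc_score(y_true, y_score)           # 0.5: pure chance ranking
ap = average_precision_score(y_true, y_score)  # 0.1: just the base rate
print(auc, ap)
```

Unlike accuracy, both metrics evaluate the ranking the scores induce over all possible thresholds, so a model that cannot separate the classes gets no credit for the imbalance.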

Validation loss is nice because it measures how far our predictions are from what they should be before we put them through the threshold.
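To make that concrete, here is a small sketch (toy labels and probabilities, invented for illustration) of two models that are indistinguishable by accuracy at the 0.5 threshold, yet clearly distinguished by binary cross-entropy loss computed on the raw probabilities:

```python
import numpy as np

def binary_cross_entropy(y_true, y_prob):
    # Mean negative log-likelihood of the true labels.
    eps = 1e-12
    y_prob = np.clip(y_prob, eps, 1 - eps)
    return -np.mean(y_true * np.log(y_prob) + (1 - y_true) * np.log(1 - y_prob))

y_true = np.array([1, 1, 0, 0])

# Two models with identical predictions at threshold 0.5 ...
confident = np.array([0.95, 0.90, 0.05, 0.10])
hesitant = np.array([0.55, 0.60, 0.45, 0.40])

def accuracy(y_prob):
    return ((y_prob >= 0.5).astype(int) == y_true).mean()

print(accuracy(confident), accuracy(hesitant))  # both 1.0

# ... but loss still sees the difference in the raw probabilities.
loss_confident = binary_cross_entropy(y_true, confident)  # small
loss_hesitant = binary_cross_entropy(y_true, hesitant)    # larger
print(loss_confident, loss_hesitant)
```

This is also why loss and accuracy can move in opposite directions during training: accuracy only changes when a prediction crosses the threshold, while loss changes continuously with every probability.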
