I thought validation loss had a direct relationship with accuracy, meaning that lower validation loss always leads to higher accuracy, but while training a model I ran into this:
How is this possible? Why do we have lower validation loss but also lower accuracy?
It relates to the loss function. If we use mean squared error (MSE) as the loss, we optimise by reducing the average squared distance between our predictions and the true values - not by minimising misclassification (I'm assuming this is classification). You may get some intuition about this from drawing decision boundaries between classes in something like the iris dataset (http://scikit-learn.org/stable/auto_examples/tree/plot_iris.html). You may also see how you could move decision boundaries and still have the same accuracy but a wider margin between classes (how loss can improve while accuracy stays the same) - SVM and boosting examples often show the max-margin idea. Playing with logistic regression with and without outliers in 2D may also help.
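A minimal sketch of that point, with made-up numbers: two sets of predicted probabilities can classify every example identically at a 0.5 threshold, yet have very different MSE, because the loss also cares about how far each prediction is from the true value.

```python
import numpy as np

# Binary targets and two sets of predicted probabilities (made-up numbers).
y_true = np.array([1, 1, 0, 0])
preds_a = np.array([0.6, 0.7, 0.4, 0.3])    # barely on the right side of 0.5
preds_b = np.array([0.9, 0.95, 0.1, 0.05])  # same side of 0.5, but more confident

for name, p in [("a", preds_a), ("b", preds_b)]:
    acc = np.mean((p > 0.5) == y_true)   # accuracy after thresholding
    mse = np.mean((p - y_true) ** 2)     # mean squared error on the raw probabilities
    print(f"preds_{name}: accuracy={acc:.2f}  MSE={mse:.4f}")

# Both sets are 100% accurate, but preds_b has a much lower MSE:
# the loss keeps improving even when accuracy cannot.
```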
Jeremy mentioned an F2 metric (I think that's what was used - it's related to how heavily missed positives are penalised relative to false alarms) - there's a whole family of metrics along those lines that focus on classification performance rather than the raw loss.
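If you want to try a metric along those lines, scikit-learn has fbeta_score; with beta=2 it weights recall more heavily than precision, and with beta=0.5 the opposite. A quick sketch with made-up labels:

```python
from sklearn.metrics import fbeta_score, precision_score, recall_score

# Made-up binary labels and predictions, for illustration only.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0, 0, 0, 0, 1]

print("precision:", precision_score(y_true, y_pred))  # TP / (TP + FP)
print("recall:   ", recall_score(y_true, y_pred))     # TP / (TP + FN)
# beta=2 weights recall twice as much as precision; beta=0.5 does the opposite.
print("F2:       ", fbeta_score(y_true, y_pred, beta=2))
print("F0.5:     ", fbeta_score(y_true, y_pred, beta=0.5))
```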
Empirically, accuracy seems like quite a limited measure of prediction quality. To predict whether an example belongs to some class, our model outputs a number between 0 and 1 (the output of a sigmoid or softmax).
To calculate accuracy, we take some arbitrary threshold (0.5 by default): every prediction above it means the example belongs to the class, and everything below it means it doesn't. This threshold of 0.5 gets dicey really fast if we don't have perfectly balanced classes (50% positive and 50% negative examples) or if we have multiple classes.
What happens when we have 90% negative examples and 10% positive examples? Is 91% accuracy good or bad?
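To make that concrete (the numbers here are made up): a "model" that predicts negative for everything already gets 90% accuracy on such a dataset, so 91% is barely better than doing nothing.

```python
import numpy as np

# 90% negative, 10% positive (made-up, imbalanced labels).
y_true = np.array([0] * 90 + [1] * 10)

# A "model" that always predicts the majority class.
always_negative = np.zeros(100, dtype=int)

accuracy = np.mean(always_negative == y_true)
print(f"accuracy of predicting all negatives: {accuracy:.2f}")  # 0.90

# 91% accuracy is only one correct prediction better than this trivial baseline,
# even though the baseline never finds a single positive example.
```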
The best interpretation of "accuracy goes up and loss goes up", IMHO, is: "our model is getting better at accuracy with whatever threshold we set, even though its raw predictions are drifting further from the targets."
Validation loss is nice because, in some sense, it measures how far our predictions are from what they should be before we put them through the threshold.
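A small sketch of how the two can move in opposite directions (again with made-up probabilities): log loss is computed on the raw probabilities, while accuracy only sees which side of 0.5 they land on, so a model can lose a little accuracy and still get a much better loss.

```python
import numpy as np
from sklearn.metrics import accuracy_score, log_loss

# Made-up probabilities for the same 5 validation examples.
y_true = np.array([1, 1, 1, 0, 0])

# Model A: every example lands on the right side of 0.5, but only just.
probs_a = np.array([0.55, 0.55, 0.55, 0.45, 0.45])
# Model B: one example slips to the wrong side; the rest are confident and correct.
probs_b = np.array([0.45, 0.95, 0.95, 0.05, 0.05])

for name, p in [("A", probs_a), ("B", probs_b)]:
    acc = accuracy_score(y_true, (p > 0.5).astype(int))
    loss = log_loss(y_true, p)
    print(f"model {name}: accuracy={acc:.2f}  log loss={loss:.3f}")

# Model B has lower accuracy (0.80 vs 1.00) but also a lower log loss:
# the loss rewards how close the probabilities are, not just which side of 0.5 they fall on.
```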
I'm getting a really weird sequence of numbers for validation loss vs. accuracy. While the training loss keeps getting smaller, the validation loss fluctuates a lot. At the same time, the quantity reported as "accuracy" (which I still don't know how it is computed) fluctuates within a small range. I'm training on a set of news articles, and below are both the output of the fastai built-in function and the output of the final classification on the train, validation and test datasets, calculated by the well-known scikit-learn classification_report function. As you can see, the accuracy at the very final epoch of training is reported as 0.567500; however, I actually get good precision and recall on the train, validation and test sets.
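For reference, this is roughly how such a report is produced from hard class predictions; the arrays here are placeholders for your own validation labels and the argmax of your model's predicted probabilities.

```python
from sklearn.metrics import classification_report

# Placeholder arrays: replace with your validation labels and your model's
# predicted classes (e.g. the argmax of its output probabilities).
y_valid = [0, 2, 1, 1, 0, 2, 2, 1, 0, 1]
y_pred  = [0, 2, 1, 0, 0, 2, 1, 1, 0, 1]

# Prints per-class precision, recall, F1 and support, plus overall accuracy.
print(classification_report(y_valid, y_pred, digits=3))
```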