Training and validation loss both decrease but accuracy doesn't increase

In my dogbreed notebook, I’m seeing this:

[ 0.       0.27302  0.21015  0.93597]                        
[ 1.       0.25252  0.2081   0.9306 ]                        
[ 2.       0.24493  0.20816  0.93551]                        
[ 3.       0.22577  0.20581  0.93451]                        
[ 4.       0.22319  0.20776  0.93395] 

Between epochs 0 and 1, the training loss decreased (0.273 -> 0.253) and the validation loss decreased (0.210 -> 0.208), yet the overall accuracy dropped from 0.936 to 0.931.

I would definitely expect it to increase if both losses are decreasing. Am I misunderstanding something about how the accuracy is calculated, or does this look like a rounding error?

I think fluctuations like this are normal. It comes down to how loss and accuracy are calculated: accuracy only considers whether a prediction is right, not “how right” it is. (E.g. in dogs vs. cats, it doesn’t matter whether your network predicts a cat with 51% or 99% certainty; for accuracy both simply mean ‘cat’.) The loss function, on the other hand, does take into account “how right” your prediction is.

So in your example, maybe your network got fewer images right, but the ones it got right it got “more right”, haha. Sorry if this is confusing, feel free to ask :grinning:
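To make that concrete, here is a minimal sketch with made-up numbers (the labels and probabilities are hypothetical, not taken from the dogbreed notebook, and binary cross-entropy with a 0.5 threshold is an assumption). It shows one way the mean loss can go down while accuracy also goes down:

```python
import math

def log_loss(y_true, p_pred):
    """Mean binary cross-entropy; p_pred is the predicted probability of class 1."""
    total = 0.0
    for y, p in zip(y_true, p_pred):
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(y_true)

def accuracy(y_true, p_pred, threshold=0.5):
    """Fraction of samples whose thresholded prediction matches the label."""
    hits = sum(int(p >= threshold) == y for y, p in zip(y_true, p_pred))
    return hits / len(y_true)

# Hypothetical labels: four samples, all of class 1.
y_true = [1, 1, 1, 1]

# Hypothetical "earlier epoch": every sample is correct, but only barely.
epoch_a = [0.55, 0.55, 0.55, 0.55]
# Hypothetical "later epoch": three very confident hits and one narrow miss.
epoch_b = [0.99, 0.99, 0.99, 0.45]

loss_a, acc_a = log_loss(y_true, epoch_a), accuracy(y_true, epoch_a)
loss_b, acc_b = log_loss(y_true, epoch_b), accuracy(y_true, epoch_b)

print(f"earlier: loss={loss_a:.3f} acc={acc_a:.2f}")
print(f"later:   loss={loss_b:.3f} acc={acc_b:.2f}")
```

The later epoch has the lower loss because three predictions became much more confident, even though one prediction slipped just under the threshold and cost a quarter of the accuracy.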


Sometimes it helps to look at another metric in addition to loss and accuracy. Another popular and useful metric for binary classification is AUC (area under the ROC curve). In particular, if you have an imbalanced dataset, accuracy can be very misleading: for example, with 90% of one class and 10% of another, just guessing the majority class every time gives you 90% accuracy, yet the classifier is not useful. So it’s important to look at the balance between true positives and false positives.

It’s pretty easy to use this metric; see the code below:

from sklearn import metrics
roc_auc = metrics.roc_auc_score

# y: true binary labels; pred: predicted scores or probabilities for the positive class
roc_auc(y, pred)
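As a rough illustration of the 90%/10% case mentioned above, the sketch below uses made-up labels and a hand-written pairwise AUC (a hypothetical helper, not sklearn's implementation) so that it runs without any dependencies. A "classifier" that always predicts the majority class gets high accuracy but an AUC no better than chance:

```python
def pairwise_auc(y_true, scores):
    """AUC as the fraction of (positive, negative) pairs where the positive
    sample gets the higher score; tied pairs count as half."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))

# Hypothetical imbalanced labels: 90% class 0, 10% class 1.
y_true = [0] * 90 + [1] * 10
# Every sample gets the same low score, i.e. the model always says "class 0".
constant_scores = [0.1] * 100

acc = sum(int(s >= 0.5) == y for y, s in zip(y_true, constant_scores)) / len(y_true)
auc = pairwise_auc(y_true, constant_scores)

print(f"accuracy={acc:.2f} auc={auc:.2f}")
```

Because every positive/negative pair is tied, the AUC comes out at 0.5, which is what random ranking would give, while accuracy still reads 0.90.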

Is there a way to optimize for AUC as a loss function for columnar neural network training?

These are the built-in loss criteria I see:

I’m working on an old Kaggle competition that is judged on AUC for predicted click-through rate. (I’m currently using binary cross entropy as the loss function.)

Hi @xtermz
Did you find a way to optimize AUC in the loss function? I am working on a competition that uses AUC as its metric.

@shakur Unfortunately I didn’t. From what I understand, AUC can’t be optimized directly because it isn’t differentiable. I ended up sticking to Binary Cross Entropy for my competition specifically.
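One way to see why, sketched with made-up scores and a hand-written pairwise AUC (a hypothetical helper, not a library function): AUC depends only on how the scores are ranked, so a small nudge to a score usually changes nothing, and then the value jumps when two scores swap order. A gradient-based optimizer gets no useful signal from a function like that.

```python
def pairwise_auc(y_true, scores):
    """AUC as the fraction of (positive, negative) pairs where the positive
    sample gets the higher score; tied pairs count as half."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))

# Hypothetical labels and scores; the third sample is a positive ranked too low.
y_true = [0, 0, 1, 1]
base = [0.10, 0.40, 0.35, 0.80]

# Nudge the third score slightly: the ranking is unchanged, so AUC is unchanged.
nudged = [0.10, 0.40, 0.36, 0.80]
# Push it just past the negative at 0.40: the ranking flips and AUC jumps.
jumped = [0.10, 0.40, 0.41, 0.80]

auc_base = pairwise_auc(y_true, base)
auc_nudged = pairwise_auc(y_true, nudged)
auc_jumped = pairwise_auc(y_true, jumped)

print(auc_base, auc_nudged, auc_jumped)
```

Binary cross-entropy, by contrast, changes smoothly with every predicted probability, which is why it is the usual training loss even when the leaderboard metric is AUC.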

Hi! I’m having a similar (but bigger) problem. When I train my object detection model it originally predicts every pixel as a positive. After a few images, it stops predicting any pixels as positive. I end up with large TN and FN values and 0 for TP and FP. Do you have any idea why this would happen? I’m using a unet architecture and I’m using MSE as my loss function. I have tried changing my optimizer, learning rate, and loss function with no success.