As I understand it, AUC cannot be used directly as a loss function in a neural network context, for a couple of reasons.
The only thing that matters for an AUC score is the relative order of the predictions - the metric is invariant under any order-preserving (monotonic) transformation of the scores, as the quick demonstration below shows. AUC is also typically computed over the entire set of predictions - in other words over all training examples (or perhaps all validation examples). You could compute AUC on a mini-batch of examples, but that would only be a noisy estimate of the full-dataset AUC, varying with how each mini-batch happens to be sampled.
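To make the rank-invariance point concrete, here is a quick demonstration (my own toy numbers, using scikit-learn's `roc_auc_score`): any strictly increasing transformation of the scores leaves the AUC unchanged, because only the ordering of the predictions matters.

```python
import numpy as np
from sklearn.metrics import roc_auc_score

y_true = np.array([0, 0, 1, 0, 1, 1])
scores = np.array([0.1, 0.4, 0.35, 0.2, 0.8, 0.7])

# All three calls print the same AUC (8/9 here), because each
# transformation preserves the ordering of the scores.
print(roc_auc_score(y_true, scores))
print(roc_auc_score(y_true, 10 * scores - 3))  # affine rescaling
print(roc_auc_score(y_true, np.exp(scores)))   # monotonic but non-linear
```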
The more important issue is that the AUC function is not differentiable, so back-propagation of errors through the network for gradient descent is not possible. There have been various attempts to create differentiable approximations of the AUC metric to be used as a loss function with gradient descent, but they don't seem to have gathered much momentum. I believe that is because they require changes to the gradient descent algorithm with respect to the way training examples are fed in: the surrogate losses are defined over pairs of positive and negative examples rather than over individual examples. See these two papers (a sketch of one such pairwise surrogate follows the links):
https://icml.cc/Conferences/2004/proceedings/papers/132.pdf and
http://www.icml-2011.org/papers/198_icmlpaper.pdf
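For a flavour of what those approximations look like, here is a minimal sketch (my own simplification, not code from either paper) of a common pairwise surrogate in PyTorch: replace the non-differentiable indicator 1[s_pos > s_neg] in the Wilcoxon-Mann-Whitney form of AUC with a smooth sigmoid, then minimise its complement.

```python
import torch

def pairwise_auc_surrogate_loss(scores, labels):
    """scores: (N,) raw model outputs; labels: (N,) tensor of 0s and 1s.
    Assumes the batch contains at least one example of each class."""
    pos = scores[labels == 1]  # scores of positive examples
    neg = scores[labels == 0]  # scores of negative examples
    # All pairwise differences s_pos - s_neg, shape (n_pos, n_neg)
    diffs = pos.unsqueeze(1) - neg.unsqueeze(0)
    # sigmoid(diff) is a smooth stand-in for the indicator that the
    # pair is correctly ordered; its mean approximates AUC, so we
    # minimise one minus that mean.
    return 1.0 - torch.sigmoid(diffs).mean()
```

Note that the loss is defined over every positive-negative pair in the batch, which is exactly why ordinary per-example mini-batch sampling becomes awkward: each batch must contain both classes, and the pair statistics vary from batch to batch.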
If, as @SBecker suggests, you want to use AUC as an evaluation metric instead, that is certainly possible. There is some neat code here that @joshfp posted which uses a callback function to calculate the AUC score at the end of each epoch of training.
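In case that link goes stale, here is a minimal sketch of the same idea (my own code, not @joshfp's, and assuming a Keras model with a single sigmoid output): a callback that computes validation AUC with scikit-learn at the end of each epoch.

```python
import numpy as np
from sklearn.metrics import roc_auc_score
from tensorflow import keras

class AUCCallback(keras.callbacks.Callback):
    """Computes AUC on a held-out validation set after each epoch."""

    def __init__(self, x_val, y_val):
        super().__init__()
        self.x_val = x_val
        self.y_val = y_val

    def on_epoch_end(self, epoch, logs=None):
        # Predicted probabilities for the positive class
        y_pred = self.model.predict(self.x_val, verbose=0).ravel()
        auc = roc_auc_score(self.y_val, y_pred)
        print(f"Epoch {epoch + 1}: validation AUC = {auc:.4f}")

# Usage:
# model.fit(x_train, y_train, callbacks=[AUCCallback(x_val, y_val)])
```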