Why use loss vs lr and not accuracy vs lr in lr_find?

kushaj · March 11, 2019, 1:20pm

In Leslie N. Smith’s paper, the discussion focuses on accuracy vs. learning rate, but fastai plots loss vs. lr. Is there a reason for this, or is it just to reduce computation?

Is there any way in which I can use accuracy in the plot?

machinethink · March 11, 2019, 5:15pm

I don’t know if this is the reason but accuracy is only a useful metric for certain tasks (such as classification but not for regression, for example), while you will always have a loss regardless of the task.

KarlH · March 11, 2019, 6:56pm

One reason I can think of is that accuracy reduces each prediction to a binary right-or-wrong decision based on your decision threshold. Loss, on the other hand, reflects how confident the model is in its decisions.

Consider binary classification on two examples with ground truth values (0, 1). One model predicts probabilities of (0.4, 0.6) that each example belongs to class 1. The other predicts (0.02, 0.98). Both models classify the examples as (0, 1) and have the same accuracy at a threshold of 0.5, but the second model would have a much lower loss.
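To make the numbers concrete, here is a minimal sketch of that comparison in plain Python. It assumes the loss is binary cross-entropy and that the probabilities are the predicted probability of class 1; the function and variable names are just for illustration.

```python
import math

def binary_cross_entropy(probs, targets):
    # Mean BCE over the examples; probs are predicted P(class == 1).
    total = sum(t * math.log(p) + (1 - t) * math.log(1 - p)
                for p, t in zip(probs, targets))
    return -total / len(targets)

def accuracy(probs, targets, threshold=0.5):
    # Fraction of examples whose thresholded prediction matches the target.
    correct = sum(int(p >= threshold) == t for p, t in zip(probs, targets))
    return correct / len(targets)

targets = [0, 1]
hesitant = [0.4, 0.6]     # first model from the example above
confident = [0.02, 0.98]  # second model from the example above

acc_hesitant = accuracy(hesitant, targets)                # 1.0
acc_confident = accuracy(confident, targets)              # 1.0
loss_hesitant = binary_cross_entropy(hesitant, targets)   # about 0.511
loss_confident = binary_cross_entropy(confident, targets) # about 0.020
```

Both models score 100% accuracy, so an accuracy curve could not tell them apart, while the loss differs by more than an order of magnitude.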

Krisztian · March 12, 2019, 4:06pm

I don’t think you can use accuracy in lr_find out of the box. However, writing a callback for this purpose should be really simple.

I’m more confused by why you would want to do this in the first place. You are going to be optimizing your loss function, not your accuracy. You want a high lr, but not so high that your loss diverges. That’s what lr_find lets you do.
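For anyone who still wants to look at both curves, the idea can be sketched without fastai at all. The following is a toy learning-rate sweep on a tiny logistic regression in plain Python, recording loss and accuracy at each step. It is not fastai's lr_find and not a fastai callback; the data, the function name lr_range_test and its parameters are all made up for illustration.

```python
import math

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def lr_range_test(xs, ys, lr_start=1e-3, lr_end=10.0, steps=50):
    # Sweep the learning rate exponentially from lr_start to lr_end,
    # taking one full-batch gradient step per learning rate.
    w, b = 0.0, 0.0
    n = len(xs)
    eps = 1e-12  # guards against log(0)
    history = []
    for i in range(steps):
        lr = lr_start * (lr_end / lr_start) ** (i / (steps - 1))
        ps = [sigmoid(w * x + b) for x in xs]
        loss = -sum(y * math.log(max(p, eps)) +
                    (1 - y) * math.log(max(1 - p, eps))
                    for p, y in zip(ps, ys)) / n
        acc = sum((p >= 0.5) == (y == 1) for p, y in zip(ps, ys)) / n
        history.append((lr, loss, acc))
        # Gradient of the mean binary cross-entropy w.r.t. w and b.
        grad_w = sum((p - y) * x for p, y, x in zip(ps, ys, xs)) / n
        grad_b = sum(p - y for p, y in zip(ps, ys)) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return history

# Hypothetical toy data: negative inputs are class 0, positive are class 1.
xs = [-2.0, -1.0, -0.5, 0.5, 1.0, 2.0]
ys = [0, 0, 0, 1, 1, 1]
history = lr_range_test(xs, ys)
```

Each entry of history is an (lr, loss, accuracy) tuple, so either quantity can be plotted against lr. On a real problem you would expect the accuracy curve to move in coarse jumps and the loss curve to be the more informative one.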

If your question is why not use accuracy (as opposed to say BCE) as loss:

Accuracy is not a smooth function: it is piecewise constant in the weights, so it has no useful derivatives.
As @KarlH said, it doesn’t give you probabilities.
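The first point can be checked numerically. This is a small sketch, assuming a one-weight logistic model and binary cross-entropy as the loss; the data and names are invented for illustration. It nudges the weight slightly and compares the finite-difference slope of the loss with that of the accuracy.

```python
import math

# Hypothetical toy data for a one-weight logistic model p = sigmoid(w * x).
xs = [-2.0, -1.0, 1.0, 2.0]
ys = [0, 0, 1, 1]

def predict(w, x):
    return 1.0 / (1.0 + math.exp(-w * x))

def bce(w):
    # Mean binary cross-entropy as a function of the weight.
    return -sum(y * math.log(predict(w, x)) +
                (1 - y) * math.log(1 - predict(w, x))
                for x, y in zip(xs, ys)) / len(xs)

def accuracy(w):
    # Fraction of examples classified correctly at a 0.5 threshold.
    return sum(int(predict(w, x) >= 0.5) == y
               for x, y in zip(xs, ys)) / len(xs)

w, h = 0.5, 1e-4
# Central finite differences around w.
d_loss = (bce(w + h) - bce(w - h)) / (2 * h)
d_acc = (accuracy(w + h) - accuracy(w - h)) / (2 * h)
```

Here d_loss is negative, so it tells the optimizer to increase w, while d_acc is exactly zero because a small change in w does not flip any prediction. A gradient-based optimizer gets no signal from accuracy.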

kushaj · March 12, 2019, 8:03pm

It makes sense to use loss instead of accuracy.