# Why use loss vs lr and not accuracy vs lr in lr_find?

In Leslie N. Smith’s paper, the discussion is framed in terms of accuracy vs. learning rate, but fastai plots loss vs. lr. Is there a reason for this? Or is it just to reduce the computation?

Is there any way in which I can use accuracy in the plot?


I don’t know if this is the reason, but accuracy is only a meaningful metric for certain tasks (classification, for example, but not regression), while you always have a loss regardless of the task.

One reason I can think of is that accuracy reduces each prediction to a binary right/wrong decision based on a classification threshold. Loss, on the other hand, reflects how confident the model is in its decisions.

Consider binary classification on two examples with ground truth values (0, 1). One model classifies the examples (0, 1) with probabilities (0.4, 0.6). The other classifies the examples (0, 1) with probabilities (0.02, 0.98). These models would have the same accuracy based on a threshold of 0.5, but the second model would have a much lower loss.
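To make that concrete, here is a small self-contained sketch (plain Python, not fastai code) computing thresholded accuracy and mean binary cross-entropy for the two hypothetical models above:

```python
import math

def bce(probs, targets):
    """Mean binary cross-entropy for predicted probabilities."""
    return -sum(t * math.log(p) + (1 - t) * math.log(1 - p)
                for p, t in zip(probs, targets)) / len(probs)

def accuracy(probs, targets, thresh=0.5):
    """Fraction of predictions on the correct side of the threshold."""
    return sum((p > thresh) == bool(t)
               for p, t in zip(probs, targets)) / len(probs)

targets = [0, 1]
model_a = [0.4, 0.6]    # barely on the right side of the threshold
model_b = [0.02, 0.98]  # confidently correct

print(accuracy(model_a, targets), accuracy(model_b, targets))  # 1.0 1.0
print(bce(model_a, targets), bce(model_b, targets))
```

Both models score 1.0 accuracy, but model B’s loss (−log 0.98 ≈ 0.02) is far lower than model A’s (−log 0.6 ≈ 0.51), so the loss curve carries information the accuracy curve throws away.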


I don’t think you can use accuracy in `lr_find` out of the box. However, writing a callback for this purpose should be really simple.

I’m more confused by why you would want to do this in the first place. You are going to be optimizing your loss function, not your accuracy. You want as high an lr as possible, but not so high that your loss diverges. That’s what `lr_find` lets you do.
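To make the divergence criterion concrete, here is a toy sketch of the LR range test idea (not fastai’s actual implementation): exponentially increase the lr each SGD step on a 1-D quadratic loss and stop once the loss blows up well past the best value seen.

```python
import math

def lr_range_test(start_lr=1e-5, end_lr=10.0, steps=100):
    """Toy LR range test on loss = w**2; returns (lr, loss) history."""
    w = 5.0
    mult = (end_lr / start_lr) ** (1 / (steps - 1))
    lr, history, best = start_lr, [], float("inf")
    for _ in range(steps):
        loss = w * w
        history.append((lr, loss))
        best = min(best, loss)
        if loss > 4 * best:      # stop once loss diverges
            break
        grad = 2 * w             # d(w**2)/dw
        w -= lr * grad           # SGD step
        lr *= mult               # exponentially increase lr
    return history

hist = lr_range_test()
```

Plotting `hist` gives the familiar loss-vs-lr curve: loss falls while the lr is in a useful range, then shoots up once the updates overshoot, which is exactly the divergence point `lr_find` looks for.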

If your question is why not use accuracy (as opposed to, say, BCE) as the loss:

1. Accuracy is not a smooth function, so it has no useful derivatives (its gradient is zero almost everywhere).
2. As @KarlH said, it doesn’t give you probabilities.
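Point 1 is easy to see numerically. In this hypothetical one-example setup (x=1, label=1, prediction p = sigmoid(w)), the thresholded accuracy only changes when w crosses 0, so its derivative with respect to w is zero almost everywhere, while the log loss varies smoothly:

```python
import math

def sigmoid(z):
    return 1 / (1 + math.exp(-z))

# Accuracy is a step function of w: flat everywhere except at w = 0,
# so gradient descent gets no signal from it. Log loss changes smoothly.
for w in (0.5, 1.0, 2.0):
    p = sigmoid(w)
    acc = 1.0 if p > 0.5 else 0.0    # same value for all three w
    loss = -math.log(p)              # strictly decreasing in w
    print(w, acc, round(loss, 4))
```

All three weights give identical accuracy but different losses, which is why the optimizer (and hence the lr finder) works with the loss.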

It makes sense to use loss instead of accuracy.