Choosing a learning rate when lr_find is flattening out

I am creating an ImageList with 40k+ categories and noticed that the lr_find plot has completely flattened out, with the suggested minimum gradient at 2.00E-11. While I have made good progress going from 1% to 30% accuracy, I seem to have hit a wall and can’t get accuracy to improve much more.

I did notice that reducing the batch size and going from fp16 to fp32 gave a little more wiggle room, but after 4 epochs accuracy only got worse.

When the curve is this flat, does it even matter which lr is selected? Should I keep it closer to 1e-04, or keep using a slice around where it suggests?

The other parameter I’m tinkering with is weight decay (WD). I have moved it from .4 to .00001, but neither value seems to help much.


In the case of what you suggested, I usually find the inflection point and move back a factor of 10. So in your case, it would probably be something like 1e-4.
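
To make that rule of thumb concrete, here is a minimal sketch in plain Python. It assumes "inflection point" means the point where the loss drops fastest against log10(lr), which is my reading rather than anything fastai defines. The function name suggest_lr and the sweep values are made up for illustration; with a real run you would pass in the learning rates and losses recorded by lr_find.

```python
import math

def suggest_lr(lrs, losses, back_off=10.0):
    """Return the LR at the steepest loss drop, divided by back_off.

    Hypothetical helper: this is a sketch of the rule of thumb,
    not a fastai function.
    """
    if len(lrs) != len(losses) or len(lrs) < 2:
        raise ValueError("need two or more matching (lr, loss) pairs")
    log_lrs = [math.log10(lr) for lr in lrs]
    # Slope of the loss with respect to log10(lr) between neighbouring points.
    slopes = [
        (losses[i + 1] - losses[i]) / (log_lrs[i + 1] - log_lrs[i])
        for i in range(len(lrs) - 1)
    ]
    # The most negative slope marks where the loss falls fastest.
    steepest = min(range(len(slopes)), key=slopes.__getitem__)
    return lrs[steepest] / back_off

# Made-up sweep: flat at first, a drop near 1e-3, then divergence.
lrs = [1e-6, 1e-5, 1e-4, 1e-3, 1e-2, 1e-1]
losses = [10.6, 10.55, 10.5, 10.3, 9.2, 14.0]
suggested = suggest_lr(lrs, losses)
print(f"suggested lr: {suggested:.0e}")
```

On a curve that is nearly flat, every slope is close to zero, so the pick is mostly noise. That is worth keeping in mind before trusting any single suggested value.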

You could also try using the Automated Learning Rate Suggester. My guess is that it would suggest something around 1e-4.

Try learn.recorder.plot(skip_end=5). skip_end is an integer giving the number of loss values to skip at the end of the plot, so that the diverging tail is hidden and you can find a steep section of the curve. You can try bigger values for skip_end when plotting without running lr_find again and again. Also, isn’t start_lr=1e-12 too small to start with?
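
As a rough illustration of why trimming the tail helps, here is a small sketch in plain Python. It only mimics the effect of skip_start and skip_end on recorded values and is not fastai's own code. The helper name trim_sweep and the numbers are invented for the example.

```python
def trim_sweep(lrs, losses, skip_start=0, skip_end=5):
    """Drop points from both ends of a recorded sweep before plotting.

    Hypothetical helper that imitates the effect of skip_start and
    skip_end; it is a sketch, not part of fastai.
    """
    stop = max(0, len(lrs) - skip_end) if skip_end > 0 else len(lrs)
    return lrs[skip_start:stop], losses[skip_start:stop]

# Made-up sweep: a long flat stretch, a small dip, then the loss blows up.
lrs = [10.0 ** e for e in range(-8, 0)]
losses = [10.6, 10.6, 10.6, 10.5, 10.2, 9.4, 25.0, 400.0]

for skip_end in (0, 2):
    _, kept = trim_sweep(lrs, losses, skip_end=skip_end)
    # The plotted loss range shrinks once the diverging tail is removed,
    # so the dip is no longer squashed into a flat line.
    print(f"skip_end={skip_end}: loss range {min(kept)} to {max(kept)}")
```

With the exploding values at the end included, the y-axis has to stretch to cover them and the earlier part of the curve looks flat. Dropping them lets the plot zoom in on the region where the loss actually changes.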

It was falling off the edge!