Choosing a learning rate when lr_find is flattening out

I am training on an ImageList with around 40k+ categories and noticed that the lr_find curve has completely flattened out, suggesting a learning rate of 2.00e-11 (the minimum numerical gradient point). While I have made good progress, going from 1% accuracy to 30%, I seem to have hit a wall and can’t get accuracy to improve much further.

I did notice that reducing the batch size and going from fp16 to fp32 gave a little more wiggle room, but after 4 epochs accuracy only got worse.
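
For reference, this is roughly how I'm rebuilding things with a smaller batch size and full precision (a minimal sketch assuming fastai v1, a resnet50 backbone, and a placeholder path):

    from fastai.vision import *

    path = Path('data/images')  # placeholder

    data = (ImageList.from_folder(path)
            .split_by_rand_pct(0.2)
            .label_from_folder()
            .transform(get_transforms(), size=224)
            .databunch(bs=32)          # smaller batch size for more headroom
            .normalize(imagenet_stats))

    learn = cnn_learner(data, models.resnet50, metrics=accuracy)
    learn = learn.to_fp32()            # full precision; to_fp16() switches back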

When the curve is this flat, does it even matter which lr is selected? Should I keep it closer to 1e-4, or keep fitting with a slice around what it suggests?

The other parameter I’m tinkering with is weight decay (wd); I have moved it from 0.4 to 0.00001, but neither extreme seems to help much.
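
The training call I’ve been experimenting with looks roughly like this (a sketch; the slice bounds and wd are just values I’ve been trying):

    # discriminative lrs topping out around 1e-4, with a small weight decay
    learn.fit_one_cycle(4, max_lr=slice(1e-5, 1e-4), wd=1e-5)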

In a case like yours, I usually find the inflection point and move back by a factor of 10. So for you it would probably be something like 1e-4.

You could also try using the Automated Learning Rate Suggester. My guess is that it would suggest something around 1e-4.
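
If you’re on fastai v1, I believe the suggester is built on the min numerical gradient heuristic, so a minimal sketch of getting the suggestion programmatically would be:

    learn.lr_find()
    learn.recorder.plot(suggestion=True)       # marks the min-gradient point
    min_grad_lr = learn.recorder.min_grad_lr
    learn.fit_one_cycle(4, max_lr=min_grad_lr)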

Try learn.recorder.plot(skip_end=5). skip_end is an integer that skips that many loss values at the end of the plot, so that you can see the steep part of the curve. You can try bigger values for skip_end when plotting without running lr_find again and again. Also, isn’t start_lr=1e-12 too small to start with?
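
Something like this, with start_lr raised back to the fastai v1 default (a sketch):

    learn.lr_find(start_lr=1e-7, end_lr=10)  # 1e-7 is the default start_lr
    learn.recorder.plot(skip_end=5)          # drop the last 5 loss values
    learn.recorder.plot(skip_end=15)         # trim more without rerunning lr_find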

It was falling off the edge!