I am creating an ImageList that has 40k+ categories and noticed that the lr_find curve has completely flattened out, suggesting a gradient of 2.00E-11. While I have made good progress, going from 1% to 30% accuracy, I seem to have hit a wall and can’t get accuracy to improve much more.
When the curve is this flat, does it even matter which lr I select? Should I keep it closer to 1e-04, or keep using a slice around the value it suggests?
The other parameter I’m tinkering with is weight decay (wd). I have moved it from .4 down to .00001, but neither value seems to help much.
As for what you suggested: I usually find the inflection point and move back by a factor of 10. So in your case, it would probably be something like 1e-4.
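The rule of thumb above can be sketched in plain Python. This is only an illustration, not what fastai does internally: I am assuming "inflection point" means the point where the loss falls fastest against log10(lr), and the function name `lr_from_steepest_slope`, the `back_off` parameter, and the sweep values are all made up for the example.

```python
import math


def lr_from_steepest_slope(lrs, losses, back_off=10.0):
    """Return (lr at the steepest loss decrease, that lr divided by back_off).

    Hypothetical helper: assumes lrs and losses are the recorded points of a
    learning-rate sweep, in the order they were visited.
    """
    if len(lrs) != len(losses) or len(lrs) < 2:
        raise ValueError("need at least two matching (lr, loss) points")
    best_i, best_slope = None, 0.0
    for i in range(1, len(lrs)):
        # Slope of the loss against log10(lr), because the sweep is geometric.
        step = math.log10(lrs[i]) - math.log10(lrs[i - 1])
        slope = (losses[i] - losses[i - 1]) / step
        if slope < best_slope:
            best_i, best_slope = i, slope
    if best_i is None:
        raise ValueError("the loss never decreases in this sweep")
    steepest_lr = lrs[best_i]
    return steepest_lr, steepest_lr / back_off


# Made-up sweep for illustration only, not the poster's data.
example_lrs = [1e-6, 1e-5, 1e-4, 1e-3, 1e-2, 1e-1]
example_losses = [10.6, 10.5, 10.2, 9.0, 8.8, 14.0]
steepest, chosen = lr_from_steepest_slope(example_lrs, example_losses)
print(f"steepest drop at lr={steepest:g}, backed off to lr={chosen:g}")
```

On a very flat curve the steepest slope is barely different from its neighbours, so the picked point can jump around between runs. That is a reason to sanity-check the result by eye rather than trust the number.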
Try learn.recorder.plot(skip_end=5). skip_end is an integer: the number of loss values to drop from the end of the plot, so that the steep part of the curve is easier to see. You can try bigger values for skip_end when plotting, without running lr_find again and again. Also, isn’t start_lr=1e-12 too small to start with?
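To see why dropping the tail helps, here is a small stand-in in plain Python. It does not call fastai: `trim_curve` is a hypothetical helper that only mimics the idea of skipping points at the start and end of the recorded values, and the sweep numbers are invented. The point is that the last few diverging losses set the y-axis range and squash everything before them.

```python
def trim_curve(lrs, losses, skip_start=0, skip_end=5):
    """Drop the first skip_start and last skip_end points of a recorded sweep."""
    # Clamp so that an oversized skip_end yields an empty result, not a wrong slice.
    end = max(skip_start, len(losses) - skip_end)
    return lrs[skip_start:end], losses[skip_start:end]


# Made-up sweep: a slow decrease followed by a diverging tail.
lrs = [10 ** (-7 + 0.5 * i) for i in range(14)]
losses = [10.6, 10.6, 10.5, 10.5, 10.4, 10.3, 10.1,
          9.8, 9.6, 9.9, 12.0, 20.0, 45.0, 120.0]

for skip_end in (0, 3, 5):
    _, kept = trim_curve(lrs, losses, skip_end=skip_end)
    # The smaller this range, the more visible the gentle slope becomes.
    print(f"skip_end={skip_end}: y-axis spans {min(kept):.1f} to {max(kept):.1f}")
```

Because the trimming works on values that are already recorded, trying several skip_end values costs nothing compared with rerunning the sweep.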