Difficulty setting learning rate

I’m trying the dog breed identification challenge, and when I set the learning rate based on the lr_find plot, it seems as if choosing a learning rate with a higher loss on the plot results in greater accuracy. I’m setting my size to sz=299 and my architecture to arch=resnext101.

Here are some examples of the odd results I get when setting the learning rate. I first create the learner with learn = ConvLearner.pretrained(arch, data, precompute=True), then call lrf=learn.lr_find() and learn.sched.plot():

Here’s me choosing 0.1, as it seems like the optimal learning rate based on how the professor said we should choose it.

Here’s the accuracy with 0.01 for the same dataset, arch, and size. 0.01 has a higher loss on the plot, but results in a more accurate model.

And here’s the result with 0.3, which by the same reasoning should be better than 0.1, but is much worse.

Can anyone help me understand what’s going on, or if I’m doing something incorrectly? Thanks!

Take a look at some of the earlier threads discussing lr_find. I think you’re misinterpreting how it works. I wouldn’t expect 0.1 to give a better result initially - only if you use cycle_len and train for quite a bit longer.
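
A rough sketch of why cycle_len changes the picture, assuming it enables cosine annealing with restarts: the rate passed to fit is only the starting point of each cycle and decays towards zero within it. This is not fastai's code; cosine_annealed_lr and all numbers below are hypothetical, chosen for illustration.

```python
import math


def cosine_annealed_lr(max_lr, step, steps_per_cycle):
    """Learning rate at a given mini-batch step under cosine annealing with restarts."""
    # Position inside the current cycle; the schedule restarts every cycle.
    pos = step % steps_per_cycle
    # Cosine decay from max_lr at the start of the cycle towards 0 at its end.
    return 0.5 * max_lr * (1.0 + math.cos(math.pi * pos / steps_per_cycle))


# Hypothetical setup: 100 mini-batches per cycle, 3 cycles, starting rate 0.1.
steps_per_cycle = 100
schedule = [cosine_annealed_lr(0.1, s, steps_per_cycle)
            for s in range(3 * steps_per_cycle)]

# Average rate actually applied over the whole run.
mean_lr = sum(schedule) / len(schedule)
```

Under this schedule the average rate applied is about half the starting value, and each cycle ends with very small steps. That may be why a rate like 0.1 can pay off over longer annealed training while looking worse when held fixed for a few epochs.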

From reading through several threads discussing lr_find, it looks like its results depend on the current state of the model. So from what I understand, I should specify a cycle length and, after training the first 3 epochs, call lr_find again to find the best learning rate at that point, and then unfreeze and fit some more? I’m going to review the third lesson today, as I feel like I’m missing a lot of the specifics. Thanks for your reply, professor!
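
For anyone else confused by the plot, here is a minimal sketch of the idea behind a learning-rate range test, which is what lr_find appears to do: start from the current weights, raise the rate a little after every mini-batch, and record the loss. It is not fastai's implementation. The toy quadratic loss, the function names, and the 4x stopping threshold are all assumptions made for illustration.

```python
import math


def lr_range_test(grad_fn, loss_fn, w0, start_lr=1e-5, end_lr=10.0, num_steps=100):
    """Toy range test: raise the LR exponentially each step and record the loss."""
    # Multiplicative factor that takes the LR from start_lr to end_lr in num_steps steps.
    factor = (end_lr / start_lr) ** (1.0 / (num_steps - 1))
    w, lr = w0, start_lr
    lrs, losses = [], []
    best = math.inf
    for _ in range(num_steps):
        loss = loss_fn(w)
        # Stop once the loss blows up (4x the best loss is an arbitrary threshold).
        if not math.isfinite(loss) or loss > 4 * best:
            break
        best = min(best, loss)
        lrs.append(lr)
        losses.append(loss)
        w = w - lr * grad_fn(w)  # one gradient step at the current LR
        lr *= factor             # then raise the LR for the next step
    return lrs, losses


# Hypothetical toy problem: minimise 0.5 * curvature * w^2.
curvature = 4.0


def loss_fn(w):
    return 0.5 * curvature * w * w


def grad_fn(w):
    return curvature * w


lrs, losses = lr_range_test(grad_fn, loss_fn, w0=3.0)

# LR at which the recorded loss was lowest, and a rule-of-thumb pick 10x below it.
lr_at_min = lrs[losses.index(min(losses))]
suggested = lr_at_min / 10
```

The weights keep changing during the sweep, so the loss recorded at a given rate was reached after many earlier steps at smaller rates. It is not the loss you would get by training at that rate from the start. In this toy run the lowest recorded loss sits right at the edge of instability (about 2 / curvature), which is why a common rule of thumb is to pick a rate roughly ten times smaller, where the curve is still clearly falling.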