Difficulty setting learning rate

I’m trying the dog breed identification challenge, and when I set the learning rate based on the lr_find graph, choosing a learning rate that sits at a higher loss on the curve seems to give greater accuracy. I’m using sz=299 and arch=resnext101.

Here are some of the odd results I get when setting the learning rate. I create the learner with learn = ConvLearner.pretrained(arch, data, precompute=True), then call lrf = learn.lr_find() and learn.sched.plot():

Here I chose 0.1, since it looked like the optimal learning rate based on how the professor said we should choose one.

Here’s the accuracy with 0.01 for the same dataset, arch, and size. 0.01 sits at a higher point on the loss curve, but it results in a more accurate model.

And here’s 0.3, which by that same logic should be better than 0.1, but turns out much worse.

Can anyone help me understand what’s going on, or if I’m doing something incorrectly? Thanks!

Take a look at some of the earlier threads discussing lr_find. I think you’re misinterpreting how it is working. I wouldn’t expect 0.1 to provide a better result initially - only if you use cycle_len and train for quite a bit longer.
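For anyone else hitting this: the finder’s plot shows loss versus learning rate as the rate is increased each mini-batch, and the usual heuristic is to pick a rate where the loss is still dropping steeply (often roughly an order of magnitude below the minimum), not the rate where the plotted loss is lowest. Here’s a small self-contained sketch of that heuristic on a made-up curve — this is just numpy on synthetic data to illustrate the idea, not fastai’s actual implementation:

```python
import numpy as np

# Synthetic loss-vs-learning-rate curve shaped like a typical lr_find plot:
# flat and high at tiny rates, a steep drop, a shallow minimum, then divergence.
# All constants here are made up purely for illustration.
lrs = np.logspace(-5, 0, 100)            # learning rates the finder sweeps over
x = np.log10(lrs)
loss = 0.5 + 2 / (1 + np.exp(3 * (x + 3))) + np.exp(2 * (x + 0.5))

# The rule of thumb is NOT to take the LR at the loss minimum, but one where
# the loss is still falling steeply.
slope = np.gradient(loss, x)             # d(loss) / d(log10 lr)
lr_steepest = lrs[np.argmin(slope)]      # where loss drops fastest: a good pick
lr_min_loss = lrs[np.argmin(loss)]       # where loss is lowest: too high to train at

print(lr_steepest, lr_min_loss)          # steepest-descent LR is well below the minimum-loss LR
```

On a curve like this the steepest-descent point lands around 1e-3 while the loss minimum sits above 1e-2, which is why reading the plot as “pick the lowest point” leads you astray.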


From reading through several threads discussing lr_find, it looks like it suggests a learning rate based on the current state of the model. So from what I understand, I should specify a cycle length, train the first 3 epochs, call lr_find again to find the best learning rate at that point, and then unfreeze and fit some more? I’m going to review the third lesson today, as I feel like I’m missing a lot of the specifics. Thanks, professor, for your reply!
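For later readers, the loop described above might look roughly like this with the fastai 0.7 API from the course (this is a non-runnable sketch — arch and data come from your own setup, and the learning rates and epoch counts are placeholders, not recommendations):

```python
# Sketch of the retrain-and-re-check loop described above (fastai 0.7 API).
learn = ConvLearner.pretrained(arch, data, precompute=True)

learn.lr_find()                  # sweep learning rates from the current state
learn.sched.plot()               # pick an LR where loss is still dropping steeply
learn.fit(1e-2, 3, cycle_len=1)  # train a few epochs with restarts

learn.lr_find()                  # re-run the finder: the model has changed,
learn.sched.plot()               # so the good LR range may have shifted

learn.unfreeze()                 # then fine-tune the earlier layers too
learn.fit(1e-2, 3, cycle_len=1)
```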