How do you all approach picking the optimal learning rates to fine-tune your models? I am under the impression I should be picking a range that has the steepest declining slope.
In my last few cells, I thought ranges like (1e-4, 1e-2) or (1e-3, 1e-1) covered the steepest slopes, but my results aren't as good as with (1e-6, 1e-4), which looks flat on the curve to me.
Appreciate any thoughts
So in general, yes, you want to pick a learning rate where the loss curve has the steepest declining slope. When you unfreeze layers of a neural net and plot the learning rate finder again, it becomes a bit harder to figure out where the steepest slope is. If you pick the second param in your slice to be 5-10 times smaller than the original learning rate, and for the first number you find the spot on the graph just before things go up and pick roughly 5-10 times less than that, you should be ok-ish.
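That rule of thumb can be sketched in a few lines. This is just a hypothetical helper to make the arithmetic concrete (the function name, `suggest_slice`, and its parameters are my own, not fastai API); you'd feed the result into `learn.fit_one_cycle` yourself.

```python
def suggest_slice(old_lr, loss_rise_lr, factor=10):
    """Rough rule of thumb for picking a slice after unfreezing.

    old_lr:       the learning rate you trained with before unfreezing
    loss_rise_lr: the lr on the LR-finder plot where the loss starts to climb
    factor:       how much smaller to go (5-10 is the usual range)
    """
    first = loss_rise_lr / factor   # lr for the earliest layers
    second = old_lr / factor        # lr for the final layers
    return first, second

# e.g. trained at 1e-3 before unfreezing; loss starts rising around 1e-2 on the plot
first, second = suggest_slice(old_lr=1e-3, loss_rise_lr=1e-2, factor=10)
# then something like: learn.fit_one_cycle(epochs, slice(first, second))
```

The point of the slice is that early layers (already good generic features) get the smaller rate while later layers get the larger one.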
Now, as Jeremy said, picking the best learning rate is more of an art than a science, and the more you do it the more you can kinda guess what a good
lr is. Jeremy also explains more about learning rates and how they work later in the course, so that should help you as well.
Hopefully this makes sense!