Is there a rule of thumb for picking discriminative learning rates? What would you pick here? From the lessons I remember that for a single learning rate, you could pick the steepest point with a negative slope before the minimum. Is there a similar trick for a range of learning rates?
I also noticed that lr_find() gives SuggestedLRs. Is there a canonical way to use these in the next training run? How do we translate lr_steep to a range of learning rates?
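
To make the question concrete, here is a minimal sketch of the kind of translation I have in mind. Everything in it is an assumption for illustration: `lr_range_from_steep` is a hypothetical helper, not a fastai function. The `ratio` of 10 is an arbitrary spread, not a rule from the lessons. The `SuggestedLRs` namedtuple is a stand-in with made-up values, so the snippet runs without a Learner.

```python
from collections import namedtuple

# Stand-in for the object lr_find() returns; the values are made up.
SuggestedLRs = namedtuple("SuggestedLRs", ["lr_min", "lr_steep"])


def lr_range_from_steep(lr_steep, ratio=10.0):
    """Hypothetical helper: build slice(low, high) with high = lr_steep.

    `ratio` is an assumed spread between the lowest and highest
    learning rate; it is a guess, not a canonical value.
    """
    if lr_steep <= 0:
        raise ValueError("lr_steep must be positive")
    if ratio < 1:
        raise ValueError("ratio must be >= 1")
    return slice(lr_steep / ratio, lr_steep)


suggested = SuggestedLRs(lr_min=1e-2, lr_steep=1e-3)
lr_range = lr_range_from_steep(suggested.lr_steep)
print(lr_range.start, lr_range.stop)

# With a real Learner, the range would then be passed along the lines of:
# learn.fit_one_cycle(3, lr_max=lr_range)
```

Is something like this reasonable, or is there a better-founded way to choose the lower end of the range?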