Advice on interaction between (max) learning rate, n epochs and cycle length?

Hello 👋,

Judging by the results, it looks like you are underfitting: your training loss is higher than your validation loss.

Also, I am not sure about the choice of learning rates (LR): they look very low compared to what I have seen throughout the first lessons. With the x-axis cut off in some charts, it’s hard to tell whether those LRs were the best ones.

  • I would first try running the model with the default settings, with the LR at 3e-3 (I think that’s the default), then make some plots and tune the LR with lr_find().
  • Also, check out Jeremy’s methodology in Lesson 3 notebooks (here is the link to the notes).

Concerning max_lr=slice(start, end), I think it means something like this: train the first layer group with an LR of start and the last layer group with an LR of end, and for the groups in between, spread the LRs across the range (start, end).
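To make that spreading concrete, here is a minimal sketch in plain NumPy. It is not fastai's actual code: the helper name spread_lrs is made up for illustration, and the geometric (even-multiples) spacing is my understanding of what fastai v1 does when both start and end are given, so treat that as an assumption and check it against your installed version.

```python
import numpy as np

def spread_lrs(start, end, n_groups):
    """Hypothetical helper: one learning rate per layer group.

    Assumes geometric spacing between `start` and `end`, i.e. each
    group's LR is a constant multiple of the previous group's LR.
    """
    return np.geomspace(start, end, n_groups)

# Example: a model split into 3 layer groups, with max_lr=slice(1e-5, 1e-3)
lrs = spread_lrs(1e-5, 1e-3, 3)
for i, lr in enumerate(lrs):
    print(f"layer group {i}: lr = {lr:.1e}")
```

With three groups, the earliest layers get the smallest LR, the head gets the largest, and the middle group sits a constant factor between the two rather than at the arithmetic midpoint.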
