Hello,
According to the results, it looks like you are underfitting, since your training loss is greater than your validation loss.
Also, I am not sure about the choice of learning rates (LR). They look very low compared to what I have seen throughout the first lessons. With the x-axis cut off in some charts, it's hard to tell whether those LRs were the best ones.
- I would try running the model with the default settings, with the LR at 3e-3 (I think that's what it is set at), then make some plots and fine-tune the LR with lr_find().
- Also, check out Jeremy's methodology in the Lesson 3 notebooks (here is the link to the notes).
Concerning max_lr=slice(start, end), I think it means something like this: train the first layers with a LR of start, the last layers with a LR of end, and for the layers in between, spread the LR across the range (start, end).
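
To make the slice(start, end) idea concrete, here is a minimal sketch in plain Python. It is not fastai's actual code: spread_lrs is a hypothetical helper, and the log-scale (multiplicative) spacing between layer groups is my assumption about how the values in between are chosen.

```python
def spread_lrs(start, end, n_groups):
    """Hypothetical helper: one learning rate per layer group,
    evenly spaced on a log scale from start (first group) to end (last group)."""
    if n_groups == 1:
        return [end]
    # Assumption: each group's LR is the previous one times a constant ratio.
    ratio = (end / start) ** (1 / (n_groups - 1))
    return [start * ratio ** i for i in range(n_groups)]


# Example: three layer groups with max_lr=slice(1e-5, 1e-3)
lrs = spread_lrs(1e-5, 1e-3, 3)
for group, lr in enumerate(lrs):
    print(f"layer group {group}: lr = {lr:.1e}")
```

With three layer groups, the earliest layers get roughly 1e-5, the middle group roughly 1e-4, and the last layers roughly 1e-3, so the pretrained early layers change the least.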