Partway through lesson 2, Jeremy runs learn.fit_one_cycle(2, max_lr=slice(3e-5,3e-4)). How does he know that 2 epochs are enough? How does he know the loss won't improve if he runs another epoch?
Also, as an aside: when I run interp.plot_confusion_matrix() with a lot of classes (20), I can’t read a thing. Does anyone have any advice? Perhaps it is possible to plot the matrix for just a subset of the classes?
I think he just wanted to demonstrate how it works. If you think the model can improve with more epochs, just run them. I remember him saying once that with fast.ai it is quite difficult to overfit, so feel free to experiment.
In the confusion matrix, look for the highest numbers off the diagonal. Those are the misclassifications, and they show you which pairs of classes are being confused. High counts are also shaded more strongly, so they stand out.
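If the plot is too crowded to read, the same information can be pulled out as a short list. Here is a minimal sketch in plain Python, assuming the matrix is a list of rows with actual classes on the rows and predicted classes on the columns; most_confused_pairs, toy_matrix and toy_classes are made-up names for illustration, not fastai API. If I remember correctly, fastai v1 has a similar helper, interp.most_confused(min_val=2), and plot_confusion_matrix also accepts a figsize argument, so check the docs for your version.

```python
from typing import List, Sequence, Tuple


def most_confused_pairs(
    matrix: Sequence[Sequence[int]],
    class_names: Sequence[str],
    min_count: int = 1,
) -> List[Tuple[str, str, int]]:
    """Return (actual, predicted, count) for off-diagonal cells, largest first."""
    pairs = []
    for i, row in enumerate(matrix):
        for j, count in enumerate(row):
            # Skip the diagonal (correct predictions) and rare confusions.
            if i != j and count >= min_count:
                pairs.append((class_names[i], class_names[j], count))
    # Sort by count so the worst confusions come first.
    return sorted(pairs, key=lambda p: p[2], reverse=True)


# Toy 3-class matrix (assumed layout: rows = actual, columns = predicted).
toy_matrix = [
    [50, 4, 0],
    [7, 40, 1],
    [0, 2, 45],
]
toy_classes = ["cat", "dog", "fox"]

top_pairs = most_confused_pairs(toy_matrix, toy_classes, min_count=2)
for actual, predicted, count in top_pairs:
    print(f"{actual} predicted as {predicted}: {count}")
```

With 20 classes, raising min_count keeps the list down to the handful of pairs worth looking at.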
As for the cycles, experiment… You can save the model, try more epochs, and if that doesn’t help, load the saved weights and try something else. With transfer learning, training is fast enough that you won't waste much time.
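The save, train, compare, revert loop can be sketched without any library. This is only a toy illustration: try_more_epochs, the dict of weights, and the two training functions are all hypothetical stand-ins. In fastai v1 the matching calls would be learn.save('some-name') before the extra epochs and learn.load('some-name') to go back, with the validation loss read off the training output.

```python
import copy
from typing import Callable, Dict, Tuple

Weights = Dict[str, float]


def try_more_epochs(
    weights: Weights,
    train_fn: Callable[[Weights], Weights],
    valid_loss_fn: Callable[[Weights], float],
) -> Tuple[Weights, float]:
    """Train further, but keep the result only if validation loss improves."""
    checkpoint = copy.deepcopy(weights)  # stand-in for saving the model
    loss_before = valid_loss_fn(checkpoint)

    candidate = train_fn(copy.deepcopy(weights))  # the extra epochs
    loss_after = valid_loss_fn(candidate)

    if loss_after < loss_before:
        return candidate, loss_after
    # No improvement: go back to the checkpoint (stand-in for loading it).
    return checkpoint, loss_before


# Toy problem: the validation loss is lowest at w == 3.
def toy_valid_loss(w: Weights) -> float:
    return (w["w"] - 3.0) ** 2


def helpful_training(w: Weights) -> Weights:
    w["w"] += 0.5 * (3.0 - w["w"])  # moves halfway towards the optimum
    return w


def harmful_training(w: Weights) -> Weights:
    w["w"] += 10.0  # overshoots badly
    return w


start = {"w": 1.0}
improved, improved_loss = try_more_epochs(start, helpful_training, toy_valid_loss)
reverted, reverted_loss = try_more_epochs(start, harmful_training, toy_valid_loss)
print(improved, improved_loss)
print(reverted, reverted_loss)
```

The point is just that the comparison is made on the validation loss, so extra epochs cost nothing but time when they don't help.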
Quick tip: while the model is training, go on Stack Overflow and read some questions and answers, browse the forums, or read the documentation.