I find that the curve generated by learn.recorder.plot() can look very different depending on the value of num_it used for the learning rate finder. For example, if I set num_it to the number of iterations in one epoch, the lowest loss is at lr = 1e-2, so I might set lr = 1e-3 in fit_one_cycle(). On the other hand, if I use the default num_it (100), the loss never starts to go up before the finding process ends, and as a result I don't know how I should choose my learning rate. Is there any suggestion on how to set the value of num_it?
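
To make the question concrete, here is a rough sketch of how I understand the sweep. It assumes the finder raises the learning rate exponentially from a start value to an end value over num_it iterations, and that the bounds are 1e-7 and 10 (both are my assumptions about the defaults, not something I have confirmed). The helpers lr_sweep and step_multiplier are hypothetical names for illustration, not fastai functions.

```python
def lr_sweep(start_lr, end_lr, num_it):
    """Hypothetical helper: exponentially spaced learning rates, one per iteration."""
    ratio = end_lr / start_lr
    # Iteration i uses start_lr * ratio ** (i / num_it), so i = num_it reaches end_lr.
    return [start_lr * ratio ** (i / num_it) for i in range(num_it + 1)]


def step_multiplier(start_lr, end_lr, num_it):
    """Factor by which the learning rate grows from one iteration to the next."""
    return (end_lr / start_lr) ** (1 / num_it)


# Assumed sweep bounds (not taken from my actual run).
START_LR, END_LR = 1e-7, 10

short_sweep = lr_sweep(START_LR, END_LR, num_it=100)
long_sweep = lr_sweep(START_LR, END_LR, num_it=1000)

# With fewer iterations, each mini-batch multiplies the learning rate by a
# larger factor, so the model sees far fewer batches at any given learning rate.
print(step_multiplier(START_LR, END_LR, 100))   # roughly 1.20
print(step_multiplier(START_LR, END_LR, 1000))  # roughly 1.02
```

If this picture is right, both settings cover the same range of learning rates, and num_it only changes how many batches are spent in each part of that range, which may be why the two curves look so different.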
bwangwp (Beinan Wang) #1