Let’s say I call fit_one_cycle with 100 epochs — do the learning rate and momentum cycles take into account the number of epochs I specify here, or does the cycle span only ONE epoch?
I am wondering about the best way to use fit_one_cycle in combination with EarlyStoppingCallback.
If one cycle == one epoch then this is fine. But let’s say I set up my learner with EarlyStoppingCallback like so:
learn.callback_fns +=[partial(EarlyStoppingCallback, monitor='accuracy', min_delta=0.005, patience=3)]
Then I want to fit my learner overnight and have it stop when it no longer improves. So I call fit_one_cycle with 100 epochs, hoping it will trigger EarlyStoppingCallback before reaching the 100th epoch. But if the cycle takes the number of epochs into account, I guess this is not optimal.
You can use pct_start=0.1, which will increase the learning rate for the first 10% of the iterations and spend the remaining 90% decreasing it with annealing. I’m not sure if it’s the correct way to do it, but I think it helps you avoid stopping while the learning rate is still increasing.
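To see why early stopping mid-cycle is awkward, here is a rough sketch of the shape of the one-cycle learning-rate schedule: it spans the whole fit_one_cycle call, not one epoch. This is an approximation written from scratch (fastai v1 actually anneals toward a tiny non-zero final LR; here it is simplified to 0), with pct_start, max_lr and div_factor as illustrative values:

```python
import math

def one_cycle_lr(pct, lr_max=1e-3, div_factor=25.0, pct_start=0.1):
    """Approximate one-cycle LR at training progress pct in [0, 1].

    The schedule covers ALL iterations of the fit_one_cycle call:
    cosine warm-up for the first pct_start fraction, then cosine
    annealing down for the remaining (1 - pct_start).
    """
    lr_start = lr_max / div_factor
    if pct < pct_start:  # warm-up phase
        p = pct / pct_start
        return lr_start + (lr_max - lr_start) * (1 - math.cos(math.pi * p)) / 2
    p = (pct - pct_start) / (1 - pct_start)  # annealing phase
    return lr_max * (1 + math.cos(math.pi * p)) / 2
```

So if EarlyStoppingCallback fires at, say, 40% of a 100-epoch call, training ends while the LR is still partway through its annealing, which is why stopping during the warm-up (the first pct_start of training) is the part you most want to avoid.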
Can someone explain why EarlyStoppingCallback doesn’t work as expected?
Here is my code:

epoch = 25
learn = cnn_learner(data, arch, pretrained=True,
                    metrics=[accuracy, error_rate, top_k_accuracy],
                    callback_fns=[partial(CSVLogger, filename=aName),
                                  partial(SaveModelCallback, monitor='val_loss', mode='auto', name=mName),
                                  partial(EarlyStoppingCallback, monitor='val_loss', min_delta=0.001, patience=5)])
learn.fit_one_cycle(epoch, max_lr=maxLR, moms=(0.95, 0.85), div_factor=25.0)
and this is what I get in the output log file:
valid_loss is clearly decreasing by more than min_delta=0.001 each epoch, yet it still stopped my training.
I think when your tracked quantity (in your case val_loss) doesn’t improve by more than min_delta, or gets worse than it was the epoch before, patience is decreased by 1. When patience reaches 0, training stops.
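The patience mechanism described above can be sketched in a few lines. This is a toy re-implementation of the idea, not fastai’s actual EarlyStoppingCallback (fastai counts a `wait` variable up rather than counting patience down, which is equivalent):

```python
def run_early_stopping(val_losses, min_delta=0.001, patience=5):
    """Return the epoch index at which training stops, or None.

    An epoch only resets the counter if it improves on the best
    loss so far by MORE than min_delta; otherwise the counter grows,
    and training stops once it exceeds patience.
    """
    best = float('inf')
    wait = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best - min_delta:   # genuine improvement
            best = loss
            wait = 0
        else:                         # no (sufficient) improvement
            wait += 1
            if wait > patience:
                return epoch
    return None
```

Note that under this rule a run of epochs that each improve by slightly less than min_delta still burns through patience, which is one way a steadily (but slowly) improving val_loss can trigger an unexpected stop.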
Thanks for your reply @etremblay,
It seems that if, in the fastai source code, we knock out this line:

if self.operator == np.less: self.min_delta *= -1

then training stops only after 5 (patience) consecutive epochs of worsening validation loss.
Sorry, I didn’t really look at the code; I just explained how it works from what I observed while using it :).