How can I create fine_tune checkpoints to resume from?

In fastai1 there was a `start_epoch` parameter for `fit_one_cycle`, but it's no longer available in fastai2. However, I found the following advice in another thread:

> If you started with `fine_tune`, keep in mind that `fine_tune` runs `fit_one_cycle` twice with different parameters, so you need to run it with the same parameters again to continue your training.

from the fastai code:

```python
def fine_tune(self:Learner, epochs, base_lr=2e-3, freeze_epochs=1, lr_mult=100,
              pct_start=0.3, div=5.0, **kwargs):
    "Fine tune with `freeze` for `freeze_epochs` then with `unfreeze` from `epochs` using discriminative LR"
    self.freeze()
    self.fit_one_cycle(freeze_epochs, slice(base_lr), pct_start=0.99, **kwargs)
    base_lr /= 2
    self.unfreeze()
    self.fit_one_cycle(epochs, slice(base_lr/lr_mult, base_lr), pct_start=pct_start, div=div, **kwargs)
```
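To make the second stage concrete: with the defaults `base_lr=2e-3` and `lr_mult=100`, the unfrozen `fit_one_cycle` call receives a discriminative slice from about 1e-5 up to 1e-3. A minimal sketch of that arithmetic (plain Python, no fastai needed):

```python
# Reproduce the LR values fine_tune passes to its second fit_one_cycle call,
# using the defaults shown in the source above.
base_lr = 2e-3
lr_mult = 100

base_lr /= 2                            # fine_tune halves base_lr after the frozen stage
lr_slice = (base_lr / lr_mult, base_lr) # endpoints of the discriminative slice

print(lr_slice)  # roughly (1e-05, 0.001)
```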

So, to resume from e.g. epoch 10 and train 5 additional epochs, I would do this:

```python
class SkipToEpoch(Callback):
    def __init__(self, s_epoch): self.s_epoch = s_epoch
    def begin_train(self):
        if self.epoch < self.s_epoch: raise CancelEpochException
    def begin_validate(self):
        if self.epoch < self.s_epoch: raise CancelValidException
```
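To sanity-check the skip logic without a full fastai run, here is a minimal, self-contained simulation. The `Callback` machinery and the cancel exceptions are stubbed out, and the epoch is passed in explicitly instead of being read from `self.epoch` as fastai does; nothing here imports fastai:

```python
# Stand-alone sketch of the SkipToEpoch idea with stubbed fastai pieces.
class CancelEpochException(Exception): pass
class CancelValidException(Exception): pass

class SkipToEpoch:
    def __init__(self, s_epoch): self.s_epoch = s_epoch
    def begin_train(self, epoch):
        if epoch < self.s_epoch: raise CancelEpochException
    def begin_validate(self, epoch):
        if epoch < self.s_epoch: raise CancelValidException

def run(total_epochs, cb):
    """Toy training loop: an epoch only 'trains' if the callback lets it through."""
    trained = []
    for epoch in range(total_epochs):
        try:
            cb.begin_train(epoch)
        except CancelEpochException:
            continue              # epoch skipped, as in the real callback
        trained.append(epoch)     # real training would happen here
    return trained

print(run(15, SkipToEpoch(s_epoch=10)))  # → [10, 11, 12, 13, 14]
```

So with `total_epochs=15` and `s_epoch=10`, only the last 5 epochs actually run, while the one-cycle schedule is still laid out over all 15.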

```python
learn = ...
learn.load('checkpoint')  # Learner.load appends the '.pth' extension itself

start_epoch = 10
total_epochs = 15
cbs = [SkipToEpoch(s_epoch=start_epoch)]

base_lr = <your LR>
base_lr /= 2   # fine_tune halves base_lr before the unfrozen stage
lr_mult = 100  # from fine_tune
# parameters pct_start and div taken from fine_tune()
learn.unfreeze()
learn.fit_one_cycle(total_epochs, slice(base_lr/lr_mult, base_lr), pct_start=0.3, div=5.0, cbs=cbs)
```
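As for creating the checkpoints in the first place: one option (my suggestion, not from the quoted thread) is fastai's `SaveModelCallback` with `every_epoch=True`, which saves the model after each epoch. The name `'checkpoint'` below is a hypothetical choice:

```python
from fastai.callback.tracker import SaveModelCallback

# Sketch: save a checkpoint after every epoch of fine_tune.
# Files land under learn.path/learn.model_dir with the epoch appended to
# fname, so a later learn.load can pick up the one you want to resume from.
learn.fine_tune(15, cbs=[SaveModelCallback(every_epoch=True, fname='checkpoint')])
```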

Hope that helps.

Florian
