What is the difference between `base_lr` in `fine_tune` and `lr_max` in `fit_one_cycle`?

Hi Hao,

You can see how these two relate in the source code:

```
def fine_tune(self:Learner, epochs, base_lr=2e-3, freeze_epochs=1, lr_mult=100,
              pct_start=0.3, div=5.0, **kwargs):
    "Fine tune with `freeze` for `freeze_epochs` then with `unfreeze` from `epochs` using discriminative LR"
    self.freeze()
    self.fit_one_cycle(freeze_epochs, slice(base_lr), pct_start=0.99, **kwargs)
    base_lr /= 2
    self.unfreeze()
    self.fit_one_cycle(epochs, slice(base_lr/lr_mult, base_lr), pct_start=pct_start, div=div, **kwargs)
```

`fine_tune` calls `fit_one_cycle` twice, passing a slice object the second time so that `lr_max` is smaller (`base_lr/lr_mult`) for the earliest layer group and gradually increases to `base_lr` for the last layer group.
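As a minimal sketch of what that slice does (assuming fastai v2's behavior of spreading the learning rates log-uniformly across the parameter groups, and an illustrative group count of 3):

```
import numpy as np

# slice(base_lr/lr_mult, base_lr) becomes one LR per parameter group,
# spread log-uniformly from the earliest group to the last one.
base_lr, lr_mult, n_groups = 2e-3, 100, 3   # n_groups=3 is illustrative
lrs = np.geomspace(base_lr/lr_mult, base_lr, num=n_groups)
print(lrs)   # [2.e-05 2.e-04 2.e-03] -- smallest LR for the earliest layers
```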

`lr_max` is the maximum learning rate of the curve drawn by `fit_one_cycle`.
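If it helps to see that curve, here is a rough sketch of its shape, using `fit_one_cycle`'s default `pct_start=0.25`, `div=25.`, and `div_final=1e5`; this is my own approximation, not the library's exact scheduler code:

```
import numpy as np

# One-cycle LR curve (a sketch): cosine warm-up from lr_max/div to
# lr_max over pct_start of training, then cosine annealing down
# towards a much smaller final LR (lr_max/div_final).
def one_cycle(lr_max, steps, pct_start=0.25, div=25., div_final=1e5):
    warm = int(steps * pct_start)
    def cos_anneal(start, end, n):
        t = np.linspace(0, 1, n)
        return end + (start - end) * (1 + np.cos(np.pi * t)) / 2
    return np.concatenate([cos_anneal(lr_max/div, lr_max, warm),
                           cos_anneal(lr_max, lr_max/div_final, steps - warm)])

lrs = one_cycle(2e-3, steps=100)
print(lrs.max())   # 0.002 -- lr_max is the peak of the curve
```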

Hope this is clear; fastai’s source code is very readable. If not, let me know and we can explore a bit further.

K.


Thanks a lot.
