Is lr_find modifying the weights of the model?

bmetge · February 8, 2019, 9:53am

Does calling lr_find affect the weights of the model?
I ask because many example notebooks seem to reload the model’s state as it was before calling lr_find, and only then train it further.

Thanks

balnazzar · February 8, 2019, 11:30am

The LR finder saves the current weights when it starts. It then runs several training iterations and records how the loss responds (this obviously alters the weights). Finally, it reloads the weights it saved at the start.
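The save, mock-train, restore pattern described above can be sketched in plain Python. This is only an illustration under stated assumptions, not fastai's code: `mock_lr_sweep`, `grad_fn` and `loss_fn` are hypothetical names, and the toy quadratic loss stands in for a real model.

```python
import copy
import math


def mock_lr_sweep(weights, lrs, grad_fn, loss_fn):
    """Run one SGD step per learning rate, then restore the original weights."""
    saved = copy.deepcopy(weights)      # snapshot taken before any update
    losses = []
    for lr in lrs:
        grads = grad_fn(weights)
        for k, g in enumerate(grads):
            weights[k] -= lr * g        # weights carry over to the next lr
        loss = loss_fn(weights)
        if not math.isfinite(loss):     # stop once the loss has blown up
            break
        losses.append(loss)
    weights[:] = saved                  # put the snapshot back in place
    return losses


# Toy problem (an assumption for illustration): loss(w) = sum(w_i^2)
w = [5.0, -3.0]
losses = mock_lr_sweep(
    w,
    lrs=[1e-3, 1e-2, 1e-1],
    grad_fn=lambda ws: [2 * x for x in ws],
    loss_fn=lambda ws: sum(x * x for x in ws),
)
print(w)        # unchanged after the sweep
print(losses)   # one recorded loss per learning rate
```

The caller sees the recorded losses, while `w` ends up exactly as it was before the sweep.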

(@bmetge: note that in English they say “modifying”. It would be better to correct the title)

tritemio · February 8, 2019, 12:07pm

On a related note, does lr_find() reset the weights for each learning rate?

Intuitively it should, so that losses across learning rates can be meaningfully compared. But from a quick look at the code, it seems that the optimization keeps running across the different learning rates without resetting the weights in between. The weights are reset only at the end:

github.com

fastai/fastai/blob/1.0.42/fastai/callbacks/lr_finder.py#L9


"Tools to help find the optimal learning rate for training"
from ..torch_core import *
from ..basic_data import DataBunch
from ..callback import *
from ..basic_train import Learner, LearnerCallback


__all__ = ['LRFinder']


class LRFinder(LearnerCallback):
    "Causes `learn` to go on a mock training from `start_lr` to `end_lr` for `num_it` iterations."
    def __init__(self, learn:Learner, start_lr:float=1e-7, end_lr:float=10, num_it:int=100, stop_div:bool=True):
        super().__init__(learn)
        self.data,self.stop_div = learn.data,stop_div
        self.sched = Stepper((start_lr, end_lr), num_it, annealing_exp)
        #To avoid validating if the train_dl has less than num_it batches, we put aside the valid_dl and remove it
        #during the call to fit.
        self.valid_dl = learn.data.valid_dl
        self.data.valid_dl = None

Am I missing something? Is this how it is supposed to work?
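For reference, the schedule set up by the `Stepper((start_lr, end_lr), num_it, annealing_exp)` line can be approximated as below. This sketch assumes that `annealing_exp` interpolates geometrically between the two rates and that the step fraction runs from `1/num_it` to `1`; `exp_schedule` is a hypothetical helper, not part of fastai.

```python
def exp_schedule(start_lr=1e-7, end_lr=10.0, num_it=100):
    """Geometrically spaced learning rates, one per mock-training iteration."""
    ratio = end_lr / start_lr
    # Assumed form: start * (end / start) ** pct, with pct = i / num_it
    return [start_lr * ratio ** (i / num_it) for i in range(1, num_it + 1)]


lrs = exp_schedule()
print(len(lrs), lrs[0], lrs[-1])
```

Each learning rate is a constant multiple of the previous one, so consecutive batches see only slightly different rates.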

balnazzar · February 8, 2019, 1:56pm

Looking at the code I suppose you are right. Better to wait for a more informed answer, though.

(I agree with you when you say they should be reset at each lr increment step to obtain a more meaningful response)

PiotrCh · December 9, 2020, 2:34am

Because it starts from a very low learning rate and performs only a few (100) iterations,
it probably doesn’t matter statistically whether you reset the weights or not.
Resetting the weights would also make the process longer.
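The cost argument can be made concrete with a toy comparison. This is a sketch under assumptions, not a measurement of fastai: both functions are hypothetical, the model is a single scalar weight with loss `w**2`, and `steps_per_lr` is an illustrative choice for how long each learning rate would be tried after a reset.

```python
def sweep_without_reset(w0, lrs, grad, loss):
    """One update per learning rate; the weight carries over between rates."""
    w, losses, steps = w0, [], 0
    for lr in lrs:
        w = w - lr * grad(w)
        steps += 1
        losses.append(loss(w))
    return losses, steps


def sweep_with_reset(w0, lrs, grad, loss, steps_per_lr=5):
    """Restart from w0 for every learning rate, then take a few updates."""
    losses, steps = [], 0
    for lr in lrs:
        w = w0                          # reset to the initial weight
        for _ in range(steps_per_lr):
            w = w - lr * grad(w)
            steps += 1
        losses.append(loss(w))
    return losses, steps


grad = lambda w: 2 * w                  # gradient of the toy loss w**2
loss = lambda w: w * w
rates = [1e-3, 1e-2, 1e-1]

_, steps_plain = sweep_without_reset(5.0, rates, grad, loss)
_, steps_reset = sweep_with_reset(5.0, rates, grad, loss)
print(steps_plain, steps_reset)
```

With a reset, the number of gradient steps scales with `len(rates) * steps_per_lr` rather than `len(rates)`, which is the extra cost mentioned above.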