Learn.lr_find() gives different value

Why does learn.lr_find() followed by learn.recorder.plot() give a different value each time I run it, even if I don't run learn.fit_one_cycle() again? Please help.


Short answer: the data you are passing is different on each run because of the random data augmentations applied to the training set.
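You can see this directly by pulling a training batch twice. This is just a sketch in fastai v1 (learn being the Learner from the question; one_batch() draws from the training set by default):

import torch

# Because the train dataloader shuffles and applies random augmentations,
# two "first" batches are generally not identical: this is the randomness
# that lr_find() ends up seeing.
xb1, yb1 = learn.data.one_batch()
xb2, yb2 = learn.data.one_batch()
print(torch.equal(xb1, xb2))   # usually False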

Below I show the process I used to track this down. (Most of the steps were not strictly necessary, but it is a good general approach to debugging.)

There are two functions that could be responsible for this: learn.lr_find() and learn.recorder.plot(). The first step is to make sure that the state_dict of the model stays the same throughout. To check this I used the code below.

learn.model.state_dict()   # weights before doing anything

learn.lr_find()
learn.model.state_dict()   # weights after lr_find()

learn.recorder.plot()
learn.model.state_dict()   # weights after plotting
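Eyeballing a full state_dict printout is not practical, so here is a minimal sketch (plain PyTorch, same learn as above) of how the snapshots can be compared programmatically. Note the deep copy: without it the snapshot just aliases the live tensors.

import copy
import torch

before = copy.deepcopy(learn.model.state_dict())  # snapshot of the weights

learn.lr_find()
learn.recorder.plot()

after = learn.model.state_dict()

# True if every parameter and buffer is bit-for-bit identical afterwards.
print(all(torch.equal(before[k], after[k]) for k in before))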

The state_dict was the same in all three cases. The next step is to check the internal state that learn.recorder.plot() reads. The idea: if that internal state (more specifically, the recorded losses) is the same across runs, then the randomness comes from learn.recorder.plot(); otherwise it comes from learn.lr_find().

The reason for looking at the losses is that learn.recorder.plot() only manipulates the recorded losses, so it has no way to introduce randomness of its own. I used the code below to check the loss values.

learn.lr_find()
learn.recorder.losses   # losses from the first run

learn.lr_find()
learn.recorder.losses   # losses from the second run
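Rather than comparing the printed lists by eye, the two loss traces can be captured and compared directly. A sketch (in fastai v1, learn.recorder.losses is a list of tensors holding the smoothed loss per batch):

learn.lr_find()
losses_1 = [float(l) for l in learn.recorder.losses]

learn.lr_find()
losses_2 = [float(l) for l in learn.recorder.losses]

print(losses_1 == losses_2)          # are the two traces identical?
print(losses_1[:5], losses_2[:5])    # inspect the first few values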

As it turns out, the loss values were different in the two runs, so we know the randomness is introduced inside learn.lr_find(). The next step is to check its source code, which is:

start_lr = learn.lr_range(start_lr)
start_lr = np.array(start_lr) if is_listy(start_lr) else start_lr
end_lr = learn.lr_range(end_lr)
end_lr = np.array(end_lr) if is_listy(end_lr) else end_lr
cb = LRFinder(learn, start_lr, end_lr, num_it, stop_div)
epochs = int(np.ceil(num_it/len(learn.data.train_dl)))
learn.fit(epochs, start_lr, callbacks=[cb], wd=wd)
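Before blaming LRFinder, it is worth confirming that the learning-rate endpoints themselves are deterministic. A quick sketch: learn.lr_range() just expands a float or slice into per-layer-group values, so calling it twice with the same argument gives identical results.

print(learn.lr_range(1e-7), learn.lr_range(1e-7))   # a plain float comes back unchanged
print(learn.lr_range(slice(1e-7, 10)))              # a slice expands to per-group learning rates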

start_lr and end_lr are the same in all cases, so the problem must be in LRFinder. If we look at the source code of LRFinder, we see the following lines:

class LRFinder(LearnerCallback):
    "Causes `learn` to go on a mock training from `start_lr` to `end_lr` for `num_it` iterations."
    def __init__(self, learn:Learner, start_lr:float=1e-7, end_lr:float=10, num_it:int=100, stop_div:bool=True):
        super().__init__(learn)
        self.data,self.stop_div = learn.data,stop_div
        self.sched = Scheduler((start_lr, end_lr), num_it, annealing_exp)

Here we find the problem: self.data is not the same across runs, because the training batches go through random data augmentations (and the train dataloader is shuffled). Hence different loss values, and thus different graphs.
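One way to confirm this explanation (a sketch, not part of the original answer): fix the random seeds that drive shuffling and augmentation before each call, and the recorded losses should come out the same.

import random
import numpy as np
import torch

def seed_everything(seed=42):
    "Fix the RNGs behind shuffling and data augmentation."
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)
    torch.cuda.manual_seed_all(seed)

seed_everything(42)
learn.lr_find()
losses_a = [float(l) for l in learn.recorder.losses]

seed_everything(42)
learn.lr_find()
losses_b = [float(l) for l in learn.recorder.losses]

# Expect True; may still differ with multi-process data loading or
# non-deterministic cuDNN kernels.
print(losses_a == losses_b)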


If I run learn.lr_find() many times before learn.fit_one_cycle(), will it affect my final result?
Also, will it affect the accuracy of the model?

No it won't, as it's 'mock' training, so anything that happens to the weights is wiped the moment lr_find() is done.


Since self.data is not the same in LRFinder for every run and gives different graphs, I'll get a different best learning rate (lr) point each time I run lr_find(). Which lr should I choose then?

But I set no augmentation and I still got different plots. Can you please help me out with this problem? Thanks in advance!

Is there any follow-up to this question?

I was going to open another topic, but I realized this conversation already exists. In my case lr_find() gives a different lr_valley each time, and I think it affects the error_rate. I'm passing exactly the same data, and I need to tune the number of epochs to get the same result. Can anybody confirm this, or is it just my delusion?