ReduceLROnPlateau and fit_one_cycle/fit_sgdr

Hi all,
When calling learn.fit_one_cycle, a param scheduler adjusts the learning rate as training goes.
At some point training reaches a plateau, so I tried adding ReduceLROnPlateau to reduce the learning rate. But it only changes the ‘lr’ hyper parameter, not the LR policy scheduler, so the change is immediately overwritten on the next batch and has no effect.
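To make the conflict concrete, here is a minimal pure-Python simulation (not fastai's actual code; the schedule shape and percentages are illustrative) of a one-cycle param scheduler running every batch. A plateau-style callback divides ‘lr’ by 10 mid-training, but the scheduler recomputes ‘lr’ from its own curve on the very next step, so the reduction vanishes:

```python
import math

def one_cycle_lr(pct, lr_max, div=25.0, div_final=1e5):
    # Simplified one-cycle shape: cosine warm-up from lr_max/div to lr_max
    # over the first 25% of training, then cosine anneal down to lr_max/div_final.
    if pct < 0.25:
        p, lo, hi = pct / 0.25, lr_max / div, lr_max
    else:
        p, lo, hi = (pct - 0.25) / 0.75, lr_max, lr_max / div_final
    return lo + (hi - lo) * (1 - math.cos(math.pi * p)) / 2

n_steps, lr_max = 100, 1e-2
hyper = {'lr': lr_max}

for step in range(n_steps):
    # A ReduceLROnPlateau-style callback fires and divides 'lr' by 10...
    if step == 60:
        hyper['lr'] /= 10
        reduced = hyper['lr']
    # ...but the scheduler runs every batch and overwrites 'lr' from its
    # precomputed schedule, so the reduction is lost immediately.
    hyper['lr'] = one_cycle_lr(step / n_steps, lr_max)

print(hyper['lr'])  # tracks the schedule; the /10 reduction left no trace
```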

Any ideas on how to address this and enable reducing the LR inside the scheduler in a generic way? Is anything already planned in this area?

A few questions regarding this:

  1. Does it matter to wait until the end of a cycle to determine that there is a plateau?
  2. Is it enough to reduce just the main ‘lr_max’ parameter, or are there other parameters worth changing?
  3. To pass the change on to the scheduler, should the scheduler be recreated, should its sched parameters be wrapped, or something else?
  4. lr_find() simulates training with a range of LR values. If it’s to be used to find an optimal LR for fit_one_cycle/fit_sgdr/others, should it use those schedules instead?
  5. What is actually the best value from LRFinder to pass as the initial lr / lr_max?

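On questions 1 and 3, one approach (a sketch only; `run_cycle` is a hypothetical stand-in for learn.fit_one_cycle that here just returns canned validation losses) is to treat each full cycle as the unit of work: only when a whole cycle fails to improve the monitored loss do we shrink lr_max and start a fresh cycle, recreating the scheduler rather than trying to patch it mid-run:

```python
# Canned "valid losses" standing in for real training results, for illustration.
canned = iter([0.90, 0.70, 0.65, 0.65, 0.64, 0.52])

def run_cycle(lr_max):
    # Hypothetical stand-in for learn.fit_one_cycle(n_epoch, lr_max=lr_max)
    # followed by reading the validation loss.
    return next(canned)

lr_max, factor, patience = 1e-2, 10.0, 1
best, bad_cycles = float('inf'), 0
history = []

for cycle in range(6):
    loss = run_cycle(lr_max)
    history.append((cycle, lr_max, loss))
    if loss < best - 1e-4:          # meaningful improvement this cycle
        best, bad_cycles = loss, 0
    else:                           # a whole cycle with no improvement
        bad_cycles += 1
        if bad_cycles >= patience:
            lr_max /= factor        # shrink the peak LR; the next cycle
            bad_cycles = 0          # gets a freshly built scheduler
```

Because the scheduler is rebuilt from lr_max at the start of each cycle, nothing fights over the ‘lr’ hyperparameter inside a cycle.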
Thanks! :smiley:

I am also facing the same issue. Have you figured out a way to address this?

Same problem for me as well. It seems that fit_one_cycle and ReduceLROnPlateau should be designed so that they work together. After all, I do this manually all the time and would like to automate it.
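One generic way the two could cooperate, sketched below with illustrative names (this is not fastai API): instead of letting the plateau callback overwrite ‘lr’, keep a multiplicative scale that is applied on top of whatever the schedule outputs. Since the one-cycle schedule is linear in lr_max, scaling its output is equivalent to scaling lr_max, and the reduction survives every subsequent scheduler step:

```python
import math

def cos_anneal(pct, start, end):
    # Cosine interpolation from start to end as pct goes 0 -> 1.
    return start + (end - start) * (1 - math.cos(math.pi * pct)) / 2

class ScaledScheduler:
    """Hypothetical schedule wrapper: plateau reductions adjust a scale
    factor rather than the 'lr' value the scheduler would overwrite."""
    def __init__(self, lr_max):
        self.lr_max, self.scale = lr_max, 1.0

    def lr_at(self, pct):
        # One-cycle shape: warm up for the first 25%, anneal for the rest.
        if pct < 0.25:
            raw = cos_anneal(pct / 0.25, self.lr_max / 25, self.lr_max)
        else:
            raw = cos_anneal((pct - 0.25) / 0.75, self.lr_max, self.lr_max / 1e5)
        return raw * self.scale     # plateau reductions survive the schedule

    def on_plateau(self, factor=10.0):
        self.scale /= factor        # called by a plateau-monitoring callback

sched = ScaledScheduler(1e-2)
before = sched.lr_at(0.5)
sched.on_plateau()
after = sched.lr_at(0.5)
print(before, after)  # 'after' is one tenth of 'before'
```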