I can use differential learning rates by passing a slice to the optimizer, like this:
from functools import partial
from fastai.vision.all import *

opt_func = partial(Adam, lr=slice(lr / 100, lr), wd=0.01, eps=1e-8)
learn = Learner(dls, model, opt_func=opt_func)
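For context, my understanding (an assumption about fastai internals, not something I've verified in the docs) is that when the optimizer receives `lr=slice(lo, hi)`, fastai spreads the learning rates geometrically across the parameter groups via its `even_mults` helper. A minimal pure-Python sketch of that spacing:

```python
# Geometric spacing of learning rates across n parameter groups,
# mimicking what I assume fastai's even_mults(start, stop, n) does.
def even_mults(start, stop, n):
    if n == 1:
        return [stop]
    mult = stop / start
    step = mult ** (1 / (n - 1))
    return [start * step**i for i in range(n)]

# With lr = 1e-3 and slice(lr / 100, lr) over 3 parameter groups:
lrs = even_mults(1e-5, 1e-3, 3)
# earliest layers get the smallest lr, the head gets the largest
```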
How can I make differential learning rate training work if I am already using the ParamScheduler callback?
For example:
cos_sched = {'lr': SchedCos(1e-3, 1e-5) }
learn = Learner(
dls, model, opt_func=opt_func,
cbs=[ParamScheduler(cos_sched)]
)
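For reference, SchedCos(start, end) anneals a hyperparameter from start to end along a cosine curve as training progresses. A pure-Python sketch of that schedule (the exact formula here is my assumption about what fastai's sched_cos computes):

```python
import math

# Cosine annealing from start to end as pct goes 0 -> 1
# (assumed to match fastai's SchedCos / sched_cos).
def sched_cos(start, end, pct):
    return start + (1 + math.cos(math.pi * (1 - pct))) * (end - start) / 2

lr_begin = sched_cos(1e-3, 1e-5, 0.0)  # 1e-3 at the start of training
lr_end = sched_cos(1e-3, 1e-5, 1.0)    # 1e-5 at the end
```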
What I would like to do is construct the ParamScheduler like this:
cos_sched = {'lr': SchedCos(slice(1e-3 / 100, 1e-3), slice(1e-5 / 100, 1e-5))}
ParamScheduler(cos_sched)
But this throws a runtime error. What is the correct API call?
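One workaround I'm considering (an assumption on my part, not a confirmed fastai API): since ParamScheduler just calls each sched as f(pct), and I believe Optimizer.set_hyper can accept a list with one value per parameter group, a plain callable that returns one scheduled value per group might work. A pure-Python sketch of that combiner, with a stand-in for the fastai SchedCos object:

```python
import math

# Stand-in for fastai's SchedCos (formula is my assumption).
def SchedCos(start, end):
    def f(pct):
        return start + (1 + math.cos(math.pi * (1 - pct))) * (end - start) / 2
    return f

# Combine one schedule per parameter group into a single callable that
# ParamScheduler could invoke as f(pct); the returned list would then be
# handed to opt.set_hyper('lr', ...) (assumed to accept per-group lists).
def per_group_sched(*scheds):
    def f(pct):
        return [s(pct) for s in scheds]
    return f

cos_sched = {'lr': per_group_sched(SchedCos(1e-3 / 100, 1e-5 / 100),
                                   SchedCos(1e-3, 1e-5))}
# cos_sched['lr'](0.0) gives the per-group lrs at the start of training
```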