When to use sliced learning rates for frozen models

ilovescience · April 3, 2019, 4:36am

Sometimes I see people use both a sliced learning rate (slice(lr)) and a single learning rate when working with frozen models. While using a sliced learning rate for an unfrozen model makes sense, what does it mean to use one for a frozen model? And when should one use a sliced learning rate for a frozen model rather than a single value? Does one approach consistently give better results than the other?

ste · April 3, 2019, 5:47am

From docs:

If you pass just slice(end) then the last group’s learning rate is end, and all the other groups are end/10. For instance (for our learner that has 3 layer groups):

https://docs.fast.ai/basic_train.html#Learner.lr_range
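To make the quoted rule concrete, here is a minimal sketch in plain Python. It is not fastai's own code: sketch_lr_range is a hypothetical stand-in that only covers the two cases discussed in this thread (a single value and slice(end)), and it assumes the number of layer groups is passed in explicitly.

```python
def sketch_lr_range(lr, n_groups):
    """Hypothetical stand-in for the rule quoted above; not fastai's own code."""
    if not isinstance(lr, slice):
        # A single value: every layer group gets the same learning rate.
        return [lr] * n_groups
    if lr.start is not None:
        # slice(start, end) is outside the scope of this sketch.
        raise NotImplementedError("only a single value or slice(end) is covered here")
    # slice(end): the last group gets end, every earlier group gets end/10.
    return [lr.stop / 10] * (n_groups - 1) + [lr.stop]


# Per-group learning rates for a learner with 3 layer groups.
print(sketch_lr_range(1e-3, 3))         # single value
print(sketch_lr_range(slice(1e-3), 3))  # sliced
```

The two lists agree on the last entry and differ only in the earlier groups, which is the part that matters for the frozen-model question.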

rohit_gr · April 3, 2019, 8:34am

I don’t think it will change anything. Different learning rates are assigned to different layer groups, so if all the layer groups are frozen except the last one, using lr or slice(lr) won’t make any difference: the last group gets lr either way.
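A rough way to check this argument, again in plain Python rather than fastai: list the per-group learning rates for both options and keep only the groups that are actually being trained. The helper effective_lrs and the trainable flags are hypothetical, and the sketch assumes every parameter outside the last group is truly frozen (it does not model any layers a library might leave trainable in earlier groups, such as batch norm).

```python
def effective_lrs(per_group_lrs, trainable):
    """Keep only the learning rates of groups whose parameters get updated."""
    return [lr for lr, is_trainable in zip(per_group_lrs, trainable) if is_trainable]


lr = 1e-3
n_groups = 3

single = [lr] * n_groups                    # what passing lr assigns
sliced = [lr / 10] * (n_groups - 1) + [lr]  # what passing slice(lr) assigns

frozen = [False, False, True]   # only the last group trains
unfrozen = [True, True, True]   # every group trains

# Frozen: both options reduce to the same learning rate for the last group.
print(effective_lrs(single, frozen) == effective_lrs(sliced, frozen))
# Unfrozen: the earlier groups now matter, so the two options differ.
print(effective_lrs(single, unfrozen) == effective_lrs(sliced, unfrozen))
```

Under that assumption the choice only starts to matter once the earlier groups are unfrozen.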