Interpretation of multiple learning rates

When you guys set:

and say: "the first few layers will be at 1e-4, the middle layers at 1e-3, and our FC layers we’ll leave at 1e-2 as before"

I don’t understand why you refer to the resnet34 model as if it has only 3 layers. resnet34 has many layers; let’s assume it has 20. Which layers then receive an lr of 1e-4, which receive 1e-3, and which receive 1e-2?

And how can I check/display each layer and its learning rate?


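These different learning rates are applied to groups of layers, not individual layers. So ResNet34 is divided up into 3 groups and each group gets its own learning rate: the group with the earliest layers gets 1e-4, the middle group gets 1e-3, and the final group (the FC layers) gets 1e-2. Every layer inside a group trains with that group's rate.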
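
To make the grouping concrete, here is a minimal pure-Python sketch using the 20-layer assumption from the question. The layer names, the split points (8, 16, 20) and the helper `assign_learning_rates` are all made up for illustration; this is not the library's code and not the real ResNet34 split. It only shows how three learning rates can cover many layers, and how you could print each layer next to its rate.

```python
# Hypothetical 20-layer network; these names are placeholders,
# not real ResNet34 modules.
layers = [f"layer_{i:02d}" for i in range(20)]

# Assumed split points: the index where each group ends (exclusive).
# Chosen only for illustration, not taken from any library.
split_points = [8, 16, 20]
learning_rates = [1e-4, 1e-3, 1e-2]


def assign_learning_rates(layers, split_points, learning_rates):
    """Map each layer name to the learning rate of the group it falls into."""
    if len(split_points) != len(learning_rates):
        raise ValueError("need exactly one learning rate per group")
    lr_by_layer = {}
    start = 0
    for end, lr in zip(split_points, learning_rates):
        # Every layer between the previous split and this one shares one rate.
        for name in layers[start:end]:
            lr_by_layer[name] = lr
        start = end
    return lr_by_layer


lr_by_layer = assign_learning_rates(layers, split_points, learning_rates)

# Display each layer with the learning rate of its group.
for name, lr in lr_by_layer.items():
    print(f"{name}: lr={lr:g}")
```

If you are working with a plain PyTorch optimizer, the same per-group information lives in `optimizer.param_groups`, where each entry is a dict with its own `"lr"` and `"params"`.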