One thing I still struggle with, even after working through the examples for a while, is the exact meaning of the parameters in
cut,lr_cut = model_meta[arch]
So far I understand that ‘cut’
is the number of modules in the pretrained model that make up the feature extractor (as opposed to the classifier part). This is where you cut the model - OK.
However, ‘lr_cut’ is less clear. I understand it is used when applying differential learning rates and when freezing/unfreezing weights, but the details escape me.
How exactly does it map to e.g. three differential learning rates?
E.g. for arch = resnet18
you get cut, lr_cut = 8, 6
. So, how does 6 map to three differential learning rates?
Thanks for any hint.