True Weight Decay in OptimWrapper

True weight decay (multiplying the weights by 1 - wd*lr in OptimWrapper) must be applied only to the trainable parameters, i.e. when we are training the head with the body frozen, weight decay is not applied to the body parameters. Is this correct?

If only trainable parameters get true weight decay, where is this enforced in fastai? (Sorry, I got lost when digging through the source code.)

Hi Prabu,
1 - If the body is frozen, then weight decay is applied only to the head, not to the body.

2 - Re: where enforced - check the OptimWrapper class:

and also freeze/unfreeze:

My understanding is that the optimizer wrapper handles this itself: it applies the decay (or skips it) before the regular PyTorch Adam optimizer is called. wd is the setting for the optim wrapper, and weight_decay is what gets passed to PyTorch (only one of the two can be set, of course, not both).
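To make that ordering concrete, here is a rough sketch in plain Python. It is not fastai's actual code: Param, PlainSGD and TrueWDWrapper are hypothetical stand-ins I made up for illustration, and the requires_grad check is my assumption about how frozen parameters would be skipped. The wrapper shrinks each trainable weight by 1 - wd*lr first, then calls the inner optimizer, which does no decay of its own.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Param:
    """Hypothetical stand-in for a tensor parameter (a single scalar weight)."""
    value: float
    grad: float = 0.0
    requires_grad: bool = True  # False once the layer is frozen


class PlainSGD:
    """Toy inner optimizer: w <- w - lr * grad, with no weight decay of its own."""

    def __init__(self, params: List[Param], lr: float):
        self.params, self.lr = params, lr

    def step(self) -> None:
        for p in self.params:
            if p.requires_grad:
                p.value -= self.lr * p.grad


class TrueWDWrapper:
    """Hypothetical wrapper: decay the weights first, then call the inner optimizer."""

    def __init__(self, opt: PlainSGD, wd: float):
        self.opt, self.wd = opt, wd

    def step(self) -> None:
        for p in self.opt.params:
            if not p.requires_grad:
                continue  # assumption: frozen (body) parameters are skipped
            p.value *= 1 - self.wd * self.opt.lr
        self.opt.step()


body = Param(value=1.0, requires_grad=False)  # frozen body weight
head = Param(value=1.0, grad=0.5)             # trainable head weight
wrapper = TrueWDWrapper(PlainSGD([body, head], lr=0.1), wd=0.01)
wrapper.step()
print(body.value, head.value)
```

With these made-up numbers the body weight stays at 1.0, while the head weight is first scaled by 1 - 0.01*0.1 and then moved by the gradient step.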

See this thread for more info:

Hope that helps,

Thanks for the pointers and the info, Less. I will dig into this more. I had assumed the same thing you describe. My first dive into this, before posting the query on the forums, seemed to suggest (most likely a misunderstanding on my part) that true_wd was being applied to the body as well. I was using the oxford-iiit lesson-01 ipynb in part1-v3 (where the head is being trained with ResNet34). I am now fairly sure that this is a misunderstanding on my part. I will dive in again.
Thanks again for your reply.
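One way to check this empirically instead of reading the source is to snapshot the weights before and after a single optimizer step and see which ones moved. Below is a minimal sketch of that comparison in plain Python. The changed_params helper and the parameter names and values are hypothetical, made up for illustration; with a real PyTorch model the snapshots could be built from model.named_parameters() using detached copies of each tensor.

```python
from typing import Dict, List


def changed_params(before: Dict[str, List[float]],
                   after: Dict[str, List[float]],
                   tol: float = 0.0) -> List[str]:
    """Return the names of parameters whose values moved by more than tol."""
    return [name for name, old in before.items()
            if any(abs(a - b) > tol for a, b in zip(after[name], old))]


# Hypothetical snapshots taken before and after one optimizer step.
before = {"body.conv1": [0.50, -0.20], "head.linear": [0.10, 0.30]}
after = {"body.conv1": [0.50, -0.20], "head.linear": [0.0999, 0.2997]}

moved = changed_params(before, after)
print(moved)
```

If a frozen body layer shows up in the list, something (weight decay or otherwise) is still modifying it.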