Hi all,
Could anyone kindly explain what this code snippet in the 004_callback notebook, inside the OptimWrapper, does?
for pg in self.opt.param_groups:
    for p in pg['params']: p.data.mul_(1 - self._wd*pg['lr'])
self.opt.param_groups
is a list of dictionaries, one per parameter group. Each group holds a set of parameters the optimizer is working to improve ('optimize'), along with that group's hyperparameters such as its learning rate.
We walk over these groups "pg" (parameter group) and grab each group's parameters by indexing into the dictionary with 'params'.
pg['params'] returns a list, and each "p" is a parameter: Parameter — PyTorch 1.8.1 documentation
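To make that structure concrete, here is a minimal sketch (the two-group setup and learning rates are illustrative, not from the notebook) showing what opt.param_groups contains:

```python
import torch

# Toy optimizer with two parameter groups at different learning rates
model = torch.nn.Linear(4, 2)
opt = torch.optim.SGD([
    {'params': [model.weight], 'lr': 0.1},
    {'params': [model.bias], 'lr': 0.01},
])

for pg in opt.param_groups:      # pg is a dict of hyperparameters
    for p in pg['params']:       # p is a torch.nn.Parameter
        print(pg['lr'], tuple(p.shape))
```

Each pg also carries keys like 'momentum' and 'weight_decay', which is why per-group hyperparameters work at all.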
At this point we have an individual parameter (not a whole group), and we access its underlying tensor with: p.data
p.data.mul_
is an in-place multiplication, which modifies the original tensor directly – so in this case, we are multiplying the weights of each parameter by: (1 - weight_decay*learning_rate). Shrinking the weights by this small factor on every step is exactly what weight decay is meant to do.
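Here is a small runnable sketch of that shrinking step (not the fastai implementation – the model, learning rate, and wd value are made up for illustration):

```python
import torch

torch.manual_seed(0)
model = torch.nn.Linear(4, 2)
opt = torch.optim.SGD(model.parameters(), lr=0.1)
wd = 0.01  # stands in for self._wd in the notebook

before = model.weight.data.clone()

# The snippet from the question: scale every weight in place
for pg in opt.param_groups:
    for p in pg['params']:
        p.data.mul_(1 - wd * pg['lr'])  # in place: p.data *= (1 - wd*lr)

# Every weight has been multiplied by the same factor, 1 - 0.01*0.1
print(torch.allclose(model.weight.data, before * (1 - wd * 0.1)))
```

Because mul_ has a trailing underscore it mutates the tensor rather than returning a new one, so the weights shrink without building any autograd history.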
Hope that helps!
As a side note, 004_callback looks like a development notebook – you can find the library's current callback notebooks in fastai/fastai/callback at master · fastai/fastai · GitHub
Thanks! I think this article will explain this weight-decay expression better, in case it confuses anyone else.