Requires_grad, freezing, and BatchNorm

I see that in fastai, when layer groups are frozen, the BatchNorm layers are carefully left with requires_grad True.

If I explicitly use the PyTorch function to set “requires_grad(layer, False)”, the BatchNorm layers are set with requires_grad False. Is this going to affect training?
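For concreteness, here is a minimal PyTorch sketch of the two behaviours I mean (the helper names are just placeholders, not fastai APIs):

```python
import torch.nn as nn

_BN_TYPES = (nn.BatchNorm1d, nn.BatchNorm2d, nn.BatchNorm3d)

def freeze_all(module: nn.Module):
    # Naive freeze: every parameter, BatchNorm included, stops receiving gradients.
    for p in module.parameters():
        p.requires_grad_(False)

def freeze_except_bn(module: nn.Module):
    # Roughly what fastai seems to do when freezing layer groups:
    # freeze everything, then re-enable gradients on BatchNorm parameters.
    freeze_all(module)
    for m in module.modules():
        if isinstance(m, _BN_TYPES):
            for p in m.parameters():
                p.requires_grad_(True)
```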

In other words, can I freely use “requires_grad(layer, …)” to freeze and unfreeze layers?

Thank you!


You can, but you should use Learner.freeze to do this. The BatchNorm layers are left in training mode on purpose (unless you use the BnFreeze callback) because, in our experiments, that’s what worked best for transfer learning.
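A rough usage sketch, assuming a fastai v2-style vision learner (`dls` is an existing DataLoaders object; adjust the constructor to your version):

```python
from fastai.vision.all import *

# Freeze everything except the head; BatchNorm layers stay trainable.
learn = vision_learner(dls, resnet34, metrics=accuracy)
learn.freeze()

# To keep the BatchNorm layers frozen as well, add the BnFreeze callback.
learn_bn_frozen = vision_learner(dls, resnet34, metrics=accuracy, cbs=BnFreeze())
learn_bn_frozen.freeze()
```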
