Accessing gradients during training

Hello all,

Hopefully this is a simple question, but I’m trying to access the gradients of my network in order to create charts similar to what Sylvain did here. I’d be happy just to get the absolute value of the gradients for each layer of my network.

Note: I’m not after the weights (via, say, something like learn.model.parameters(), which I often see referenced on the PyTorch forums when people ask about gradients), but rather the actual gradients for each mini-batch as we train. I apologise in advance if I’ve misunderstood something.

Thanks,

Mark


For each of your parameters p (in the iterator learn.model.parameters()) you can access its gradient in p.grad.
Note that this has to be done before the grads are zeroed, so if you’re in a Callback you’d want to do it in on_backward_end or on_step_end.
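To make that concrete, here is a minimal sketch of such a callback, assuming the fastai v1 callback API (LearnerCallback and on_backward_end); the class name GradientStats and its stats attribute are just illustrative names, not part of the library:

```python
from fastai.basic_train import LearnerCallback

class GradientStats(LearnerCallback):
    "Record the mean absolute gradient of each parameter after every backward pass."
    def on_train_begin(self, **kwargs):
        self.stats = []  # one entry per mini-batch

    def on_backward_end(self, **kwargs):
        # p.grad is still populated at this point; after the optimizer
        # step the grads get zeroed, so this is the place to read them.
        self.stats.append([p.grad.abs().mean().item()
                           for p in self.learn.model.parameters()
                           if p.grad is not None])
```

You’d then pass it when fitting, e.g. learn.fit_one_cycle(1, callbacks=[GradientStats(learn)]), and plot the recorded values in self.stats afterwards.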
