Natural Gradient Descent in fast.ai

I want to implement natural gradient descent for fast.ai. For this, I need to compute the Fisher Information Matrix of the weights quickly, and for that I need the squared gradient of each individual output w.r.t. the inputs (here, the weights). The problem is that PyTorch autograd returns the gradients already summed over the outputs, rather than one gradient per output.
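To make the problem concrete, here is a minimal sketch of the naive workaround: one `torch.autograd.grad` call per sample, which gives the individual gradients but is slow. The toy model, the data, and the helper name `per_sample_grads` are all made up for illustration. Note also that it uses the observed labels, so what it computes is the empirical Fisher, not the true Fisher with labels sampled from the model.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)

# Hypothetical toy model and data, only for illustration.
model = torch.nn.Linear(3, 2)      # 6 weights + 2 biases = 8 parameters
x = torch.randn(5, 3)              # 5 samples
y = torch.randint(0, 2, (5,))      # observed class labels


def per_sample_grads(model, x, y):
    """Return one flattened gradient per sample, shape (N, P).

    Runs a separate backward pass per sample, so autograd never gets
    the chance to sum the gradients over the batch.
    """
    params = list(model.parameters())
    rows = []
    for xi, yi in zip(x, y):
        loss = F.cross_entropy(model(xi.unsqueeze(0)), yi.unsqueeze(0))
        grads = torch.autograd.grad(loss, params)
        rows.append(torch.cat([g.reshape(-1) for g in grads]))
    return torch.stack(rows)


G = per_sample_grads(model, x, y)       # shape (5, 8)
fisher_diag = (G ** 2).mean(dim=0)      # diagonal of the empirical Fisher
fisher_full = G.T @ G / G.shape[0]      # full empirical Fisher, shape (8, 8)
```

Summing the rows of `G` gives back exactly what a single backward pass on the summed loss returns, which is the only thing autograd hands out by default. The cost here is one backward pass per sample, which is what I would like to avoid.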

Is it possible to change that behavior in PyTorch? If not, would it be possible in Swift for TensorFlow?