Hey all,

I was trying to submit a PR for the QHAdam optimizer (in 12_optimizer.ipynb), but I noticed the last test for `average_grad` is failing. I have been trying to fix it so that I can submit, but I'm blocked. Any help would be super appreciated!

This is the offending test:

…

test_eq(state['grad_avg'], (0.1 * 0.9 + 0.1) * p.grad)

However, when I dig into it I can't for the life of me figure out why it's failing, as the two tensors being compared appear to be identical:

AssertionError: ==:

tensor([[0.7600, 0.9500, 1.1400]])

tensor([[0.7600, 0.9500, 1.1400]])

I have traced it back to `torch.equal()`, but I have hit a brick wall. Manually checking that both tensors are equal returns True, as expected:

torch.equal(tensor([[0.7600, 0.9500, 1.1400]]), tensor([[0.7600, 0.9500, 1.1400]]))

True
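One thing I did start to wonder (pure speculation on my part): when I retype the printed values into `torch.equal`, I'm comparing tensors built from the rounded four-decimal-place output, not the original tensors, so any difference in the low bits would be invisible. The classic plain-Python float example shows the effect:

```python
# Two floats that print identically at 4 decimal places but are not equal.
a = 0.1 + 0.2   # accumulates a tiny rounding error
b = 0.3

print(f"{a:.4f}")  # 0.3000
print(f"{b:.4f}")  # 0.3000
print(a == b)      # False: a is actually 0.30000000000000004
```

If that's what's happening here, the real `grad_avg` tensor and `(0.1 * 0.9 + 0.1) * p.grad` could differ in the last few bits even though both print as the same values.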

To add to my confusion, the other tests work without any issues…

p = tst_param([1,2,3], [4,5,6])

state = {}

state = average_grad(state, p, mom=0.9)

test_eq(state['grad_avg'], p.grad)

state = average_grad(state, p, mom=0.9)

test_eq(state[‘grad_avg’], p.grad * 1.9)

# Test dampening

state = {}

state = average_grad(state, p, mom=0.9, dampening=True)

test_eq(state['grad_avg'], 0.1 * p.grad)

state = average_grad(state, p, mom=0.9, dampening=True)

test_eq(state['grad_avg'], (0.1 * 0.9 + 0.1) * p.grad)

Is this some hidden PyTorch or Python voodoo that is tripping me up?