Hi @a_bhimany_u, I could not get your solution to work; please see the traceback below. As far as I understand, the problem is that `mom[i]` is a CPU tensor while the model (and therefore `p.grad`) has been moved to CUDA.

```
---------------------------------------------------------------------------
RuntimeError Traceback (most recent call last)
<ipython-input-37-36d1490bd9f3> in <module>
----> 1 losses = [update(x,y,lr) for x,y in data.train_dl]
<ipython-input-37-36d1490bd9f3> in <listcomp>(.0)
----> 1 losses = [update(x,y,lr) for x,y in data.train_dl]
<ipython-input-36-5e0bf8c84769> in update(x, y, lr, wd, beta1, beta2, epsilon)
18 with torch.no_grad():
19 for p in model.parameters():
---> 20 mom[i] = beta1*mom[i] + (1-beta1)*p.grad
21 rms[i] = beta2*rms[i] + (1-beta2)*(p.grad**2)
22 p.sub_(lr * (mom[i]/((rms[i] + epsilon)**0.5)))
RuntimeError: expected type torch.FloatTensor but got torch.cuda.FloatTensor
```
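
My best guess at a fix is to create the `mom` and `rms` buffers with `torch.zeros_like(p)`, which copies each parameter's device and dtype, so the buffers end up on CUDA whenever the model is there. Below is a minimal sketch of that idea, not your actual code: the `nn.Linear` model, the MSE loss, the random batch and the default hyperparameter values are placeholders I made up, and I left out `wd` to keep it short.

```python
import torch
from torch import nn

# Use the GPU when one is present; the sketch also runs on CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Hypothetical stand-in for the real model from the lesson.
model = nn.Linear(4, 2).to(device)

# zeros_like copies each parameter's device and dtype, so the buffers
# live wherever the model lives (this is the assumed fix).
mom = [torch.zeros_like(p) for p in model.parameters()]
rms = [torch.zeros_like(p) for p in model.parameters()]

def update(x, y, lr=1e-2, beta1=0.9, beta2=0.99, epsilon=1e-8):
    # Placeholder loss; the original update() may use a different one.
    loss = nn.functional.mse_loss(model(x), y)
    loss.backward()
    with torch.no_grad():
        for i, p in enumerate(model.parameters()):
            # Running averages of the gradient and the squared gradient.
            mom[i] = beta1 * mom[i] + (1 - beta1) * p.grad
            rms[i] = beta2 * rms[i] + (1 - beta2) * p.grad ** 2
            p.sub_(lr * mom[i] / (rms[i] + epsilon) ** 0.5)
            p.grad.zero_()
    return loss.item()

# Made-up batch, created on the same device as the model.
x = torch.randn(8, 4, device=device)
y = torch.randn(8, 2, device=device)
losses = [update(x, y) for _ in range(3)]
```

If the buffers are already created elsewhere as CPU tensors, moving them once with `.to(p.device)` before the loop should have the same effect.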

Could you please share your working solution?

Thanks