p.data.add_(-lr * min(r1/r2,10) * step)
I have tried replacing it with: p.data.add_(-lr * r1/(r2+eps) * step)
It worked well on MNIST, if not better.
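
For anyone comparing the two, here is a rough sketch of how the scaling factors differ. It treats r1 and r2 as plain non-negative floats (in the optimizer they are presumably norms, which is an assumption on my part). The function names, the `cap` parameter, and the default `eps=1e-6` are made up for illustration and are not from the original code.

```python
def clipped_scale(r1: float, r2: float, cap: float = 10.0) -> float:
    """Scale factor from the first variant: r1 / r2, capped at `cap`.

    Note: with plain floats this raises ZeroDivisionError when r2 == 0.
    """
    return min(r1 / r2, cap)


def eps_scale(r1: float, r2: float, eps: float = 1e-6) -> float:
    """Scale factor from the second variant: r1 / (r2 + eps), no cap.

    `eps` (hypothetical default) keeps the division defined when r2 == 0.
    """
    return r1 / (r2 + eps)


if __name__ == "__main__":
    # Compare the two factors for an ordinary ratio, a tiny r2, and r1 == 0.
    for r1, r2 in [(1.0, 2.0), (5.0, 1e-3), (0.0, 1e-3)]:
        print(r1, r2, clipped_scale(r1, r2), eps_scale(r1, r2))
```

For ordinary values the two agree almost exactly. They diverge when r2 is very small: the first variant stops at the cap, while the second can grow up to roughly r1/eps, so the effective step can be much larger than 10 * lr.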