I have a question about the implementation of Adam:
Thanks,
The debias function is used for debias2 as well, and over there it doesn't cancel out, so it makes sense to keep the more general function.
debias
debias2
In the debias2 call, mom = sqr_mom and damp = 1 - sqr_mom, so it is the same as the debias1 call: damp and 1 - mom still cancel out. You may want to have a look.
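
To make the cancellation concrete, here is a quick numeric check. It is only a sketch: the thread does not show the body of debias, so the general form damp * (1 - mom**step) / (1 - mom) is my assumption, and debias_simple and the values of mom, sqr_mom and step are made up for illustration.

```python
import math

def debias(mom, damp, step):
    # Assumed general form of the debias term (not shown in this thread).
    return damp * (1 - mom**step) / (1 - mom)

def debias_simple(mom, step):
    # Hypothetical simplified form, valid only when damp == 1 - mom.
    return 1 - mom**step

# Illustrative values, not taken from the thread.
mom, sqr_mom, step = 0.9, 0.99, 10

# debias1 call: damp = 1 - mom, so damp and (1 - mom) cancel.
debias1 = debias(mom, 1 - mom, step)
# debias2 call: mom = sqr_mom, damp = 1 - sqr_mom, so they cancel again.
debias2 = debias(sqr_mom, 1 - sqr_mom, step)

print(math.isclose(debias1, debias_simple(mom, step)))
print(math.isclose(debias2, debias_simple(sqr_mom, step)))

# Without dampening (damp = 1) nothing cancels and the two forms differ.
undamped = debias(mom, 1.0, step)
print(math.isclose(undamped, debias_simple(mom, step)))
```

Under that assumption both calls reduce to 1 - mom**step, and the general form would only matter if damp were set to something other than 1 - mom (for example damp = 1).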