Interesting paper. https://arxiv.org/pdf/2003.02218
I have been trying it, and it is showing interesting results… the jury is still out on the generalization, BUT it's not as if I am getting good generalization with other optimizers in my line of work anyway.