Improved training of Wasserstein GANs

Hi everyone!

A new paper that looks promising came out today on arXiv.

This article is about improving the training process of WGANs. After multiple tests, the authors noticed that the main instabilities in WGAN training were due to weight clipping. In the original WGAN article, weight clipping is used to enforce a Lipschitz constraint on the critic, which is what the theoretical results require. In this new paper, the authors propose an alternative way of enforcing it (penalizing the norm of the gradient of the critic with respect to its input), and it appears to work a lot better.
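For anyone who wants to see the idea in code, here is a minimal sketch of a gradient penalty in plain Python. It is not the authors' implementation. The tiny critic `f(x) = v . tanh(W x)`, the helper names, the two-sided form `(norm - 1)^2`, the random interpolation between real and generated samples, and the default `lam=10.0` are all my reading of the approach and are assumptions for illustration. In a real framework you would get the input gradient from autodiff instead of writing it by hand.

```python
import math
import random

def critic(x, W, v):
    """Toy critic (hypothetical): f(x) = v . tanh(W x)."""
    hidden = [math.tanh(sum(w * xj for w, xj in zip(row, x))) for row in W]
    return sum(vi * hi for vi, hi in zip(v, hidden))

def critic_input_grad(x, W, v):
    """Analytic gradient of the toy critic with respect to its input x."""
    pre = [sum(w * xj for w, xj in zip(row, x)) for row in W]
    # d/dx_j of v_i * tanh(pre_i) is v_i * (1 - tanh(pre_i)^2) * W[i][j]
    scale = [vi * (1.0 - math.tanh(p) ** 2) for vi, p in zip(v, pre)]
    return [sum(scale[i] * W[i][j] for i in range(len(W))) for j in range(len(x))]

def gradient_penalty(real, fake, W, v, lam=10.0, rng=random):
    """lam * mean((||grad_x f(x_hat)||_2 - 1)^2) over interpolated points x_hat."""
    total = 0.0
    for x_real, x_fake in zip(real, fake):
        eps = rng.random()  # mixing coefficient in [0, 1)
        x_hat = [eps * a + (1.0 - eps) * b for a, b in zip(x_real, x_fake)]
        grad = critic_input_grad(x_hat, W, v)
        norm = math.sqrt(sum(g * g for g in grad))
        total += (norm - 1.0) ** 2
    return lam * total / len(real)

# Usage: the penalty is added to the critic loss instead of clipping W and v.
W = [[0.5, -1.0], [1.5, 0.2]]
v = [1.0, -0.5]
real = [[1.0, 2.0], [0.5, -0.3]]
fake = [[0.0, 1.0], [-1.0, 0.4]]
print(gradient_penalty(real, fake, W, v, rng=random.Random(0)))
```

The point is that the constraint shows up as an extra term in the critic's loss, so the weights themselves are never touched directly.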

Happy reading!


Also easier to add to standard approaches, I’d guess.

It makes intuitive sense to me that a penalty with a nice smooth scaling would be better than a binary threshold on weights. I remember thinking it seemed like an odd way to do it when I read the paper, but never really thought about it further…
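To make that contrast concrete, here is a toy sketch of the two mechanisms side by side. The function names are made up for illustration, and the defaults (`c=0.01` for the clipping range, `lam=10.0` for the penalty weight) are assumed values, not something taken from this thread.

```python
def clip_weight(w, c=0.01):
    """Hard constraint: any weight outside [-c, c] is snapped to the boundary."""
    return max(-c, min(c, w))

def norm_penalty(grad_norm, lam=10.0):
    """Soft constraint: cost grows quadratically as the gradient norm leaves 1."""
    return lam * (grad_norm - 1.0) ** 2

# Clipping discards how far a weight was outside the range...
print([clip_weight(w) for w in (-0.5, 0.005, 0.02, 3.0)])
# ...while the penalty keeps scaling with the size of the violation.
print([norm_penalty(n) for n in (0.5, 1.0, 1.1, 3.0)])
```

With clipping, a weight of 0.02 and a weight of 3.0 end up identical, whereas the penalty still distinguishes a small violation from a large one and gives the optimizer a gradient to follow.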

Great find! Thanks for sharing.

Cool! They also put the code on GitHub.

Excuse me, what does ‘smooth scaling’ mean in this context? Is it something like a Lagrange multiplier?