WGAN-GP huge gradient/loss magnitudes

I’ve implemented a minimal WGAN-GP on MNIST (code here). It more or less works and outputs some digits, but the loss and gradient magnitudes are huge. For example:

  • GradientPenalty \approx 10^{19}

  • D(\hat{x}) - D(x) \approx -10^{10}

  • -D(\hat{x}) \approx 10^{10}

Something is definitely wrong with these numbers, but I have no clue what. Does anyone have any ideas?
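
For reference, this is the objective I believe the three quantities above correspond to, written as a sketch rather than a transcription of my code. I keep \hat{x} for generated samples, as in the list above. The symbols \bar{x} (interpolated sample), \epsilon (mixing coefficient), and \lambda (penalty weight) are notation I am introducing here; the original WGAN-GP paper uses \lambda = 10.

```latex
% Critic loss: Wasserstein term plus gradient penalty
L_D = \mathbb{E}\big[D(\hat{x})\big] - \mathbb{E}\big[D(x)\big]
      + \lambda \, \mathbb{E}\Big[\big(\lVert \nabla_{\bar{x}} D(\bar{x}) \rVert_2 - 1\big)^2\Big]

% Generator loss
L_G = -\mathbb{E}\big[D(\hat{x})\big]

% Interpolated sample used in the penalty term
\bar{x} = \epsilon x + (1 - \epsilon)\hat{x}, \qquad \epsilon \sim U[0, 1]
```

If the reported penalty already includes \lambda = 10 (an assumption), a value of about 10^{19} would mean the critic's gradient norm at \bar{x} is on the order of 10^{9}, whereas the penalty is supposed to push that norm toward 1.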