What is the Wasserstein distance (Earth Mover’s Distance) and why is it preferred over Jensen-Shannon divergence for GANs?

Machine Learning — Hard


Key points

  • The Wasserstein distance (Earth Mover’s Distance) measures the minimum cost of transporting probability mass to turn one distribution into the other
  • Because it reflects how far apart the distributions are, it provides meaningful, non-vanishing gradients for GAN training
  • Jensen-Shannon divergence saturates at a constant (log 2) whenever the two distributions have non-overlapping support, so its gradient vanishes — a common situation early in GAN training
  • Training a critic against the Wasserstein distance (as in WGAN) therefore yields more stable optimization and a loss that tracks sample quality
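The contrast is easy to see numerically. The sketch below (assuming NumPy and SciPy are available; `js_divergence` is a small helper written here for illustration, not a library function) places two point masses a distance `theta` apart on a grid: the Wasserstein distance grows linearly with `theta`, while the JS divergence stays pinned at log 2 ≈ 0.693 for every non-overlapping pair — exactly the saturation that starves a GAN of gradient signal.

```python
import numpy as np
from scipy.stats import wasserstein_distance

def js_divergence(p, q, eps=1e-12):
    """Jensen-Shannon divergence between two discrete distributions (in nats)."""
    p, q = np.asarray(p, float), np.asarray(q, float)
    m = 0.5 * (p + q)  # mixture distribution
    kl = lambda a, b: np.sum(np.where(a > 0, a * np.log((a + eps) / (b + eps)), 0.0))
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

# Two point masses on a 10-bin grid, separated by a gap `theta`
# (disjoint support, like a generator far from the data manifold).
support = np.arange(10)
for theta in (1, 3, 6):
    p = np.zeros(10); p[0] = 1.0        # all mass at position 0
    q = np.zeros(10); q[theta] = 1.0    # all mass at position theta
    w = wasserstein_distance(support, support, u_weights=p, v_weights=q)
    js = js_divergence(p, q)
    print(f"theta={theta}: W1={w:.2f}, JS={js:.4f}")
# theta=1: W1=1.00, JS=0.6931
# theta=3: W1=3.00, JS=0.6931
# theta=6: W1=6.00, JS=0.6931
```

W1 keeps decreasing as the distributions move closer, so a critic trained on it still produces a useful gradient; JS is flat at log 2 until the supports overlap.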
