Machine Learning — Hard
Key points
- Wasserstein distance calculates transport cost between distributions
- Provides meaningful gradients for GAN training
- Jensen-Shannon divergence saturates with non-overlapping support
- Wasserstein distance enables stable GAN training
Ready to go further?
Related questions
