What does the term ‘data leakage’ mean in a machine learning context?

Data Science with Python Medium

Key points

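  • Data leakage leads to overly optimistic performance estimates during validation.
  • It occurs when information that would not be available at prediction time, such as data from the test set or a proxy for the target, influences training.
  • Models affected by data leakage often fail to generalize to new data, even though their validation scores look strong.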
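
One common way test-set information reaches training is preprocessing that is fitted on the full dataset before the split. The sketch below is a minimal illustration, not a definitive recipe: the toy `feature` values and the helper names `fit_scaler` and `transform` are hypothetical, and it uses only Python's standard `statistics` module. It contrasts scaling statistics computed on all rows (leaky) with statistics computed on the training rows only (safe).

```python
import statistics


def fit_scaler(values):
    """Return the (mean, standard deviation) used to standardize a feature."""
    return statistics.mean(values), statistics.stdev(values)


def transform(values, mean, std):
    """Standardize values with statistics that were fitted elsewhere."""
    return [(v - mean) / std for v in values]


# Hypothetical toy feature: the last row plays the role of the held-out test set.
feature = [10.0, 12.0, 14.0, 16.0, 40.0]
train, test = feature[:4], feature[4:]

# Leaky: the scaler sees the test row, so the training rows are transformed
# with information that would not exist at prediction time.
leaky_mean, leaky_std = fit_scaler(feature)
leaky_train = transform(train, leaky_mean, leaky_std)

# Safe: fit on the training rows only, then reuse those statistics on the test row.
safe_mean, safe_std = fit_scaler(train)
safe_train = transform(train, safe_mean, safe_std)
safe_test = transform(test, safe_mean, safe_std)

print("mean fitted on all rows:  ", leaky_mean)
print("mean fitted on train only:", safe_mean)
```

The two means differ because the leaky version has already "seen" the test row. In scikit-learn, the same separation is usually obtained by placing preprocessing steps inside a `Pipeline` and cross-validating the pipeline as a whole, so that each fold fits its transformers on that fold's training portion only.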
