I wrote a small e-book about applying validation techniques to different types of real-world datasets. It goes through short examples of how different data types have to be treated to avoid overfitting.
I touch on the following topics:
- train-test splits
- spatial validation
- temporal validation
- data drift in deployed models
This mini e-book is a reference guide for those who need a quick overview of the different pitfalls in real-world machine learning.
Mini e-book describing proper validation of machine learning models in real-world data regimes and how to keep them working in production settings.
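
As a rough illustration of why the splits listed above differ, here is a minimal sketch in plain Python. It is not an excerpt from the e-book; the function names (`random_split`, `temporal_split`, `group_split`) and the toy data are hypothetical and only meant to show the idea: random rows for independent data, a time cutoff for temporal data, and whole held-out regions for spatial data.

```python
import random


def random_split(n, test_fraction=0.25, seed=0):
    """Shuffle row indices and hold out a random fraction (fine for i.i.d. rows)."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    n_test = int(round(n * test_fraction))
    return sorted(idx[n_test:]), sorted(idx[:n_test])


def temporal_split(timestamps, test_fraction=0.25):
    """Train on the earliest rows and test on the latest, so no future data leaks in."""
    order = sorted(range(len(timestamps)), key=lambda i: timestamps[i])
    cut = len(order) - int(round(len(order) * test_fraction))
    return order[:cut], order[cut:]


def group_split(groups, test_groups):
    """Hold out whole groups (e.g. regions) so neighbouring rows never straddle the split."""
    train = [i for i, g in enumerate(groups) if g not in test_groups]
    test = [i for i, g in enumerate(groups) if g in test_groups]
    return train, test


# Hypothetical toy data: 8 rows, each with a timestamp and a region label.
timestamps = [3, 1, 4, 2, 8, 6, 7, 5]
regions = ["north", "north", "south", "south", "east", "east", "west", "west"]

rand_train, rand_test = random_split(len(timestamps))
time_train, time_test = temporal_split(timestamps)
geo_train, geo_test = group_split(regions, {"east"})

print("random  :", rand_train, rand_test)
print("temporal:", time_train, time_test)
print("spatial :", geo_train, geo_test)
```

With a random split, rows from the same period or region can end up on both sides, which tends to make test scores look better than they will be in production. The temporal and grouped splits trade some training data for a more honest estimate.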