Statistical Techniques for Drift Detection - Learn Data Science with Travis

Statistical techniques for drift detection identify changes in data distributions over time. Data drift can degrade model performance and lead to inaccurate predictions if it goes unaddressed. Methods such as the Kolmogorov-Smirnov (KS) test, the Population Stability Index (PSI), and Kullback-Leibler (KL) divergence are used to monitor and quantify these changes, so that practitioners can notice when production data no longer matches the data a model was trained on and adapt the model to the new patterns. Understanding these methods is important for effective model management and deployment.
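As a rough illustration of two of the methods named above, here is a minimal, dependency-free Python sketch of the two-sample KS statistic and PSI. The function names, the equal-width binning, the `eps` floor for empty bins, and the synthetic Gaussian data are all assumptions made for this example, not taken from any of the articles listed on this page. In practice a library routine such as `scipy.stats.ks_2samp` would usually be preferred, since it also returns a p-value.

```python
import math
import random
from bisect import bisect_right


def ks_statistic(reference, current):
    """Two-sample KS statistic: largest gap between the two empirical CDFs."""
    ref = sorted(reference)
    cur = sorted(current)
    d = 0.0
    # Both empirical CDFs are step functions, so the largest gap
    # occurs at one of the observed values.
    for x in ref + cur:
        cdf_ref = bisect_right(ref, x) / len(ref)
        cdf_cur = bisect_right(cur, x) / len(cur)
        d = max(d, abs(cdf_ref - cdf_cur))
    return d


def psi(reference, current, bins=10, eps=1e-6):
    """Population Stability Index over equal-width bins of the reference range.

    Assumption: current values outside the reference range are clamped
    into the first or last bin, and empty bins are floored at `eps`.
    """
    lo, hi = min(reference), max(reference)
    width = (hi - lo) / bins or 1.0  # guard against a constant feature

    def proportions(values):
        counts = [0] * bins
        for v in values:
            idx = int((v - lo) / width)
            idx = min(max(idx, 0), bins - 1)
            counts[idx] += 1
        return [max(c / len(values), eps) for c in counts]

    p_ref = proportions(reference)
    p_cur = proportions(current)
    return sum((c - r) * math.log(c / r) for r, c in zip(p_ref, p_cur))


# Synthetic data (hypothetical): a reference sample, a fresh sample from
# the same distribution, and a sample whose mean has shifted by one sigma.
rng = random.Random(0)
reference = [rng.gauss(0.0, 1.0) for _ in range(2000)]
same = [rng.gauss(0.0, 1.0) for _ in range(2000)]
shifted = [rng.gauss(1.0, 1.0) for _ in range(2000)]

print("KS  same vs shifted:", ks_statistic(reference, same), ks_statistic(reference, shifted))
print("PSI same vs shifted:", psi(reference, same), psi(reference, shifted))
```

Both scores should stay near zero for the sample drawn from the unchanged distribution and rise clearly for the shifted one. The alert threshold is a separate choice that depends on sample size and on how sensitive the monitoring needs to be.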

Data Drift — Part 2: How to Detect Data Drift

Towards Data Science

A description of techniques to detect data drift. These include PSI, Kullback-Leibler (KL) divergence, Jensen-Shannon (JS) divergence, and Wasserstein distance.
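For orientation, the divergence measures listed here can be sketched in a few lines of plain Python. This is an illustrative sketch only, not code from the article: the function names and the example histograms are hypothetical, the KL and JS functions assume both inputs are discrete distributions over the same bins, and the Wasserstein function assumes two one-dimensional samples of equal size.

```python
import math


def kl_divergence(p, q):
    """KL(p || q) in nats for two discrete distributions on the same bins.

    Assumption: q[i] > 0 wherever p[i] > 0; otherwise the divergence is
    infinite and this sketch raises ZeroDivisionError.
    """
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def js_divergence(p, q):
    """Jensen-Shannon divergence: symmetric and bounded by ln(2)."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)


def wasserstein_1d(xs, ys):
    """1-Wasserstein distance between two equal-size 1D samples.

    For equal sample sizes this reduces to the mean absolute difference
    between the sorted values.
    """
    if len(xs) != len(ys):
        raise ValueError("this sketch assumes equal sample sizes")
    return sum(abs(a - b) for a, b in zip(sorted(xs), sorted(ys))) / len(xs)


# Hypothetical binned proportions for a training window and a production window.
p_train = [0.7, 0.2, 0.1]
p_prod = [0.4, 0.4, 0.2]

print("KL(train || prod):", kl_divergence(p_train, p_prod))
print("KL(prod || train):", kl_divergence(p_prod, p_train))  # differs: KL is asymmetric
print("JS:", js_divergence(p_train, p_prod))
print("Wasserstein:", wasserstein_1d([0.0, 1.0, 2.0], [1.0, 2.0, 3.0]))
```

KL divergence depends on which distribution is treated as the reference, while JS divergence is symmetric and stays finite when a bin is empty in one of the two distributions. Wasserstein distance is expressed in the units of the feature itself, which can make it easier to interpret for numeric features.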

Understanding Kolmogorov-Smirnov (KS) Tests for Data Drift on Profiled Data

Towards Data Science

TLDR: We experimented with statistical tests, Kolmogorov-Smirnov (KS) specifically, applied to full datasets as well as dataset profiles and compared r...

Measuring Embedding Drift

Towards Data Science

Approaches for measuring embedding/vector drift for unstructured data, including for computer vision and natural language processing models. Data drift in unstructured data like images...

How to Detect Data Drift with Hypothesis Testing

Towards Data Science

Data drift is a concern to anyone with a machine learning model serving live predictions. The world changes, and as the consumers’ tastes or demographics shift, the model starts receiving feature…

How to Build a Fully Automated Data Drift Detection Pipeline

Towards Data Science

Motivation: Data drift occurs when the distribution of input features in the production environment differs from the training data, leading to potential inaccuracies and decreased model performance. Im...

How to measure drift in ML embeddings

Towards Data Science

We evaluated five embedding drift detection methods. Why monitor embedding drift? When ML systems are in production, you often do not immediately get the ground truth labels. The mod...

SHAP for Drift Detection: Effective Data Shift Monitoring

Towards Data Science

Alerting Distribution Divergences using Model Knowledge

How to detect, evaluate and visualize historical drifts in the data

Towards Data Science

TL;DR: You can look at historical drift in data to understand how your data changes and choose the monitoring thresholds. Here is an example with Evidently, Plotly, MLflow, and some Python code. The…

How to Detect Concept Drift Without Labels

Towards Data Science

In a previous article, we explored the basics of concept drift. Concept drift occurs when the distribution of a dataset changes. This post continues to explore this topic. Here, you’ll learn how to d...

Understanding Concept Drift: A Simple Guide

Towards Data Science

Concept drift detection and adaptation is a key stage in the monitoring of AI-based systems. In this article, we’ll: describe what concept drift is and how it arises in time-dependent data; explore ver...

Data Drift Explainability: Interpretable Shift Detection with NannyML

Towards Data Science

Alerting Meaningful Multivariate Drift and ensuring Data Quality

Detecting and fixing data drift in Computer Vision

Towards Data Science

Practical case study with code that you can run