Two Types of Drift
- Data Drift: Input feature distribution changes (P(X) shifts)
- Concept Drift: The relationship between features and target changes (P(Y|X) shifts)
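Data drift is easier to detect because it needs only the input features, not labels. Concept drift can only be confirmed with labeled data, which often arrives with a delay.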
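The difference between the two types can be made concrete with a small simulation. The sketch below is illustrative only: the single Gaussian feature, the threshold labeling rule, and all names (`sample_inputs`, `label`) are assumptions, not part of any real pipeline.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is reproducible


def sample_inputs(mean, n=5000):
    """Draw n values of one synthetic feature X ~ Normal(mean, 1)."""
    return [random.gauss(mean, 1.0) for _ in range(n)]


def label(x, threshold):
    """Toy rule standing in for P(Y|X): Y = 1 when x exceeds the threshold."""
    return 1 if x > threshold else 0


# Reference period: X centered at 0, rule threshold at 1.0.
ref_x = sample_inputs(mean=0.0)
ref_y = [label(x, 1.0) for x in ref_x]

# Data drift: the inputs move (mean 0 -> 1), the labeling rule is unchanged.
data_drift_x = sample_inputs(mean=1.0)
data_drift_y = [label(x, 1.0) for x in data_drift_x]

# Concept drift: the inputs are unchanged, the rule moves (threshold 1.0 -> 0.0).
concept_drift_x = sample_inputs(mean=0.0)
concept_drift_y = [label(x, 0.0) for x in concept_drift_x]

ref_mean = statistics.mean(ref_x)
data_drift_mean = statistics.mean(data_drift_x)
concept_drift_mean = statistics.mean(concept_drift_x)

for name, xs, ys in [
    ("reference", ref_x, ref_y),
    ("data drift", data_drift_x, data_drift_y),
    ("concept drift", concept_drift_x, concept_drift_y),
]:
    print(f"{name:14s} mean(X)={statistics.mean(xs):.2f}  rate(Y=1)={statistics.mean(ys):.2f}")
```

In the concept drift case the input statistics look the same as the reference, so a monitor that only watches the features would not notice the change. Only the labels reveal it.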
Detection with Evidently AI
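# Uses the TestSuite API from Evidently 0.4.x; later releases changed the import paths.
from evidently.test_suite import TestSuite
from evidently.tests import TestColumnDrift

# ref_df and cur_df are pandas DataFrames with the same columns:
# ref_df holds reference (e.g. training) data, cur_df holds recent production data.
test_suite = TestSuite(tests=[
    TestColumnDrift(column_name='TransactionAmt'),
    TestColumnDrift(column_name='card1'),
])
test_suite.run(reference_data=ref_df, current_data=cur_df)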
Statistical Tests
- KS Test: Continuous features
- Chi-squared: Categorical features
- PSI (Population Stability Index): Both (continuous features are binned first); PSI > 0.2 = critical drift
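As a rough illustration of how PSI is computed, here is a minimal standard-library sketch for a numeric feature. It uses the usual formula, the sum over bins of (current share - reference share) * ln(current share / reference share). The function name `psi`, the choice of quantile bins taken from the reference sample, and the `eps` floor for empty bins are assumptions of this sketch rather than a fixed standard.

```python
import math
import random
from bisect import bisect_right


def psi(reference, current, n_bins=10, eps=1e-6):
    """Population Stability Index between two samples of a numeric feature."""
    ref_sorted = sorted(reference)
    # Interior cut points at the 1/n_bins, 2/n_bins, ... quantiles of the reference.
    edges = [ref_sorted[int(len(ref_sorted) * i / n_bins)] for i in range(1, n_bins)]

    def bin_fractions(values):
        counts = [0] * n_bins
        for v in values:
            counts[bisect_right(edges, v)] += 1
        # Floor at eps so an empty bin cannot cause log(0) or division by zero.
        return [max(c / len(values), eps) for c in counts]

    ref_frac = bin_fractions(reference)
    cur_frac = bin_fractions(current)
    return sum((c - r) * math.log(c / r) for r, c in zip(ref_frac, cur_frac))


# Synthetic data for demonstration only.
random.seed(0)
reference = [random.gauss(0, 1) for _ in range(10_000)]
same = [random.gauss(0, 1) for _ in range(10_000)]      # same distribution
shifted = [random.gauss(1, 1) for _ in range(10_000)]   # mean moved by one std dev

psi_same = psi(reference, same)
psi_shifted = psi(reference, shifted)
print(f"PSI, same distribution:    {psi_same:.4f}")
print(f"PSI, shifted distribution: {psi_shifted:.4f}")
```

For the other two tests, SciPy provides `scipy.stats.ks_2samp` (KS test) and `scipy.stats.chi2_contingency` (chi-squared), so they rarely need to be written by hand.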
Alerting Strategy
- PSI > 0.1: Warning (monitor closely)
- PSI > 0.2: Alert (schedule retraining)
- AUC drop > 3%: Emergency retrain
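The thresholds above can be wired into a small decision function. This is a sketch under stated assumptions: the names `Action` and `drift_action` are hypothetical, and the "3%" AUC drop is interpreted as an absolute drop of 0.03, since the text does not say whether the figure is absolute or relative.

```python
from enum import Enum


class Action(Enum):
    OK = "ok"
    WARNING = "warning: monitor closely"
    ALERT = "alert: schedule retraining"
    EMERGENCY = "emergency retrain"


def drift_action(psi_value, auc_drop=0.0,
                 psi_warn=0.1, psi_alert=0.2, auc_emergency=0.03):
    """Map monitoring metrics to an action, checking the most severe rule first.

    auc_drop is baseline AUC minus current AUC (assumed absolute, not relative).
    """
    if auc_drop > auc_emergency:
        return Action.EMERGENCY
    if psi_value > psi_alert:
        return Action.ALERT
    if psi_value > psi_warn:
        return Action.WARNING
    return Action.OK


# Example: moderate feature drift, no measurable performance loss yet.
print(drift_action(psi_value=0.15).value)
# Example: small feature drift, but AUC fell from 0.90 to 0.85.
print(drift_action(psi_value=0.05, auc_drop=0.90 - 0.85).value)
```

Checking the AUC rule first means a performance drop escalates to an emergency retrain even when PSI is still below the warning level, which is the situation concept drift tends to produce.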