Frouros is a Python library for drift detection in Machine Learning problems.
It combines classical and more recent drift detection algorithms, covering both concept drift and data drift.
As a quick example, we can draw samples from two normal distributions and compare them with a data drift detector such as the Kolmogorov-Smirnov test. The test's null hypothesis (H0) is that both samples come from the same distribution. If H0 is rejected at the chosen significance level, the samples are taken to come from different distributions, which indicates data drift.
import numpy as np
from frouros.data_drift.batch import KSTest
np.random.seed(31)
# X samples from a normal distribution with mean=2 and std=2
x_mean = 2
x_std = 2
# Y samples from a normal distribution with mean=1 and std=2
y_mean = 1
y_std = 2
num_samples = 10000
X_ref = np.random.normal(x_mean, x_std, num_samples)
X_test = np.random.normal(y_mean, y_std, num_samples)
alpha = 0.01 # significance level for the hypothesis test
detector = KSTest()
detector.fit(X=X_ref)
statistic, p_value = detector.compare(X=X_test)
drift = p_value < alpha
print(drift)  # True: drift detected. We can reject H0, so the samples come from different distributions.
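To make the hypothesis test behind this example more concrete, here is a minimal sketch of the same two-sample comparison written directly against SciPy's `ks_2samp`. It is independent of Frouros and is not a description of how `KSTest` is implemented; the helper name `drift_detected` is a hypothetical name introduced only for illustration.

```python
import numpy as np
from scipy.stats import ks_2samp

# Illustrative data only: same parameters as the example above,
# but drawn with NumPy's Generator API, so the values differ.
rng = np.random.default_rng(31)
x_ref = rng.normal(loc=2, scale=2, size=10_000)   # reference sample
x_test = rng.normal(loc=1, scale=2, size=10_000)  # sample with a shifted mean


def drift_detected(reference, test, alpha=0.01):
    """Hypothetical helper: True if the two-sample KS test rejects H0.

    H0 is that both samples are drawn from the same distribution.
    """
    result = ks_2samp(reference, test)
    return bool(result.pvalue < alpha)


print(drift_detected(x_ref, x_test))  # shifted mean: H0 should be rejected
print(drift_detected(x_ref, x_ref))   # identical samples: H0 is not rejected
```

Comparing a sample against itself gives a KS statistic of 0 and a p-value of 1, so it serves as a simple sanity check that the helper does not flag drift where there is none.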
More examples can be found here.
Frouros supports Python 3.8, 3.9 and 3.10. It can be installed via pip:
pip install frouros
The latest changes on the main branch can be installed with:
pip install git+https://github.com/IFCA/frouros.git
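After installing, it can be useful to confirm that the interpreter is one of the supported versions and that the package is visible to it. The following is a small sketch using only the standard library; the helper names `is_supported_python` and `installed_version` are hypothetical and are not part of Frouros.

```python
import sys
from importlib import metadata

# Python versions listed as supported above.
SUPPORTED_PYTHONS = {(3, 8), (3, 9), (3, 10)}


def is_supported_python(version_info=sys.version_info):
    """Hypothetical helper: True if major.minor is in the supported set."""
    return tuple(version_info[:2]) in SUPPORTED_PYTHONS


def installed_version(package="frouros"):
    """Hypothetical helper: installed version string, or None if not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


print("Supported Python:", is_supported_python())
print("Installed frouros version:", installed_version())
```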
The currently supported methods are listed in the following table. They are divided into three main categories, depending on the type of drift they can detect and how they detect it.
Some well-known datasets and synthetic generators are provided and listed in the following table.
Type | Dataset |
---|---|
Real | Elec2 |
Synthetic | SEA |