Drift Detection in CapyMOA

In this tutorial, we show how to conduct drift detection using CapyMOA.

  • Usage example of several detectors.

  • Example using ADWIN.

  • Evaluating detectors based on known drift locations.

  • Multivariate drift detection using ABCD.


More information about CapyMOA can be found at https://www.capymoa.org.

Last updated on 28/11/2025

[1]:
import numpy as np
import pandas as pd

import capymoa.drift.detectors as detectors

1. Basic example

  • Creating dummy data

[2]:
data_stream = np.random.randint(2, size=2000)
for i in range(999, 2000):
    data_stream[i] = np.random.randint(6, high=12)
  • Basic drift detection example

[3]:
all_detectors = detectors.__all__

n_detections = {k: 0 for k in all_detectors}
for detector_name in all_detectors:
    if detector_name == "STUDD":
        continue

    detector = getattr(detectors, detector_name)()

    for i in range(2000):
        detector.add_element(float(data_stream[i]))
        if detector.detected_change():
            n_detections[detector_name] += 1

print(pd.Series(n_detections))
ABCD                        1
ADWIN                       1
CUSUM                       2
DDM                         1
EWMAChart                   1
GeometricMovingAverage      1
HDDMAverage               135
HDDMWeighted               88
OPTWIN                      1
PageHinkley                 2
RDDM                        1
SEED                        2
STEPD                       1
STUDD                       0
dtype: int64
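
The loop above treats every detector as a black box with two methods, add_element and detected_change. To illustrate what could sit behind that interface, here is a minimal pure-Python sketch of a Page-Hinkley-style test for an upward shift in the mean. It is a simplified illustration, not CapyMOA's implementation: the class name ToyPageHinkley, its default parameters, and the deterministic stand-in stream are all assumptions made for this example.

```python
class ToyPageHinkley:
    """Simplified Page-Hinkley test for an upward mean shift (illustrative only)."""

    def __init__(self, delta=0.005, threshold=50.0):
        self.delta = delta          # tolerated magnitude of change (assumed default)
        self.threshold = threshold  # alarm threshold (assumed default)
        self._detected = False
        self._reset()

    def _reset(self):
        self.n = 0
        self.mean = 0.0
        self.cum_sum = 0.0
        self.min_sum = 0.0

    def add_element(self, x):
        # Update the running mean and the cumulative deviation from it.
        self.n += 1
        self.mean += (x - self.mean) / self.n
        self.cum_sum += x - self.mean - self.delta
        self.min_sum = min(self.min_sum, self.cum_sum)
        # Alarm when the cumulative deviation rises far above its minimum.
        self._detected = (self.cum_sum - self.min_sum) > self.threshold
        if self._detected:
            self._reset()  # start fresh after a detection

    def detected_change(self):
        return self._detected


# Deterministic stand-in for the dummy data: values in {0, 1}, then a jump to 8.
toy_stream = [0.0, 1.0] * 500 + [8.0] * 1000

toy_detector = ToyPageHinkley()
toy_detections = []
for i, x in enumerate(toy_stream):
    toy_detector.add_element(x)
    if toy_detector.detected_change():
        toy_detections.append(i)

print(toy_detections)
```

Because the toy class exposes the same two methods, it can be driven by the same kind of loop as the CapyMOA detectors above.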

2. Example using ADWIN

[4]:
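# Note: the ADWIN construction below is commented out, so `detector` still refers
# to the last detector instantiated in the previous cell, which has already
# processed all 2000 elements. Uncomment the line to run a fresh ADWIN instance
# instead (the outputs shown below would then differ).
# detector = detectors.ADWIN(delta=0.001)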

for i in range(2000):
    detector.add_element(data_stream[i])
    if detector.detected_change():
        print(
            "Change detected in data: " + str(data_stream[i]) + " - at index: " + str(i)
        )
[5]:
# Detection indices
detector.detection_index
[5]:
[1000]
[6]:
# Warning indices
detector.warning_index
[6]:
[92,
 521,
 522,
 523,
 524,
 525,
 526,
 527,
 528,
 533,
 534,
 535,
 536,
 537,
 538,
 539,
 540,
 541,
 542,
 543,
 544,
 638,
 639,
 640,
 641,
 644,
 726,
 729,
 730,
 731]
[7]:
# Instance counter
detector.idx
[7]:
4000

3. Evaluating drift detectors

If the drift locations are known, you can evaluate detectors with the EvaluateDriftDetector class.

This class takes a parameter called max_delay: the maximum number of instances after a drift within which a detection still counts as having detected that drift. After max_delay instances, we assume that the change is obvious and that the detector has missed it.

[8]:
from capymoa.drift.eval_detector import EvaluateDriftDetector

drift_eval = EvaluateDriftDetector(max_delay=200)

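The calc_performance method of EvaluateDriftDetector takes two main arguments for evaluating detectors, along with the total number of instances processed (tot_n_instances):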

  • The locations of the drift

  • The locations of the detections

[9]:
trues = np.array([1000])
preds = detector.detection_index

drift_eval.calc_performance(trues, preds, tot_n_instances=detector.idx)
[9]:
DriftDetectionMetrics(fp=0, tp=1, fn=0, precision=1.0, recall=1.0, episode_recall=1.0, f1=1.0, mdt=np.float64(0.0), far=0.0, ar=0.25, n_episodes=1, n_alarms=1)
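
To make the role of max_delay concrete, the following pure-Python sketch matches detections to known drift locations. It is a simplified illustration under stated assumptions, not the logic used by EvaluateDriftDetector: the function name match_detections and the greedy one-detection-per-drift rule are hypothetical, and CapyMOA may count multiple alarms within one episode differently.

```python
def match_detections(true_drifts, detections, max_delay):
    """Credit each drift with the first unused detection in [drift, drift + max_delay]."""
    detections = sorted(detections)
    used = set()
    delays = []
    for drift in sorted(true_drifts):
        for j, det in enumerate(detections):
            if j not in used and drift <= det <= drift + max_delay:
                used.add(j)
                delays.append(det - drift)
                break
    tp = len(delays)              # drifts detected in time
    fp = len(detections) - tp     # alarms not matched to any drift
    fn = len(true_drifts) - tp    # drifts missed within max_delay
    return {
        "tp": tp,
        "fp": fp,
        "fn": fn,
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
        "mean_delay": sum(delays) / tp if tp else None,
    }


# A detection exactly at the drift is a true positive with zero delay.
on_time = match_detections([1000], [1000], max_delay=200)
# A detection 300 instances late falls outside max_delay: one miss, one false alarm.
too_late = match_detections([1000], [1300], max_delay=200)
print(on_time)
print(too_late)
```

Under this sketch, a larger max_delay is more forgiving towards slow detectors, while a smaller one turns late detections into both a false negative and a false positive.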

4. Multivariate drift detection

[10]:
from capymoa.drift.detectors import ABCD
from capymoa.datasets import ElectricityTiny

detector = ABCD()

# Open a dataset as a stream
stream = ElectricityTiny()
[11]:
i = 0
loss_values = []
while stream.has_more_instances() and i < 5000:
    i += 1
    instance = stream.next_instance()
    detector.add_element(instance)
    loss_values.append(detector.loss())
    if detector.detected_change():
        print("Change detected at index: " + str(i))
Change detected at index: 2776
[12]:
detector = ABCD(model_id="pca")

# Build a synthetic stream: the first feature changes distribution at index 3000,
# the second feature does not change
stream_change = np.hstack(
    [np.random.uniform(0, 0.5, 3000), np.random.uniform(0.5, 1.0, 3000)]
)
stream_nochange = np.random.uniform(0, 1.0, len(stream_change))
stream = np.vstack([stream_change, stream_nochange]).T
print(f"A {stream.shape[-1]}-dimensional stream")
A 2-dimensional stream
[13]:
i = 0
loss_values = []
while i < len(stream):
    instance = stream[i]
    i += 1
    detector.add_element(instance)
    loss_values.append(detector.loss())
    if detector.detected_change():
        print("Change detected at index: " + str(i))
Change detected at index: 3060
[14]:
import matplotlib.pyplot as plt

plt.plot(pd.Series(loss_values).rolling(10).mean())
plt.title("ABCD with PCA")
plt.xlabel("# Instances")
plt.ylabel("Reconstruction loss")
[14]:
Text(0, 0.5, 'Reconstruction loss')
[Figure: "ABCD with PCA", rolling mean (window of 10) of the reconstruction loss against the number of instances]

The plot shows that assuming a maximum reconstruction error of 1 is very conservative. Decreasing the maximum_absolute_value parameter makes the applied statistical test more sensitive, which in turn makes change detection faster.

[15]:
detector = ABCD(model_id="pca", maximum_absolute_value=0.3)

i = 0
loss_values = []
while i < len(stream):
    instance = stream[i]
    i += 1
    detector.add_element(instance)
    loss_values.append(detector.loss())
    if detector.detected_change():
        print("Change detected at index: " + str(i))
Change detected at index: 3023
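
One way to see why a smaller maximum_absolute_value leads to earlier detection is to look at a Bernstein-type concentration bound, which ABCD's name (Adaptive Bernstein Change Detector) refers to. The sketch below evaluates the generic one-sample Bernstein inequality for two different bounds on the absolute value. It is not the exact statistic ABCD computes; the function name bernstein_tail_bound and the numbers used (eps, n, variance) are assumptions chosen purely for illustration.

```python
import math


def bernstein_tail_bound(eps, n, variance, max_abs):
    """Upper bound on P(|sample mean - true mean| >= eps) for n independent
    variables with the given variance whose centered values are bounded
    by max_abs in absolute value (two-sided Bernstein inequality)."""
    exponent = -n * eps**2 / (2.0 * variance + 2.0 * max_abs * eps / 3.0)
    return min(1.0, 2.0 * math.exp(exponent))


# Same observed deviation, sample size, and variance; only the assumed bound differs.
bound_conservative = bernstein_tail_bound(eps=0.05, n=200, variance=0.01, max_abs=1.0)
bound_tighter = bernstein_tail_bound(eps=0.05, n=200, variance=0.01, max_abs=0.3)

print(f"max_abs=1.0: {bound_conservative:.2e}")
print(f"max_abs=0.3: {bound_tighter:.2e}")
```

With the smaller bound, the same deviation is judged far less likely to occur by chance, so a test built on such a bound can raise an alarm after fewer instances. The trade-off is that a bound set below the true maximum loss would make the test overconfident.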