Drift Detection in CapyMOA

In this tutorial, we show how to conduct drift detection using CapyMOA:

  • Testing different drift detectors

  • Example using ADWIN

  • Evaluating detectors based on known drift location

More information about CapyMOA can be found at https://www.capymoa.org

last update on 25/07/2024

import numpy as np
import pandas as pd

import capymoa.drift.detectors as detectors

Basic example

  • Creating dummy data

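# First concept: 2000 binary values (0 or 1)
data_stream = np.random.randint(2, size=2000)
# Second concept: from index 999 onwards, values are drawn from 6..11
for i in range(999, 2000):
    data_stream[i] = np.random.randint(6, high=12)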
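The stream above is random, so the detection counts can differ slightly between runs. As a minimal sketch, the same kind of stream can be built with a fixed seed. The helper make_stream and its parameters are hypothetical, and it uses Python's standard random module rather than NumPy, so it does not reproduce the exact values of the cell above.

```python
import random
from statistics import mean


def make_stream(seed=42, n=2000, drift_at=999):
    """Binary values before drift_at, integers in 6..11 from drift_at on."""
    rng = random.Random(seed)  # private generator, so the global state is untouched
    stream = []
    for i in range(n):
        if i < drift_at:
            stream.append(rng.randint(0, 1))   # same range as np.random.randint(2)
        else:
            stream.append(rng.randint(6, 11))  # same range as np.random.randint(6, high=12)
    return stream


stream = make_stream()
before, after = stream[:999], stream[999:]
# The mean jumps from roughly 0.5 to roughly 8.5 at the drift point
print(len(stream), round(mean(before), 2), round(mean(after), 2))
```

Calling make_stream twice with the same seed returns the same list, which makes a comparison between detectors repeatable.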

  • Basic drift detection example

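all_detectors = detectors.__all__

n_detections = {k: 0 for k in all_detectors}
for detector_name in all_detectors:
    # Instantiate each detector with its default parameters
    detector = getattr(detectors, detector_name)()

    for i in range(2000):
        # Feed the next value to the detector before querying its state
        detector.add_element(float(data_stream[i]))
        if detector.detected_change():
            n_detections[detector_name] += 1

# Number of detections per detector
print(pd.Series(n_detections))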

ADWIN                       1
CUSUM                       2
DDM                         1
EWMAChart                   1
GeometricMovingAverage      1
HDDMAverage               154
HDDMWeighted               92
PageHinkley                 2
RDDM                        1
SEED                        3
STEPD                       1
dtype: int64
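To make the add-then-query loop concrete, here is a toy detector that exposes the same two methods, add_element and detected_change. It is a simplified Page-Hinkley-style test for an upward shift in the mean, written only for illustration: the class ToyPageHinkley, its parameters, and its default values are assumptions, and it is not the CapyMOA implementation.

```python
class ToyPageHinkley:
    """Hypothetical, simplified Page-Hinkley test for an upward mean shift."""

    def __init__(self, delta=0.005, threshold=50.0, min_instances=30):
        self.delta = delta                # tolerated deviation per instance
        self.threshold = threshold        # alarm level for the cumulative statistic
        self.min_instances = min_instances
        self._reset()

    def _reset(self):
        self._n = 0
        self._mean = 0.0
        self._cum = 0.0       # cumulative deviation from the running mean
        self._min_cum = 0.0   # smallest cumulative deviation seen so far
        self._in_change = False

    def add_element(self, element):
        if self._in_change:
            self._reset()  # start afresh after a detection
        self._n += 1
        self._mean += (element - self._mean) / self._n
        self._cum += element - self._mean - self.delta
        self._min_cum = min(self._min_cum, self._cum)
        if (self._n >= self.min_instances
                and self._cum - self._min_cum > self.threshold):
            self._in_change = True

    def detected_change(self):
        return self._in_change


# Deterministic stand-in for the dummy stream: 0/1 values, then 6..11 from index 999
stream = [i % 2 for i in range(999)] + [6 + i % 6 for i in range(1001)]

detector = ToyPageHinkley()
detections = []
for i, value in enumerate(stream):
    detector.add_element(float(value))   # update first ...
    if detector.detected_change():       # ... then query
        detections.append(i)
print(detections)
```

The order of the two calls matters: a detector that is never given an element has nothing to test, so it never reports a change.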

Example using ADWIN

from capymoa.drift.detectors import ADWIN

detector = ADWIN(delta=0.001)

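for i in range(2000):
    # Feed the next value to the detector before querying its state
    detector.add_element(float(data_stream[i]))
    if detector.detected_change():
        print('Change detected in data: ' + str(data_stream[i]) + ' - at index: ' + str(i))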

Change detected in data: 10 - at index: 1023

# Detection indices
detector.detection_index

# Warning indices
detector.warning_index

# Instance counter
detector.idx

Evaluating drift detectors

Assuming the drift locations are known, you can evaluate detectors using the EvaluateDetector class.

This class takes a parameter called max_delay, which is the maximum number of instances after a drift within which a detection still counts as having detected that change. After max_delay instances, we assume that the change is obvious and has been missed by the detector.
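The role of max_delay can be sketched in a few lines of plain Python. The function evaluate_detections below is hypothetical and is not how EvaluateDetector is implemented. It assumes that each true drift is matched to the first detection that falls within max_delay instances after it, boundaries included, and it only computes two of the reported measures. The drift and detection positions are illustrative values.

```python
def evaluate_detections(trues, preds, max_delay):
    """Match each true drift to the first detection within max_delay instances."""
    preds = sorted(preds)
    delays = []
    missed = 0
    for drift in trues:
        # Detections that arrive in the window [drift, drift + max_delay]
        hits = [p for p in preds if drift <= p <= drift + max_delay]
        if hits:
            delays.append(hits[0] - drift)
        else:
            missed += 1  # no detection in time: the drift counts as missed
    mean_delay = sum(delays) / len(delays) if delays else float("nan")
    missed_ratio = missed / len(trues) if trues else float("nan")
    return {
        "mean_time_to_detect": mean_delay,
        "missed_detection_ratio": missed_ratio,
    }


# A detection 24 instances after the drift is within max_delay
print(evaluate_detections(trues=[1000], preds=[1024], max_delay=200))
# {'mean_time_to_detect': 24.0, 'missed_detection_ratio': 0.0}

# A detection 300 instances after the drift is too late and counts as missed
print(evaluate_detections(trues=[1000], preds=[1300], max_delay=200))
# {'mean_time_to_detect': nan, 'missed_detection_ratio': 1.0}
```

A larger max_delay is more forgiving towards slow detectors, while a smaller one turns late detections into misses.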

from capymoa.drift.eval_detector import EvaluateDetector
eval = EvaluateDetector(max_delay=200)

To evaluate a detector, the calc_performance method of EvaluateDetector takes two arguments:

  • The locations of the drift

  • The locations of the detections

trues = np.array([1000])
preds = detector.detection_index

eval.calc_performance(preds, trues)
mean_time_to_detect           24.0
missed_detection_ratio         0.0
mean_time_btw_false_alarms     NaN
no_alarms_per_episode          0.0
dtype: float64