StreamRHF
- class capymoa.anomaly.StreamRHF
Bases: AnomalyDetector
StreamRHF anomaly detector
StreamRHF: Streaming Random Histogram Forest for Anomaly Detection
StreamRHF is an unsupervised anomaly detection algorithm for real-time data streams. It extends Random Histogram Forests (RHF) to dynamic data streams, combining tree-based partitioning with kurtosis-driven feature selection to detect anomalies in a resource-constrained streaming environment.
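To make "kurtosis-driven feature selection" concrete, here is a minimal sketch of one way a split feature could be drawn so that heavier-tailed features are favoured. It assumes the weighting log(kurtosis + 1) used by the original Random Histogram Forest formulation; the helper names `kurtosis` and `pick_split_feature` are hypothetical, and this is not the capymoa implementation.

```python
import math
import random


def kurtosis(values):
    """Pearson kurtosis: fourth central moment divided by the squared variance."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    # A constant feature has zero variance; give it kurtosis 0 so its weight is 0.
    return m4 / (m2 * m2) if m2 > 0 else 0.0


def pick_split_feature(columns, rng):
    """Pick a feature index with probability proportional to log(kurtosis + 1).

    Assumption: this weighting follows the RHF paper, not capymoa's source.
    """
    weights = [math.log(kurtosis(col) + 1.0) for col in columns]
    if sum(weights) == 0.0:
        # Every feature is constant: fall back to a uniform choice.
        return rng.randrange(len(columns))
    return rng.choices(range(len(columns)), weights=weights, k=1)[0]


rng = random.Random(0)
columns = [
    [1.0, 2.0, 3.0, 4.0, 5.0],   # evenly spread values, lower kurtosis
    [0.0, 0.0, 0.0, 0.0, 10.0],  # one extreme value, higher kurtosis
    [7.0, 7.0, 7.0, 7.0, 7.0],   # constant feature, weight 0
]
counts = [0, 0, 0]
for _ in range(1000):
    counts[pick_split_feature(columns, rng)] += 1
```

In this toy data the feature containing the extreme value is drawn most often and the constant feature is never drawn, which is the intuition behind preferring high-kurtosis features when partitioning.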
Reference:
Example:
>>> from capymoa.datasets import ElectricityTiny
>>> from capymoa.anomaly import StreamRHF
>>> from capymoa.evaluation import AnomalyDetectionEvaluator
>>> stream = ElectricityTiny()
>>> schema = stream.get_schema()
>>> learner = StreamRHF(schema=schema, num_trees=5, max_height=3)
>>> evaluator = AnomalyDetectionEvaluator(schema)
>>> while stream.has_more_instances():
...     instance = stream.next_instance()
...     proba = learner.score_instance(instance)
...     evaluator.update(instance.y_index, proba)
...     learner.train(instance)
>>> auc = evaluator.auc()
>>> print(f"AUC: {auc:.2f}")
AUC: 0.73
- __init__(schema, max_height=5, num_trees=100, window_size=20, random_seed=0)
Initialize the StreamRHF learner.
:param schema: Schema of the data stream.
:param max_height: Maximum height of the trees.
:param num_trees: Number of trees in the forest.
:param window_size: Size of the sliding window.
:param random_seed: Random seed for reproducibility.
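As a generic illustration of what a fixed-size sliding window means for `window_size`, the sketch below keeps only the most recent items of a stream using `collections.deque`. How StreamRHF manages its window internally is not described on this page, so treat this as an assumption-free picture of the concept rather than the library's mechanism; the values are made up.

```python
from collections import deque

window_size = 3                      # hypothetical small value for illustration
window = deque(maxlen=window_size)   # oldest item is dropped once the window is full

for value in [10, 20, 30, 40, 50]:
    window.append(value)

recent = list(window)                # the three most recent values, oldest first
```

A larger window retains more history at the cost of memory.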
- predict(instance)
Classify a single instance as normal or anomalous. This method compares the anomaly score of the instance against a threshold.
:param instance: An instance from the stream.
:return: 0 if the instance is classified as normal, 1 if classified as anomalous.
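The thresholding described above can be sketched as follows. This is a hypothetical helper, not the library's code: the threshold value of 0.5, the strict comparison, and the convention that higher scores are more anomalous are all assumptions, since this page does not state them.

```python
def label_from_score(score, threshold=0.5):
    """Map an anomaly score to a label: 1 (anomalous) or 0 (normal).

    Assumptions: higher scores mean "more anomalous" and 0.5 is the cut-off;
    neither is confirmed by the documentation.
    """
    return 1 if score > threshold else 0


labels = [label_from_score(s) for s in (0.1, 0.5, 0.9)]
```

If a different cut-off is needed, one option is to call `score_instance` directly, as in the example above, and apply a custom threshold to the returned score.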