OnlineIsolationForest
class capymoa.anomaly.OnlineIsolationForest
Bases: AnomalyDetector
Online Isolation Forest
This class implements the Online Isolation Forest (oIFOR) algorithm, which is an ensemble anomaly detector capable of adapting to concept drift.
Reference:
Example:
>>> from capymoa.datasets import ElectricityTiny
>>> from capymoa.anomaly import OnlineIsolationForest
>>> from capymoa.evaluation import AnomalyDetectionEvaluator
>>> stream = ElectricityTiny()
>>> schema = stream.get_schema()
>>> learner = OnlineIsolationForest(schema=schema)
>>> evaluator = AnomalyDetectionEvaluator(schema)
>>> while stream.has_more_instances():
...     instance = stream.next_instance()
...     proba = learner.score_instance(instance)
...     evaluator.update(instance.y_index, proba)
...     learner.train(instance)
>>> auc = evaluator.auc()
>>> print(f"AUC: {auc:.2f}")
AUC: 0.52
__init__(
    schema: Schema | None = None,
    random_seed: int = 1,
    num_trees: int = 32,
    max_leaf_samples: int = 32,
    growth_criterion: Literal['fixed', 'adaptive'] = 'adaptive',
    subsample: float = 1.0,
    window_size: int = 2048,
    branching_factor: int = 2,
    split: Literal['axisparallel'] = 'axisparallel',
    n_jobs: int = 1,
)
Construct an Online Isolation Forest anomaly detector.
Parameters:
schema – The schema of the stream. If not provided, it will be inferred from the data.
random_seed – Random seed for reproducibility.
num_trees – Number of trees in the ensemble.
window_size – The size of the window for each tree.
branching_factor – Branching factor of each tree.
max_leaf_samples – Maximum number of samples per leaf. When this number is reached, a split is performed.
growth_criterion – When to perform a split. If ‘adaptive’, the max_leaf_samples threshold grows with tree depth; if ‘fixed’, it stays the same at every depth.
subsample – Probability of learning a new sample in each tree.
split – Type of split performed at each node. Currently only ‘axisparallel’ is supported, which is the same type used by the IsolationForest algorithm.
n_jobs – Number of parallel jobs.
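The interaction between max_leaf_samples, growth_criterion, and subsample can be illustrated with a small standalone sketch. It is not CapyMOA's implementation: the helpers split_threshold and trees_that_learn are hypothetical names introduced here, and the rule that the ‘adaptive’ threshold doubles with each depth level is an assumption made for illustration (the parameter description only states that it grows with tree depth).

```python
import random
from typing import List, Literal


def split_threshold(
    max_leaf_samples: int,
    depth: int,
    growth_criterion: Literal["fixed", "adaptive"] = "adaptive",
) -> int:
    """Hypothetical helper: samples a leaf at `depth` may hold before it splits.

    Assumption (not confirmed by the documentation above): under 'adaptive'
    the threshold doubles with every depth level.
    """
    if growth_criterion == "fixed":
        # The same threshold applies at every depth.
        return max_leaf_samples
    if growth_criterion == "adaptive":
        # Deeper leaves need more samples before splitting.
        return max_leaf_samples * 2 ** depth
    raise ValueError(f"unknown growth_criterion: {growth_criterion!r}")


def trees_that_learn(
    num_trees: int, subsample: float, rng: random.Random
) -> List[int]:
    """Hypothetical helper: indices of the trees that learn the current sample.

    Each tree independently learns the sample with probability `subsample`.
    """
    return [i for i in range(num_trees) if rng.random() < subsample]


rng = random.Random(1)
print(split_threshold(32, depth=0))           # threshold at the root
print(split_threshold(32, depth=3))           # threshold three levels down
print(split_threshold(32, depth=3, growth_criterion="fixed"))
print(trees_that_learn(4, subsample=1.0, rng=rng))
```

With subsample=1.0, as in the default configuration, every tree learns every sample; lowering it makes each tree see a random subset of the stream, under the assumption sketched here.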