CSMOTE

class capymoa.classifier.CSMOTE

Bases: MOAClassifier

CSMOTE

This strategy stores every minority-class sample in a window managed by ADWIN while a model is trained on the incoming data. When the minority-class ratio falls below the threshold, an online version of SMOTE is applied: a random minority sample is drawn from the window and a new synthetic sample is generated from it, repeating until the minority-class ratio is greater than or equal to the threshold. The model is then trained on the newly generated samples.
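The synthetic-sample step described above can be sketched in plain Python. This is a minimal illustration of the classic SMOTE interpolation, not the MOA implementation: the function name `smote_sample`, the use of Euclidean distance, and the list-of-floats representation of the window are all assumptions made for the example.

```python
import math
import random


def smote_sample(minority_window, neighbors=10, rng=None):
    """Return one synthetic sample built from a window of minority samples.

    Hypothetical helper for illustration; each sample is a list of floats.
    """
    if len(minority_window) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = rng or random.Random(0)

    # Pick a random minority sample from the window.
    base = rng.choice(minority_window)

    # Rank the remaining samples by Euclidean distance to the chosen one
    # (assumed metric) and keep the closest `neighbors` of them.
    others = [x for x in minority_window if x is not base]
    others.sort(key=lambda x: math.dist(base, x))
    neighbor = rng.choice(others[:neighbors])

    # Interpolate at a random point on the segment between the two samples.
    gap = rng.random()
    return [b + gap * (n - b) for b, n in zip(base, neighbor)]


window = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
synthetic = smote_sample(window, neighbors=2, rng=random.Random(42))
```

Because the new point lies on the segment between two existing minority samples, every feature of `synthetic` stays within the range spanned by the window.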

Reference: Alessio Bernardo, Heitor Murilo Gomes, Jacob Montiel, Bernhard Pfahringer, Albert Bifet, Emanuele Della Valle. C-SMOTE: Continuous Synthetic Minority Oversampling for Evolving Data Streams. In BigData, IEEE, 2020.

Example usage:

>>> from capymoa.datasets import ElectricityTiny
>>> from capymoa.classifier import CSMOTE
>>> from capymoa.evaluation import prequential_evaluation
>>> stream = ElectricityTiny()
>>> schema = stream.get_schema()
>>> learner = CSMOTE(schema)
>>> results = prequential_evaluation(stream, learner, max_instances=1000)
>>> results["cumulative"].accuracy()
83.1
__init__(
    schema: Schema = None,
    random_seed: int = 0,
    base_learner='trees.HoeffdingTree',
    neighbors: int = 10,
    threshold: float = 0.5,
    min_size_allowed: int = 100,
    disable_drift_detection: bool = False,
)

Continuous SMOTE (C-SMOTE) classifier for imbalanced evolving data streams.

Parameters:
  • schema – The schema of the stream.

  • random_seed – The random seed passed to the MOA learner.

  • base_learner – The base learner to be trained. Defaults to 'trees.HoeffdingTree'.

  • neighbors – Number of neighbors for SMOTE.

  • threshold – Minority-class ratio threshold. The online SMOTE step is applied while the ratio is below this value.

  • min_size_allowed – Minimum number of samples in the minority class for applying SMOTE.

  • disable_drift_detection – If set, disables the ADWIN drift detector.
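To make the interplay of `threshold` and `min_size_allowed` concrete, the following sketch counts how many synthetic samples a rebalancing pass would add. It rests on two assumptions that the documentation above does not confirm: the ratio is taken as minority / (minority + majority), and SMOTE is skipped entirely while fewer than `min_size_allowed` minority samples are available. The helper `synthetic_needed` is hypothetical and not part of CapyMOA.

```python
def synthetic_needed(n_minority, n_majority, threshold=0.5, min_size_allowed=100):
    """Count the synthetic samples needed to reach the threshold.

    Hypothetical helper; the ratio definition is an assumption.
    """
    if not 0.0 <= threshold < 1.0:
        raise ValueError("threshold must be in [0, 1)")

    # Assumed gate: too few minority samples means no oversampling at all.
    if n_minority == 0 or n_minority < min_size_allowed:
        return 0

    needed = 0
    # Add one synthetic minority sample at a time until the ratio
    # is greater than or equal to the threshold.
    while (n_minority + needed) / (n_minority + needed + n_majority) < threshold:
        needed += 1
    return needed


balanced = synthetic_needed(300, 100)   # ratio already above the threshold
too_few = synthetic_needed(50, 300)     # below min_size_allowed
skewed = synthetic_needed(100, 300)     # needs oversampling
```

Under these assumptions, a stream with 100 minority and 300 majority samples needs 200 synthetic samples to reach a ratio of 0.5, while the other two calls add none.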

CLI_help()
predict(instance)
predict_proba(instance)
train(instance)