OzaBoost

class capymoa.classifier.OzaBoost

Bases: MOAClassifier

Incremental on-line boosting classifier of Oza and Russell.

Oza and Russell observe that AdaBoost's weighting procedure effectively splits the total example weight into two halves: half is assigned to the correctly classified examples and the other half to the misclassified examples. As in their online bagging method, the number of times each example is used for training is drawn from a Poisson distribution, but here the Poisson parameter changes according to the example's boosting weight as it is passed through each model in sequence: it increases when a model misclassifies the example and decreases when a model classifies it correctly.
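The weight update described above can be sketched in plain Python. This is an illustrative sketch of the procedure from the Oza and Russell paper, not the MOA implementation that this class wraps. The names `OnlineBoostSketch`, `NearestCentroid`, and `poisson` are hypothetical, the nearest-centroid base learner is a stand-in for a Hoeffding tree, and skipping members whose weighted error reaches 0.5 is a choice made for this sketch.

```python
import math
import random
from collections import defaultdict


def poisson(lam, rng):
    """Draw from Poisson(lam) with Knuth's method (adequate for small lam)."""
    threshold, k, p = math.exp(-lam), 0, 1.0
    while True:
        p *= rng.random()
        if p <= threshold:
            return k
        k += 1


class NearestCentroid:
    """Hypothetical stand-in base learner: incremental per-class feature means."""

    def __init__(self):
        self.sums = {}
        self.counts = defaultdict(int)

    def train(self, x, y):
        previous = self.sums.get(y, [0.0] * len(x))
        self.sums[y] = [s + v for s, v in zip(previous, x)]
        self.counts[y] += 1

    def predict(self, x):
        if not self.sums:
            return None  # nothing learned yet

        def squared_distance(label):
            n = self.counts[label]
            return sum((v - s / n) ** 2 for v, s in zip(x, self.sums[label]))

        return min(self.sums, key=squared_distance)


class OnlineBoostSketch:
    """Sketch of Oza and Russell's online boosting (assumption-laden, see lead-in)."""

    def __init__(self, n_models=10, seed=0):
        self.models = [NearestCentroid() for _ in range(n_models)]
        self.correct = [0.0] * n_models  # weight each model classified correctly
        self.wrong = [0.0] * n_models    # weight each model misclassified
        self.rng = random.Random(seed)

    def train(self, x, y):
        lam = 1.0  # every example starts with unit weight
        for m, model in enumerate(self.models):
            # Present the example k ~ Poisson(lam) times to this model.
            for _ in range(poisson(lam, self.rng)):
                model.train(x, y)
            if model.predict(x) == y:
                self.correct[m] += lam
                total = self.correct[m] + self.wrong[m]
                lam *= total / (2.0 * self.correct[m])  # shrink the weight
            else:
                self.wrong[m] += lam
                total = self.correct[m] + self.wrong[m]
                lam *= total / (2.0 * self.wrong[m])    # grow the weight

    def predict(self, x):
        votes = defaultdict(float)
        for m, model in enumerate(self.models):
            total = self.correct[m] + self.wrong[m]
            label = model.predict(x)
            if total == 0.0 or label is None:
                continue
            err = self.wrong[m] / total
            if err >= 0.5:
                continue  # sketch choice: ignore members no better than chance
            err = max(err, 1e-9)  # avoid an infinite vote for a perfect member
            votes[label] += math.log((1.0 - err) / err)
        return max(votes, key=votes.get) if votes else None
```

The two update factors are what split the weight into halves: the accumulated weight of correctly classified examples and that of misclassified examples are each rescaled to half of the total before the example moves on to the next model.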

Reference:

Online bagging and boosting. Nikunj Oza, Stuart Russell. Artificial Intelligence and Statistics 2001.

Example usage:

>>> from capymoa.datasets import ElectricityTiny
>>> from capymoa.classifier import OzaBoost
>>> from capymoa.evaluation import prequential_evaluation
>>> stream = ElectricityTiny()
>>> schema = stream.get_schema()
>>> learner = OzaBoost(schema)
>>> results = prequential_evaluation(stream, learner, max_instances=1000)
>>> results["cumulative"].accuracy()
88.8
__init__(
schema: Schema | None = None,
random_seed: int = 0,
base_learner='trees.HoeffdingTree',
boosting_iterations: int = 10,
use_pure_boost: bool = False,
)

Incremental on-line boosting classifier of Oza and Russell.

Parameters:
  • schema – The schema of the stream.

  • random_seed – The random seed passed to the MOA learner.

  • base_learner – The base learner to be trained. Default trees.HoeffdingTree.

  • boosting_iterations – The number of boosting iterations.

  • use_pure_boost – Boost using example weights only, without Poisson sampling.

CLI_help()

predict(instance)

Predict the label of an instance.

The base implementation calls predict_proba() and returns the label with the highest probability.

Parameters:

instance – The instance to predict the label for.

Returns:

The predicted label or None if the classifier is unable to make a prediction.
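The relationship between predict() and predict_proba() described above can be illustrated with a small standalone function. This is a sketch of the idea only, not CapyMOA's code; the helper name `predict_from_proba` is hypothetical, and it assumes labels are represented by their index in the probability array.

```python
from typing import Optional, Sequence


def predict_from_proba(proba: Optional[Sequence[float]]) -> Optional[int]:
    """Return the index of the most probable label, or None if unavailable."""
    if proba is None or len(proba) == 0:
        return None  # the classifier could not make a prediction
    # On ties, max() keeps the first (lowest) index.
    return max(range(len(proba)), key=lambda i: proba[i])


label = predict_from_proba([0.1, 0.7, 0.2])  # picks index 1
```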

predict_proba(instance)

Return probability estimates for each label.

Parameters:

instance – The instance to estimate the probabilities for.

Returns:

An array of probabilities for each label or None if the classifier is unable to make a prediction.

train(instance)

Train the classifier with a labeled instance.

Parameters:

instance – The labeled instance to train the classifier with.

random_seed: int

The random seed for reproducibility.

When implementing a classifier, ensure that its random number generators are seeded.

schema: Schema

The schema representing the instances.