ShrubsClassifier

class capymoa.classifier.ShrubsClassifier

Bases: _ShrubEnsembles, Classifier

This class implements the ShrubEnsembles algorithm for classification. It is an ensemble classifier that continuously adds decision trees by training new trees over a sliding window, while pruning unnecessary trees using proximal (stochastic) gradient descent. This allows the ensemble to adapt to concept drift.

Reference:

Shrub Ensembles for Online Classification. Sebastian Buschjäger, Sibylle Hess, and Katharina Morik. In Proceedings of the Thirty-Sixth AAAI Conference on Artificial Intelligence (AAAI-22), Jan 2022.

Example usage:

>>> from capymoa.datasets import ElectricityTiny
>>> from capymoa.classifier import ShrubsClassifier
>>> from capymoa.evaluation import prequential_evaluation
>>> stream = ElectricityTiny()
>>> schema = stream.get_schema()
>>> learner = ShrubsClassifier(schema)
>>> results = prequential_evaluation(stream, learner, max_instances=1000)
>>> results["cumulative"].accuracy()
85.5...
__init__(
schema: Schema,
loss: Literal['mse', 'ce', 'h2'] = 'ce',
step_size: float | Literal['adaptive'] = 'adaptive',
ensemble_regularizer: Literal['hard-L0', 'L0', 'L1', 'none'] = 'hard-L0',
l_ensemble_reg: float | int = 32,
l_l2_reg: float = 0,
l_tree_reg: float = 0,
normalize_weights: bool = True,
burnin_steps: int = 5,
update_leaves: bool = False,
batch_size: int = 32,
sk_dt: DecisionTreeClassifier = DecisionTreeClassifier(random_state=1234),
)

Initializes the ShrubEnsemble classifier with the given parameters.

Parameters:
  • loss – The loss function to be used. Supported values are "mse", "ce", and "h2".

  • step_size – The step size (i.e. the learning rate of SGD) for updating the model. Can be a float or "adaptive". With "adaptive", the step size shrinks as the number of estimators grows, i.e. it is set to 1.0 / (n_estimators + 1.0).

  • ensemble_regularizer

    The regularizer for the weights of the ensemble. Supported values are:

    • hard-L0: L0 regularization via the prox-operator.

    • L0: L0 regularization via projection.

    • L1: L1 regularization via projection.

    • none: No regularization.

    Projection can be viewed as a softer regularization that drives the weights of each member towards 0, whereas hard-L0 limits the number of trees in the entire ensemble.

  • l_ensemble_reg

    The regularization strength. Depending on the value of ensemble_regularizer, this parameter has different meanings:

    • hard-L0: this parameter is the total number of trees in the ensemble.

    • L0 or L1: this parameter is the regularization strength. In these cases the number of trees grows over time, and only trees that do not contribute to the ensemble are removed.

    • none: this parameter is ignored.

  • l_l2_reg – The L2 regularization strength of the weights of each tree.

  • l_tree_reg – The regularization parameter for individual trees. Must be greater than or equal to 0. l_tree_reg controls the number of (overly) large trees in the ensemble by penalizing the weight of each tree. Formally, the number of nodes of each tree is used as an additional regularizer.

  • normalize_weights – Whether to normalize the weights of the ensemble so that they sum to 1.

  • burnin_steps – The number of burn-in steps before updating the model, i.e. the number of SGD steps to be taken on each call of train.

  • update_leaves – Whether to update the leaves of the trees as well using SGD.

  • batch_size – The batch size for training each individual tree. Internally, a sliding window is stored. Must be greater than or equal to 1.

  • sk_dt – Base object from which any new decision trees are cloned. Note that if random_state is set to an integer, every new tree is cloned with that same random state.
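To make a few of the parameters above more concrete, here is a minimal pure-Python sketch of the adaptive step size, of a hard-L0 style pruning step, and of weight normalization. It is an illustration under stated assumptions, not the library's implementation: the helper names (adaptive_step_size, hard_l0_prune, normalize) are hypothetical, and keeping the largest weights is only one plausible reading of how hard-L0 limits the ensemble to l_ensemble_reg trees.

```python
from typing import List


def adaptive_step_size(n_estimators: int) -> float:
    """Step size for step_size="adaptive", as given in the parameter description."""
    return 1.0 / (n_estimators + 1.0)


def hard_l0_prune(weights: List[float], max_trees: int) -> List[float]:
    """Hypothetical hard-L0 step: keep the max_trees largest weights, zero the rest.

    Assumption: a tree whose weight becomes 0 is treated as removed.
    """
    if max_trees >= len(weights):
        return list(weights)
    # Indices of the max_trees largest weights (ties keep the earlier tree).
    order = sorted(range(len(weights)), key=lambda i: weights[i], reverse=True)
    keep = set(order[:max_trees])
    return [w if i in keep else 0.0 for i, w in enumerate(weights)]


def normalize(weights: List[float]) -> List[float]:
    """Rescale weights so that they sum to 1 (cf. normalize_weights=True)."""
    total = sum(weights)
    return [w / total for w in weights] if total > 0 else list(weights)


# Four trees, but the ensemble may only keep two of them.
pruned = hard_l0_prune([0.1, 0.5, 0.05, 0.35], max_trees=2)
# pruned == [0.0, 0.5, 0.0, 0.35]
final_weights = normalize(pruned)
```

With the default l_ensemble_reg=32 and ensemble_regularizer="hard-L0", the analogous cap would be 32 trees. The L0 and L1 options instead treat l_ensemble_reg as a strength, so no fixed cap applies.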

predict_proba(instance)
predict(instance)
train(instance)