
class capymoa.classifier.StreamingGradientBoostedTrees[source]#

Bases: MOAClassifier

Streaming Gradient Boosted Trees (SGBT) Classifier

Streaming Gradient Boosted Trees (SGBT) is trained using the weighted squared loss elicited in XGBoost. SGBT uses trees with a replacement strategy to detect and recover from concept drifts, which lets the ensemble adapt without sacrificing predictive performance.


Gradient boosted trees for evolving data streams. Nuwan Gunasekara, Bernhard Pfahringer, Heitor Murilo Gomes, Albert Bifet. Machine Learning, Springer, 2024.
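
As a rough illustration of the weighted squared loss mentioned in the description, the sketch below derives the regression target and instance weight that a base regressor could receive for one binary-labelled instance. It assumes a logistic loss and the second-order approximation used in XGBoost. The function names are hypothetical, and this is not the MOA implementation.

```python
import math


def sigmoid(z: float) -> float:
    """Map a raw boosting score to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


def boosting_target_and_weight(raw_score: float, y: int) -> tuple[float, float]:
    """Return (target, weight) for the next base regressor.

    Assumption: binary logistic loss. The gradient and hessian are taken
    with respect to the raw score of the ensemble built so far.
    """
    p = sigmoid(raw_score)
    gradient = p - y                      # first derivative of the log loss
    hessian = max(p * (1.0 - p), 1e-16)   # second derivative, kept away from 0
    target = -gradient / hessian          # value the regressor should predict
    return target, hessian                # the hessian acts as instance weight


# One positive instance seen by an ensemble whose raw score is still 0.
target, weight = boosting_target_and_weight(raw_score=0.0, y=1)
print(target, weight)  # 2.0 0.25
```

Minimising 0.5 * weight * (f(x) - target)^2 is equivalent, up to a constant, to minimising the second-order approximation gradient * f(x) + 0.5 * hessian * f(x)^2, which is why a plain streaming regressor that accepts instance weights can serve as the base learner.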

Example usage:

>>> from capymoa.datasets import ElectricityTiny
>>> from capymoa.classifier import StreamingGradientBoostedTrees
>>> from capymoa.evaluation import prequential_evaluation
>>> stream = ElectricityTiny()
>>> schema = stream.get_schema()
>>> learner = StreamingGradientBoostedTrees(schema)
>>> results = prequential_evaluation(stream, learner, max_instances=1000)
>>> results["cumulative"].accuracy()
>>> stream = ElectricityTiny()
>>> schema = stream.get_schema()
>>> learner = StreamingGradientBoostedTrees(schema, base_learner='meta.AdaptiveRandomForestRegressor -s 10', boosting_iterations=10)
>>> results = prequential_evaluation(stream, learner, max_instances=1000)
>>> results["cumulative"].accuracy()

__init__(
    schema: Schema | None = None,
    random_seed: int = 0,
    base_learner='trees.FIMTDD -s VarianceReductionSplitCriterion -g 25 -c 0.05 -e -p',
    boosting_iterations: int = 100,
    percentage_of_features: int = 75,
    disable_one_hot: bool = False,
    multiply_hessian_by: int = 1,
    skip_training: int = 1,
    use_squared_loss: bool = False,
)


  • schema – The schema of the stream.

  • random_seed – The random seed passed to the MOA learner.

  • base_learner – The base learner to be trained, given as a MOA command-line string. Default: trees.FIMTDD -s VarianceReductionSplitCriterion -g 25 -c 0.05 -e -p.

  • boosting_iterations – The number of boosting iterations.

  • percentage_of_features – The percentage of features to use.

  • learning_rate – The learning rate.

  • disable_one_hot – Whether to disable one-hot encoding for regressors that support nominal attributes.

  • multiply_hessian_by – Multiply the hessian by this value to generate weights for multiple iterations.

  • skip_training – Skip training on one out of every skip_training instances. skip_training=1 means no skipping is performed (train on all instances).

  • use_squared_loss – Whether to use squared loss for classification.
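
To make two of the parameters above more concrete, here is a minimal sketch of one way percentage_of_features and skip_training could be interpreted. The rounding rule, the random subset selection, and the counter-based skipping are assumptions made for illustration, and the helper names are hypothetical. The MOA learner may implement these details differently.

```python
import random


def features_per_booster(num_features: int, percentage_of_features: int) -> int:
    """Number of features one booster sees (assumed rounding, at least one)."""
    return max(1, round(num_features * percentage_of_features / 100))


def sample_feature_subset(
    num_features: int, percentage_of_features: int, rng: random.Random
) -> list[int]:
    """Pick a random subset of feature indices for one boosting iteration."""
    k = features_per_booster(num_features, percentage_of_features)
    return sorted(rng.sample(range(num_features), k))


def should_train(instance_number: int, skip_training: int) -> bool:
    """Return False for one out of every skip_training instances.

    instance_number counts from 1. skip_training=1 trains on everything.
    """
    if skip_training <= 1:
        return True
    return instance_number % skip_training != 0


rng = random.Random(0)
subset = sample_feature_subset(num_features=8, percentage_of_features=75, rng=rng)
print(len(subset))  # 6

# With skip_training=4, instances 4, 8 and 12 out of the first 12 are skipped.
skipped = [i for i in range(1, 13) if not should_train(i, skip_training=4)]
print(skipped)  # [4, 8, 12]
```

Under this reading, lowering percentage_of_features diversifies the boosters at the cost of each one seeing less information, while raising skip_training trades a small amount of training signal for lower run time.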
