StreamingGradientBoostedTrees#
- class capymoa.classifier.StreamingGradientBoostedTrees[source]#
Bases: MOAClassifier
Streaming Gradient Boosted Trees (SGBT) Classifier
Streaming Gradient Boosted Trees (SGBT) is trained using the weighted squared loss elicited in XGBoost. SGBT exploits trees with a replacement strategy to detect and recover from drifts, enabling the ensemble to adapt without sacrificing predictive performance.
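To illustrate the weighted squared loss mentioned above, the following is a minimal, hypothetical NumPy sketch (not part of the CapyMOA API) of the XGBoost-style reduction for binary log-loss: the next base regressor is fit to -gradient/hessian using the hessian as the instance weight. SGBT itself handles the multi-class case and the streaming specifics (e.g. multiply_hessian_by, skip_training).

import numpy as np

def weighted_squared_loss_target(y, raw_score):
    # Hypothetical illustration: second-order (XGBoost-style) reduction of
    # binary log-loss to a weighted squared-loss regression problem.
    p = 1.0 / (1.0 + np.exp(-raw_score))  # current probability estimate
    g = p - y                             # gradient of the log-loss
    h = p * (1.0 - p)                     # hessian of the log-loss
    target = -g / np.maximum(h, 1e-12)    # regression target for the next tree
    weight = h                            # instance weight for the squared loss
    return target, weight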
Reference:
Gunasekara, N., Pfahringer, B., Gomes, H. M., & Bifet, A. (2024). Gradient boosted trees for evolving data streams. Machine Learning.
Example usages:
>>> from capymoa.datasets import ElectricityTiny
>>> from capymoa.classifier import StreamingGradientBoostedTrees
>>> from capymoa.evaluation import prequential_evaluation
>>> stream = ElectricityTiny()
>>> schema = stream.get_schema()
>>> learner = StreamingGradientBoostedTrees(schema)
>>> results = prequential_evaluation(stream, learner, max_instances=1000)
>>> results["cumulative"].accuracy()
86.3

>>> stream = ElectricityTiny()
>>> schema = stream.get_schema()
>>> learner = StreamingGradientBoostedTrees(schema, base_learner='meta.AdaptiveRandomForestRegressor -s 10', boosting_iterations=10)
>>> results = prequential_evaluation(stream, learner, max_instances=1000)
>>> results["cumulative"].accuracy()
86.8
- __init__(
- schema: Schema | None = None,
- random_seed: int = 0,
- base_learner='trees.FIMTDD -s VarianceReductionSplitCriterion -g 25 -c 0.05 -e -p',
- boosting_iterations: int = 100,
- percentage_of_features: int = 75,
- learning_rate=0.0125,
- disable_one_hot: bool = False,
- multiply_hessian_by: int = 1,
- skip_training: int = 1,
- use_squared_loss: bool = False,
)
Streaming Gradient Boosted Trees (SGBT) Classifier
- Parameters:
schema – The schema of the stream.
random_seed – The random seed passed to the MOA learner.
base_learner – The base learner to be trained. Default: trees.FIMTDD -s VarianceReductionSplitCriterion -g 25 -c 0.05 -e -p.
boosting_iterations – The number of boosting iterations.
percentage_of_features – The percentage of features to use.
learning_rate – The learning rate.
disable_one_hot – Whether to disable one-hot encoding for regressors that support nominal attributes.
multiply_hessian_by – Multiply the hessian by this parameter to generate weights for multiple iterations.
skip_training – Skip training of 1/skip_training instances. skip_training=1 means no skipping is performed (train on all instances).
use_squared_loss – Whether to use squared loss for classification.
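For illustration, a configuration sketch in the same style as the example above; the parameter values are arbitrary choices for demonstration, not recommended settings:

>>> from capymoa.datasets import ElectricityTiny
>>> from capymoa.classifier import StreamingGradientBoostedTrees
>>> from capymoa.evaluation import prequential_evaluation
>>> stream = ElectricityTiny()
>>> learner = StreamingGradientBoostedTrees(
...     stream.get_schema(),
...     boosting_iterations=50,
...     percentage_of_features=60,
...     learning_rate=0.05,
...     use_squared_loss=True,
... )
>>> results = prequential_evaluation(stream, learner, max_instances=1000)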