AdaptiveRandomForestRegressor

class capymoa.regressor.AdaptiveRandomForestRegressor

Bases: MOARegressor

Adaptive Random Forest Regressor

This class implements the Adaptive Random Forest (ARF) algorithm, which is an ensemble regressor capable of adapting to concept drift.

ARF is implemented in MOA (Massive Online Analysis) and provides several parameters for customization.
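As described in the ARF literature, drift adaptation is handled per tree: a warning signal starts a background tree, and a drift signal swaps that background tree in for the active one. The following is a rough sketch of that bookkeeping only. The names `Member`, `make_tree`, `on_warning`, and `on_drift` are hypothetical and are not part of CapyMOA or MOA, and the "trees" are plain dicts so the swap is visible.

```python
class Member:
    """One ensemble slot: an active tree plus an optional background tree.

    Hypothetical illustration; not the CapyMOA or MOA implementation.
    """

    def __init__(self, make_tree, use_background=True):
        self.make_tree = make_tree  # factory returning a fresh tree
        self.use_background = use_background
        self.tree = make_tree()
        self.background = None

    def on_warning(self):
        # A warning starts a fresh background tree that would be trained
        # alongside the active tree without contributing to predictions.
        if self.use_background:
            self.background = self.make_tree()

    def on_drift(self):
        # A drift promotes the background tree if one exists;
        # otherwise the slot is reset with a fresh tree.
        if self.background is not None:
            self.tree = self.background
        else:
            self.tree = self.make_tree()
        self.background = None


# Toy usage: each "tree" is a dict carrying a running id.
counter = iter(range(100))
member = Member(lambda: {"id": next(counter)})
member.on_warning()  # background tree created
member.on_drift()    # background tree becomes the active tree
print(member.tree["id"])
```

With `use_background=False` the sketch mirrors what disabling the background learner would plausibly mean: a drift simply resets the tree.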

See also capymoa.classifier.AdaptiveRandomForestClassifier. See capymoa.base.MOARegressor for train and predict.

Reference:

Adaptive random forests for data stream regression. H. M. Gomes, J. P. Barddal, L. E. B. Ferreira, A. Bifet. ESANN, pp. 267-272, 2018.

Example usage:

>>> from capymoa.datasets import Fried
>>> from capymoa.regressor import AdaptiveRandomForestRegressor
>>> from capymoa.evaluation import prequential_evaluation
>>> stream = Fried()
>>> schema = stream.get_schema()
>>> learner = AdaptiveRandomForestRegressor(schema)
>>> results = prequential_evaluation(stream, learner, max_instances=1000)
>>> results["cumulative"].rmse()
3.659072011685404

__init__(
schema=None,
CLI=None,
random_seed=1,
tree_learner=None,
ensemble_size=100,
max_features=0.6,
lambda_param=6.0,
drift_detection_method=None,
warning_detection_method=None,
disable_drift_detection=False,
disable_background_learner=False,
)

Construct an Adaptive Random Forest Regressor

Parameters:
  • schema – The schema of the stream. If not provided, it will be inferred from the data.

  • CLI – Command Line Interface (CLI) options for configuring the ARF algorithm. If not provided, default options will be used.

  • random_seed – Seed for the random number generator.

  • tree_learner – The tree learner to use. If not provided, a default Hoeffding Tree is used.

  • ensemble_size – The number of trees in the ensemble.

  • max_features – The maximum number of features to consider when splitting a node. A float between 0.0 and 1.0 is interpreted as the fraction of features to consider. An integer specifies the exact number of features to consider. The string “sqrt” selects the square root of the total number of features. The default is 0.6, i.e. 60% of the features.

  • lambda_param – The lambda parameter that controls the Poisson distribution for the online bagging simulation.

  • drift_detection_method – The method used for drift detection.

  • warning_detection_method – The method used for warning detection.

  • disable_drift_detection – Whether to disable drift detection.

  • disable_background_learner – Whether to disable background learning.
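To make the max_features and lambda_param descriptions above more concrete, here is a minimal standard-library sketch. Both helpers (`resolve_max_features`, `poisson_weight`) are hypothetical and not part of CapyMOA. The rounding and clamping in the first one are assumptions, and the library may round differently or apply an offset in “sqrt” mode. The second draws the per-instance training weight that online bagging uses, sampled from a Poisson distribution with mean `lam`.

```python
import math
import random


def resolve_max_features(max_features, n_features):
    """Hypothetical helper: map a max_features setting to a feature count.

    Rounding and clamping are assumptions made for illustration only.
    """
    if isinstance(max_features, float):
        if not 0.0 <= max_features <= 1.0:
            raise ValueError("float max_features must lie in [0.0, 1.0]")
        k = round(max_features * n_features)  # fraction of all features
    elif isinstance(max_features, int):
        k = max_features                      # exact number of features
    elif max_features == "sqrt":
        k = round(math.sqrt(n_features))      # square root of the total
    else:
        raise ValueError(f"unsupported max_features: {max_features!r}")
    # Keep the result within 1..n_features.
    return max(1, min(k, n_features))


def poisson_weight(lam, rng):
    """Draw k ~ Poisson(lam) using Knuth's multiplication method."""
    threshold = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= threshold:
            return k
        k += 1


rng = random.Random(1)
print(resolve_max_features(0.6, 10))     # 6
print(resolve_max_features("sqrt", 10))  # 3
# One weight per instance; a weight of 0 means the tree skips that instance.
print([poisson_weight(6.0, rng) for _ in range(5)])
```

With the default lambda_param of 6.0, the drawn weight averages 6, so each tree sees most instances several times over, while different trees still receive different weightings.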

CLI_help()
predict(instance)
train(instance)
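predict and train are the two calls that a test-then-train (prequential) loop alternates between: each instance is first used for prediction and only then for training. The sketch below shows that loop without requiring the library. `RunningMeanRegressor` and `prequential_rmse` are hypothetical stand-ins, and instances are plain `(features, target)` tuples rather than CapyMOA instance objects.

```python
import math


class RunningMeanRegressor:
    """Hypothetical stand-in learner (not part of CapyMOA)."""

    def __init__(self):
        self.n = 0
        self.mean = 0.0

    def predict(self, instance):
        # Predict the mean of all targets seen so far (0.0 before training).
        return self.mean

    def train(self, instance):
        _, y = instance
        self.n += 1
        self.mean += (y - self.mean) / self.n  # incremental mean update


def prequential_rmse(stream, learner):
    """Test-then-train: predict on each instance first, then learn from it."""
    squared_error = 0.0
    count = 0
    for instance in stream:
        _, y = instance
        y_hat = learner.predict(instance)
        squared_error += (y - y_hat) ** 2
        learner.train(instance)
        count += 1
    return math.sqrt(squared_error / count) if count else float("nan")


# Toy stream: the target is twice the single feature value.
toy_stream = [((x,), 2.0 * x) for x in range(5)]
print(prequential_rmse(toy_stream, RunningMeanRegressor()))
```

Because every prediction is made before the learner has seen that instance's target, the cumulative error reflects performance on unseen data at each step.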