SOKNL#

class capymoa.regressor.SOKNL[source]#

Bases: MOARegressor

Self-Optimising K-Nearest Leaves (SOKNL) Implementation.

SOKNL extends the AdaptiveRandomForestRegressor by limiting the number of base trees involved in predicting a given instance. This overrides the aggregation strategy used for voting and generally leads to more accurate predictions.

Specifically, each leaf in the forest stores the running sum of each feature over the instances it has seen and builds a “centroid” from these sums on request. The centroids are then used to compute the Euclidean distance between an incoming instance and each leaf. The final prediction aggregates the outputs of the k trees whose leaves are closest to the instance. The performance of every candidate value of k is assessed over time, and each prediction uses the best k found so far.
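The mechanism can be illustrated with a minimal sketch. This is not the MOA implementation; LeafStats, soknl_style_prediction, and the error bookkeeping are hypothetical names chosen for illustration:

    import numpy as np

    class LeafStats:
        """Accumulates feature sums so a centroid can be built on request."""

        def __init__(self, num_features):
            self.feature_sums = np.zeros(num_features)
            self.count = 0

        def update(self, x):
            # Add the instance's feature values to the running sums.
            self.feature_sums += x
            self.count += 1

        def centroid(self):
            return self.feature_sums / max(self.count, 1)

        def distance_to(self, x):
            # Euclidean distance between the instance and the leaf centroid.
            return float(np.linalg.norm(x - self.centroid()))

    def soknl_style_prediction(x, leaves, leaf_predictions, k_errors):
        """Aggregate the k leaf predictions closest to x.

        k_errors[k-1] holds the error accumulated so far when predicting
        with k trees; the smallest entry determines the k used next.
        """
        order = np.argsort([leaf.distance_to(x) for leaf in leaves])
        best_k = int(np.argmin(k_errors)) + 1
        return float(np.mean([leaf_predictions[i] for i in order[:best_k]]))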

See also capymoa.regressor.AdaptiveRandomForestRegressor. See capymoa.base.MOARegressor for train and predict.

Reference:

Sun, Yibin, Bernhard Pfahringer, Heitor Murilo Gomes, and Albert Bifet. “SOKNL: A novel way of integrating K-nearest neighbours with adaptive random forest regression for data streams.” Data Mining and Knowledge Discovery 36, no. 5 (2022): 2006-2032.

Example usage:

>>> from capymoa.datasets import Fried
>>> from capymoa.regressor import SOKNL
>>> from capymoa.evaluation import prequential_evaluation
>>> stream = Fried()
>>> schema = stream.get_schema()
>>> learner = SOKNL(schema)
>>> results = prequential_evaluation(stream, learner, max_instances=1000)
>>> results["cumulative"].rmse()
3.3738337530234306
__init__(
schema=None,
CLI=None,
random_seed=1,
tree_learner=None,
ensemble_size=100,
max_features=0.6,
lambda_param=6.0,
drift_detection_method=None,
warning_detection_method=None,
disable_drift_detection=False,
disable_background_learner=False,
disable_self_optimising=False,
k_value=10,
)[source]#
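The constructor parameters mirror the options listed in the signature above. For example, a usage sketch (parameter names taken from the signature; the values are illustrative) that runs a smaller ensemble with a fixed k by disabling self-optimisation:

>>> learner = SOKNL(schema, ensemble_size=30, k_value=5, disable_self_optimising=True)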
CLI_help()[source]#
predict(instance)[source]#
train(instance)[source]#
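predict and train can also be driven manually in a test-then-train loop. A minimal sketch, assuming the standard capymoa Stream interface (restart, has_more_instances, next_instance):

>>> stream.restart()
>>> while stream.has_more_instances():
...     instance = stream.next_instance()
...     prediction = learner.predict(instance)  # test first
...     learner.train(instance)                 # then train on the same instance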