FIMTDD

class capymoa.regressor.FIMTDD

Bases: MOARegressor

Implementation of the FIMT-DD tree as described by Ikonomovska et al.

Fast Incremental Model Tree with Drift Detection (FIMT-DD) is the regression counterpart of the Hoeffding Tree for data stream learning.

FIMT-DD is implemented in MOA (Massive Online Analysis) and provides several parameters for customization.

Reference:

Ikonomovska, Elena, João Gama, and Sašo Džeroski. Learning model trees from evolving data streams. Data Mining and Knowledge Discovery 23.1 (2011): 128-168.

Example usage:

>>> from capymoa.datasets import Fried
>>> from capymoa.regressor import FIMTDD
>>> from capymoa.evaluation import prequential_evaluation
>>> stream = Fried()
>>> schema = stream.get_schema()
>>> learner = FIMTDD(schema)
>>> results = prequential_evaluation(stream, learner, max_instances=1000)
>>> results["cumulative"].rmse()
7.363273627701553
__init__(
schema: Schema,
split_criterion: SplitCriterion | str = 'VarianceReductionSplitCriterion',
grace_period: int = 200,
split_confidence: float = 1e-07,
tie_threshold: float = 0.05,
page_hinckley_alpha: float = 0.005,
page_hinckley_threshold: int = 50,
alternate_tree_fading_factor: float = 0.995,
alternate_tree_t_min: int = 150,
alternate_tree_time: int = 1500,
regression_tree: bool = False,
learning_ratio: float = 0.02,
learning_ratio_decay_factor: float = 0.001,
learning_ratio_const: bool = False,
random_seed: int | None = None,
) → None

Construct FIMTDD.

Parameters:
  • split_criterion – Split criterion to use.

  • grace_period – Number of instances a leaf should observe between split attempts.

  • split_confidence – Allowed error in a split decision; values close to 0 make the tree take longer to decide.

  • tie_threshold – Threshold below which a split will be forced to break ties.

  • page_hinckley_alpha – Alpha value to use in the Page Hinckley change detection tests.

  • page_hinckley_threshold – Threshold value used in the Page Hinckley change detection tests.

  • alternate_tree_fading_factor – Fading factor used to decide if an alternate tree should replace an original.

  • alternate_tree_t_min – Tmin value used to decide if an alternate tree should replace an original.

  • alternate_tree_time – The number of instances used to decide if an alternate tree should be discarded.

  • regression_tree – Build a regression tree instead of a model tree.

  • learning_ratio – Learning ratio to use for training the perceptrons in the leaves.

  • learning_ratio_decay_factor – Learning ratio decay factor (not used when the learning ratio is constant).

  • learning_ratio_const – Keep the learning ratio constant instead of decaying it.
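
To give a feel for what grace_period, split_confidence, tie_threshold, and the two Page Hinckley parameters control, the sketch below implements the textbook Hoeffding-bound split test and the textbook Page Hinckley test in plain Python. It is only an illustration under stated assumptions: the names hoeffding_bound, should_split, and PageHinckley are hypothetical helpers, not part of CapyMOA or MOA, and the actual MOA implementation may differ in detail.

```python
import math


def hoeffding_bound(value_range: float, confidence: float, n: int) -> float:
    """Hoeffding bound (epsilon) after n observations of a bounded variable."""
    return math.sqrt(value_range ** 2 * math.log(1.0 / confidence) / (2.0 * n))


def should_split(best_merit: float, second_merit: float, n: int,
                 split_confidence: float = 1e-07,
                 tie_threshold: float = 0.05) -> bool:
    """Illustrative split decision from the two best candidate merits."""
    if best_merit <= 0.0:
        return False
    # The ratio of second-best to best merit lies in [0, 1], so its range is 1.
    ratio = second_merit / best_merit
    eps = hoeffding_bound(1.0, split_confidence, n)
    # Split when the best candidate is reliably better, or when the bound is
    # already so tight that the candidates are effectively tied.
    return ratio + eps < 1.0 or eps < tie_threshold


class PageHinckley:
    """Textbook Page Hinckley test for an increase in the mean of a signal."""

    def __init__(self, alpha: float = 0.005, threshold: float = 50.0):
        self.alpha = alpha          # tolerated magnitude of change
        self.threshold = threshold  # alarm threshold (lambda)
        self.n = 0
        self.mean = 0.0
        self.cumulative = 0.0           # running sum of deviations
        self.minimum = float("inf")     # smallest running sum seen so far

    def update(self, x: float) -> bool:
        """Add one observation (e.g. an absolute error); return True on alarm."""
        self.n += 1
        self.mean += (x - self.mean) / self.n
        self.cumulative += x - self.mean - self.alpha
        self.minimum = min(self.minimum, self.cumulative)
        return self.cumulative - self.minimum > self.threshold


if __name__ == "__main__":
    # With the default split_confidence, 200 instances give epsilon of about 0.2.
    print(round(hoeffding_bound(1.0, 1e-07, 200), 4))
    print(should_split(1.0, 0.5, 200))    # clear winner: split
    print(should_split(1.0, 0.9, 200))    # too close to call: wait
    print(should_split(1.0, 0.99, 4000))  # bound below tie_threshold: split
```

In this formulation, a smaller split_confidence widens the bound and delays splits, while a larger page_hinckley_threshold makes the detector slower to raise an alarm but less prone to false alarms.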

CLI_help()
predict(instance)
train(instance)