class capymoa.regressor.FIMTDD

Bases: MOARegressor

Implementation of the FIMT-DD tree as described by Ikonomovska et al.

Fast Incremental Model Tree with Drift Detection (FIMT-DD) is the regression counterpart of the Hoeffding Tree for data stream learning.
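
To make the Hoeffding Tree connection concrete, the sketch below computes the Hoeffding bound and applies a split rule of the kind such trees rely on. It is an illustration under stated assumptions, not the MOA source: the helper names `hoeffding_bound` and `should_split` are hypothetical, the compared quantity (the ratio of the second-best to the best split merit) is assumed to have range 1, and the defaults simply mirror the `split_confidence` and `tie_threshold` values from the constructor signature.

```python
import math


def hoeffding_bound(value_range: float, confidence: float, n: int) -> float:
    """Hoeffding bound: epsilon = sqrt(R^2 * ln(1 / delta) / (2 * n))."""
    return math.sqrt(value_range ** 2 * math.log(1.0 / confidence) / (2.0 * n))


def should_split(best_merit: float, second_merit: float, n: int,
                 split_confidence: float = 1e-07,
                 tie_threshold: float = 0.05) -> bool:
    """Hypothetical split rule (an assumption, not the MOA implementation).

    Split when the best candidate beats the runner-up by more than the
    bound, or when the bound is so small that the two are considered tied.
    """
    if best_merit <= 0.0:
        return False
    eps = hoeffding_bound(1.0, split_confidence, n)
    ratio = second_merit / best_merit
    return ratio < 1.0 - eps or eps < tie_threshold


# Bound after one default grace period of 200 instances.
eps_200 = hoeffding_bound(1.0, 1e-07, 200)
print(round(eps_200, 4))

print(should_split(1.0, 0.5, n=200))    # clear winner
print(should_split(1.0, 0.9, n=200))    # too close to call yet
print(should_split(1.0, 0.99, n=4000))  # near tie, bound below tie_threshold
```

Under these assumptions, a smaller `split_confidence` enlarges the bound for a given number of instances, so a leaf has to observe more data before a split is accepted, while `tie_threshold` caps how long two near-equal candidates can delay a split.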

FIMT-DD is implemented in MOA (Massive Online Analysis) and provides several parameters for customization.


Ikonomovska, Elena, João Gama, and Sašo Džeroski. "Learning model trees from evolving data streams." Data Mining and Knowledge Discovery 23.1 (2011): 128-168.

Example usage:

>>> from capymoa.datasets import Fried
>>> from capymoa.regressor import FIMTDD
>>> from capymoa.evaluation import prequential_evaluation
>>> stream = Fried()
>>> schema = stream.get_schema()
>>> learner = FIMTDD(schema)
>>> results = prequential_evaluation(stream, learner, max_instances=1000)
>>> results["cumulative"].rmse()

__init__(
schema: Schema,
split_criterion: SplitCriterion | str = 'VarianceReductionSplitCriterion',
grace_period: int = 200,
split_confidence: float = 1e-07,
tie_threshold: float = 0.05,
page_hinckley_alpha: float = 0.005,
page_hinckley_threshold: int = 50,
alternate_tree_fading_factor: float = 0.995,
alternate_tree_t_min: int = 150,
alternate_tree_time: int = 1500,
regression_tree: bool = False,
learning_ratio: float = 0.02,
learning_ratio_decay_factor: float = 0.001,
learning_ratio_const: bool = False,
random_seed: int | None = None,
) → None

Construct FIMTDD.

  • split_criterion – Split criterion to use.

  • grace_period – Number of instances a leaf should observe between split attempts.

  • split_confidence – Allowed error in a split decision; values closer to 0 make the tree take longer to decide.

  • tie_threshold – Threshold below which a split will be forced to break ties.

  • page_hinckley_alpha – Alpha value to use in the Page Hinckley change detection tests.

  • page_hinckley_threshold – Threshold value used in the Page Hinckley change detection tests.

  • alternate_tree_fading_factor – Fading factor used to decide if an alternate tree should replace an original.

  • alternate_tree_t_min – Tmin value used to decide if an alternate tree should replace an original.

  • alternate_tree_time – The number of instances used to decide if an alternate tree should be discarded.

  • regression_tree – Build a regression tree instead of a model tree.

  • learning_ratio – Learning ratio to use for training the Perceptrons in the leaves.

  • learning_ratio_decay_factor – Learning ratio decay factor (not used when the learning ratio is constant).

  • learning_ratio_const – Keep the learning ratio constant instead of decaying it.
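
The two Page Hinckley parameters are easier to reason about with a small example. The sketch below is a textbook-style Page-Hinckley test for an increase in the mean of a signal, written in plain Python. It is not the MOA code, and the details inside FIMT-DD may differ: the class name `PageHinckley`, its attributes, and the synthetic error sequence are all hypothetical, and the defaults only mirror `page_hinckley_alpha` and `page_hinckley_threshold` from the signature.

```python
class PageHinckley:
    """Minimal Page-Hinckley test for an increase in the mean of a signal."""

    def __init__(self, alpha: float = 0.005, threshold: float = 50.0) -> None:
        self.alpha = alpha          # tolerated magnitude of change
        self.threshold = threshold  # alarm level
        self.n = 0                  # observations seen so far
        self.mean = 0.0             # running mean of the signal
        self.cumulative = 0.0       # cumulative deviation from the mean
        self.minimum = 0.0          # smallest cumulative deviation seen

    def update(self, value: float) -> bool:
        """Feed one observation; return True if a change is signalled."""
        self.n += 1
        self.mean += (value - self.mean) / self.n
        self.cumulative += value - self.mean - self.alpha
        self.minimum = min(self.minimum, self.cumulative)
        return self.cumulative - self.minimum > self.threshold


detector = PageHinckley()

# Hypothetical absolute errors: stable at 1.0, then jumping to 5.0 at index 200.
errors = [1.0] * 200 + [5.0] * 200

# Index of the first observation at which the detector raises an alarm.
alarm_at = next((i for i, e in enumerate(errors) if detector.update(e)), None)
print(alarm_at)
```

In this sketch, a larger `threshold` delays the alarm and reduces false positives, while `alpha` sets how much drift in the monitored signal is tolerated before deviations start to accumulate. The detector is not reset after an alarm; a real learner would typically react (for example by growing an alternate tree) and restart the test.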
