HoeffdingAdaptiveTree
class capymoa.classifier.HoeffdingAdaptiveTree
Bases: HoeffdingTree
Hoeffding Adaptive Tree (HAT) classifier.
Reference:
Bifet, A. and Gavalda, R., 2009. Adaptive learning from evolving data streams. In Advances in Intelligent Data Analysis VIII: 8th International Symposium on Intelligent Data Analysis, IDA 2009, Lyon, France, August 31-September 2, 2009. Proceedings 8 (pp. 249-260). Springer Berlin Heidelberg. https://link.springer.com/chapter/10.1007/978-3-642-03915-7_22
>>> from capymoa.datasets import ElectricityTiny
>>> from capymoa.classifier import HoeffdingAdaptiveTree
>>> from capymoa.evaluation import prequential_evaluation
>>> stream = ElectricityTiny()
>>> schema = stream.get_schema()
>>> learner = HoeffdingAdaptiveTree(schema)
>>> results = prequential_evaluation(stream, learner, max_instances=1000)
>>> results["cumulative"].accuracy()
84.1
__init__(
    schema: Schema,
    random_seed: int = 0,
    grace_period: int = 200,
    split_criterion: str | SplitCriterion = 'InfoGainSplitCriterion',
    confidence: float = 0.001,
    tie_threshold: float = 0.05,
    leaf_prediction: int = 'NaiveBayesAdaptive',
    nb_threshold: int = 0,
    numeric_attribute_observer: str = 'GaussianNumericAttributeClassObserver',
    binary_split: bool = False,
    max_byte_size: float = 33554433,
    memory_estimate_period: int = 1000000,
    stop_mem_management: bool = True,
    remove_poor_attrs: bool = False,
    disable_prepruning: bool = True,
)
Parameters:
schema – the schema of the stream.
random_seed – the random seed passed to the MOA learner.
grace_period – the number of instances a leaf should observe between split attempts.
split_criterion – the split criterion to use. Defaults to InfoGainSplitCriterion.
confidence – the allowed error (delta) in a split decision, used to calculate the Hoeffding bound; the resulting confidence level is 1 - delta. Defaults to 1e-3. Values closer to zero imply longer split decision delays.
tie_threshold – the threshold below which a split will be forced to break ties.
leaf_prediction – the prediction mechanism used at leaves:
    0 - Majority Class
    1 - Naive Bayes
    2 - Naive Bayes Adaptive
nb_threshold – the number of instances a leaf should observe before allowing Naive Bayes.
numeric_attribute_observer – the Splitter or Attribute Observer (AO) used to monitor the class statistics of numeric features and perform splits.
binary_split – If True, only allow binary splits.
max_byte_size – the max size of the tree, in bytes.
memory_estimate_period – Interval (number of processed instances) between memory consumption checks.
stop_mem_management – If True, stop growing as soon as memory limit is hit.
remove_poor_attrs – If True, disable poor attributes to reduce memory usage.
disable_prepruning – If True, disable merit-based tree pre-pruning.
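How confidence, grace_period, and tie_threshold interact can be illustrated with the textbook Hoeffding-bound split test. The following is a minimal sketch of that test, not the MOA or CapyMOA implementation. The helper names hoeffding_bound and should_split are hypothetical, and the merit range of log2(number of classes) for information gain is an assumption taken from the standard Hoeffding tree formulation.

```python
import math


def hoeffding_bound(value_range: float, delta: float, n: int) -> float:
    """Textbook Hoeffding bound: epsilon = sqrt(R^2 * ln(1/delta) / (2n)).

    value_range (R): range of the split merit (assumed log2(#classes) for info gain).
    delta: allowed error in the split decision (the `confidence` parameter).
    n: number of instances observed at the leaf.
    """
    return math.sqrt((value_range ** 2) * math.log(1.0 / delta) / (2.0 * n))


def should_split(
    best_merit: float,
    second_merit: float,
    value_range: float,
    delta: float,
    n: int,
    tie_threshold: float,
) -> bool:
    """Hypothetical helper: split when the best attribute is clearly better,
    or when the bound is so small that the remaining candidates are tied."""
    eps = hoeffding_bound(value_range, delta, n)
    return (best_merit - second_merit > eps) or (eps < tie_threshold)


# Two-class problem, so the assumed info-gain range is log2(2) = 1.
value_range = math.log2(2)

# Bound after one grace period (200 instances) and after 2000 instances,
# using the default delta of 0.001.
eps_200 = hoeffding_bound(value_range, delta=0.001, n=200)
eps_2000 = hoeffding_bound(value_range, delta=0.001, n=2000)

# A clear winner passes the test early; a narrow margin does not.
clear_winner = should_split(0.30, 0.10, value_range, 0.001, 200, 0.05)
narrow_margin = should_split(0.15, 0.10, value_range, 0.001, 200, 0.05)

# With enough instances the bound drops below tie_threshold and forces a split.
tie_break = should_split(0.15, 0.10, value_range, 0.001, 2000, 0.05)
```

In this sketch a smaller delta enlarges the bound, so a leaf needs more instances before it can split, which matches the note that confidence values closer to zero delay split decisions. The bound shrinks as n grows, so after enough grace periods it falls below tie_threshold and a split is made even between near-equal candidates.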