HoeffdingTree#
- class capymoa.classifier.HoeffdingTree[source]#
Bases:
MOAClassifier
Hoeffding Tree.
Hoeffding Tree (VFDT) [1] is a tree classifier. A Hoeffding tree is an incremental, anytime decision tree induction algorithm that is capable of learning from massive data streams, assuming that the distribution generating examples does not change over time. Hoeffding trees exploit the fact that a small sample is often enough to choose an optimal splitting attribute. This idea is supported mathematically by the Hoeffding bound, which quantifies the number of observations (in our case, examples) needed to estimate some statistic within a prescribed precision (in our case, the goodness of an attribute).
A theoretically appealing feature of Hoeffding Trees, not shared by other incremental decision tree learners, is that they come with sound performance guarantees. Using the Hoeffding bound, one can show that their output is asymptotically nearly identical to that of a non-incremental learner trained on infinitely many examples.
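To make the bound concrete: for a statistic with range R estimated from n observations, the Hoeffding bound states that with probability 1 - delta the observed mean is within epsilon = sqrt(R^2 ln(1/delta) / (2n)) of the true mean. A minimal illustrative sketch (the hoeffding_bound helper below is not part of the capymoa API):

import math

def hoeffding_bound(value_range: float, delta: float, n: int) -> float:
    # With probability 1 - delta, the mean observed over n samples of a
    # random variable with the given range lies within this epsilon of
    # the true mean.
    return math.sqrt((value_range ** 2) * math.log(1.0 / delta) / (2.0 * n))

# Information gain on a two-class problem has range log2(2) = 1.
# With delta = 1e-3 (see the confidence parameter below), the bound
# tightens as a leaf observes more examples:
for n in (200, 1_000, 10_000):
    print(f"n={n:>6}  epsilon={hoeffding_bound(1.0, 1e-3, n):.4f}")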
>>> from capymoa.classifier import HoeffdingTree
>>> from capymoa.datasets import ElectricityTiny
>>> from capymoa.evaluation import prequential_evaluation
>>>
>>> stream = ElectricityTiny()
>>> classifier = HoeffdingTree(stream.get_schema())
>>> results = prequential_evaluation(stream, classifier, max_instances=1000)
>>> print(f"{results['cumulative'].accuracy():.1f}")
84.4
- __init__(
- schema: Schema | None = None,
- random_seed: int = 0,
- grace_period: int = 200,
- split_criterion: str | SplitCriterion = 'InfoGainSplitCriterion',
- confidence: float = 1e-3,
- tie_threshold: float = 0.05,
- leaf_prediction: str | int = 'NaiveBayesAdaptive',
- nb_threshold: int = 0,
- numeric_attribute_observer: str = 'GaussianNumericAttributeClassObserver',
- binary_split: bool = False,
- max_byte_size: float = 33554433,
- memory_estimate_period: int = 1000000,
- stop_mem_management: bool = True,
- remove_poor_attrs: bool = False,
- disable_prepruning: bool = True,
- )
Construct Hoeffding Tree.
- Parameters:
schema – Stream schema.
random_seed – Seed for reproducibility.
grace_period – Number of instances a leaf should observe between split attempts.
split_criterion – Split criterion to use.
confidence – Significance level to calculate the Hoeffding bound. The significance level is given by 1 - delta. Values closer to zero imply longer split decision delays.
tie_threshold – Threshold below which a split will be forced to break ties.
leaf_prediction – Prediction mechanism used at the leaves.
nb_threshold – Number of instances a leaf should observe before allowing Naive Bayes.
numeric_attribute_observer – The Splitter or Attribute Observer (AO) used to monitor the class statistics of numeric features and perform splits.
binary_split – If True, only allow binary splits.
max_byte_size – The max size of the tree, in bytes.
memory_estimate_period – Interval (number of processed instances) between memory consumption checks.
stop_mem_management – If True, stop growing as soon as memory limit is hit.
remove_poor_attrs – If True, disable poor attributes to reduce memory usage.
disable_prepruning – If True, disable merit-based tree pre-pruning.
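An illustrative construction sketch; the parameter values below are arbitrary, and 'MajorityClass' is assumed to be an accepted leaf_prediction option alongside the documented 'NaiveBayesAdaptive' default:

from capymoa.classifier import HoeffdingTree
from capymoa.datasets import ElectricityTiny

stream = ElectricityTiny()
tree = HoeffdingTree(
    schema=stream.get_schema(),
    grace_period=100,       # attempt a split every 100 instances per leaf
    confidence=1e-2,        # looser bound, so split decisions are made earlier
    tie_threshold=0.1,      # break ties between near-equal attributes sooner
    leaf_prediction='MajorityClass',  # assumed option name; default is 'NaiveBayesAdaptive'
)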
- predict(instance)[source]#
Predict the label of an instance.
The base implementation calls predict_proba() and returns the label with the highest probability.
- Parameters:
instance – The instance to predict the label for.
- Returns:
The predicted label or None if the classifier is unable to make a prediction.
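A minimal usage sketch, assuming the stream interface shown in the quickstart above (get_schema() and next_instance()):

from capymoa.classifier import HoeffdingTree
from capymoa.datasets import ElectricityTiny

stream = ElectricityTiny()
classifier = HoeffdingTree(stream.get_schema())

# Train on a few instances, then predict the label of the next one.
for _ in range(100):
    classifier.train(stream.next_instance())
prediction = classifier.predict(stream.next_instance())
print(prediction)  # predicted label, or None if no prediction is possible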
- predict_proba(instance)[source]#
Return probability estimates for each label.
- Parameters:
instance – The instance to estimate the probabilities for.
- Returns:
An array of probabilities for each label or None if the classifier is unable to make a prediction.
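A short sketch of the relationship with predict() described above: the returned array has one entry per class, and the entry with the highest probability corresponds to the label that predict() returns.

import numpy as np
from capymoa.classifier import HoeffdingTree
from capymoa.datasets import ElectricityTiny

stream = ElectricityTiny()
classifier = HoeffdingTree(stream.get_schema())
classifier.train(stream.next_instance())

instance = stream.next_instance()
proba = classifier.predict_proba(instance)
print(proba)                  # one probability estimate per class
print(int(np.argmax(proba)))  # the class with the highest estimate is what predict() picks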
- train(instance)[source]#
Train the classifier with a labeled instance.
- Parameters:
instance – The labeled instance to train the classifier with.
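Combined with predict(), this supports a simple test-then-train (prequential) loop. A minimal sketch, assuming instances expose their label index via y_index and that predict() returns a matching index, as in the capymoa tutorials:

from capymoa.classifier import HoeffdingTree
from capymoa.datasets import ElectricityTiny

stream = ElectricityTiny()
classifier = HoeffdingTree(stream.get_schema())

correct = total = 0
while stream.has_more_instances() and total < 1000:
    instance = stream.next_instance()
    if classifier.predict(instance) == instance.y_index:  # test first ...
        correct += 1
    classifier.train(instance)                            # ... then train
    total += 1
print(f"Test-then-train accuracy: {correct / total:.3f}")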
- random_seed: int#
The random seed for reproducibility.
When implementing a classifier, ensure random number generators are seeded.