SAMkNN

class capymoa.classifier.SAMkNN

Bases: MOAClassifier

Self Adjusted Memory k Nearest Neighbor.

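Self Adjusted Memory k Nearest Neighbor (SAMkNN) [1] is a lazy classifier: it stores instances instead of fitting a model up front, and keeps them in a short-term memory (STM) and a long-term memory (LTM).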

>>> from capymoa.classifier import SAMkNN
>>> from capymoa.datasets import ElectricityTiny
>>> from capymoa.evaluation import prequential_evaluation
>>>
>>> stream = ElectricityTiny()
>>> classifier = SAMkNN(stream.get_schema())
>>> results = prequential_evaluation(stream, classifier, max_instances=1000)
>>> print(f"{results['cumulative'].accuracy():.1f}")
78.6

__init__(schema: Schema, random_seed: int = 1, k: int = 5, limit: int = 5000, min_stm_size: int = 50, relative_ltm_size: float = 0.4, recalculate_stm_error: bool = False)

Construct a Self Adjusted Memory k Nearest Neighbor (SAMkNN) classifier.

Parameters:

schema – The schema of the stream.
random_seed – The random seed passed to the MOA learner.
k – The number of nearest neighbors.
limit – The maximum number of instances to store.
min_stm_size – The minimum number of instances in the short-term memory (STM).
relative_ltm_size – The allowed size of the long-term memory (LTM), as a fraction of limit.
recalculate_stm_error – If True, recalculate the error rate of the STM when adapting its size (a costly operation). Otherwise, an approximation is used.
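
To make the relationship between limit, relative_ltm_size and min_stm_size concrete, the following is a minimal sketch of the arithmetic the parameter descriptions imply. It does not call CapyMOA: memory_budget is a hypothetical helper, and it assumes that limit is shared between the STM and the LTM, so that whatever the LTM is not allowed to use remains available to the STM.

```python
def memory_budget(limit: int = 5000,
                  relative_ltm_size: float = 0.4,
                  min_stm_size: int = 50) -> dict:
    """Hypothetical helper: derive instance budgets from the constructor values.

    Assumption: `limit` is the combined budget of STM and LTM.
    """
    if not 0.0 <= relative_ltm_size <= 1.0:
        raise ValueError("relative_ltm_size must be in [0, 1]")
    if min_stm_size > limit:
        raise ValueError("min_stm_size cannot exceed limit")

    # The LTM may hold at most this fraction of the total budget.
    ltm_cap = round(limit * relative_ltm_size)
    return {
        "total": limit,
        "ltm_cap": ltm_cap,
        "stm_cap": limit - ltm_cap,  # what is left for the STM (assumed)
        "stm_min": min_stm_size,
    }


budget = memory_budget()  # uses the documented defaults
```

With the documented defaults, this gives the LTM at most 2000 of the 5000 stored instances.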

cli_help()

predict(instance: Instance) → int | None

Predict the label of an instance.

The base implementation calls predict_proba() and returns the label with the highest probability.

Parameters: instance – The instance to predict the label for.
Returns: The predicted label, or None if the classifier is unable to make a prediction.
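
The relationship between predict() and predict_proba() described above can be sketched in plain Python. This is an illustration of the documented behaviour, not CapyMOA's actual code; label_from_proba is a hypothetical name, and treating an empty probability array as "no prediction" is an assumption.

```python
from typing import Optional, Sequence


def label_from_proba(proba: Optional[Sequence[float]]) -> Optional[int]:
    """Return the index of the most probable label, or None without a prediction."""
    # No probabilities available: the classifier cannot predict.
    # (Treating an empty sequence the same way is an assumption.)
    if proba is None or len(proba) == 0:
        return None
    # Index of the largest probability; the first one wins on ties.
    return max(range(len(proba)), key=lambda i: proba[i])


predicted = label_from_proba([0.2, 0.7, 0.1])
no_prediction = label_from_proba(None)
```

Callers should therefore handle a None result before comparing the prediction with a true label.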

predict_proba(instance) → ndarray[tuple[Any, ...], dtype[float64]] | None

Return probability estimates for each label.

Parameters: instance – The instance to estimate the probabilities for.
Returns: An array of probabilities for each label, or None if the classifier is unable to make a prediction.

train(instance)

Train the classifier with a labeled instance.

Parameters: instance – The labeled instance to train the classifier with.
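
In a streaming setting, train() is usually paired with predict() in a test-then-train (prequential) loop: each instance is first used for prediction and only then for training. The sketch below shows that loop with a hypothetical stand-in learner, TinyLazyKNN, which is a plain sliding-window kNN and not SAMkNN (it has no STM/LTM logic). Its train(x, y) signature is also a simplification; the CapyMOA method takes a single labeled instance.

```python
from collections import Counter, deque
from typing import Optional


class TinyLazyKNN:
    """Hypothetical stand-in: sliding-window kNN, NOT the SAMkNN algorithm."""

    def __init__(self, k: int = 5, limit: int = 5000):
        self.k = k
        # A bounded window: the oldest instances drop out first.
        self.window = deque(maxlen=limit)

    def train(self, x, y) -> None:
        # Lazy learning: training only stores the labeled instance.
        self.window.append((tuple(x), y))

    def predict(self, x) -> Optional[int]:
        if not self.window:
            return None  # nothing stored yet, so no prediction
        # Rank stored instances by squared Euclidean distance to x.
        nearest = sorted(
            self.window,
            key=lambda item: sum((a - b) ** 2 for a, b in zip(item[0], x)),
        )[: self.k]
        # Majority vote among the k nearest neighbors.
        return Counter(label for _, label in nearest).most_common(1)[0][0]


def prequential_accuracy(learner, stream) -> float:
    """Test-then-train: predict on each instance before training on it."""
    correct = total = 0
    for x, y in stream:
        prediction = learner.predict(x)
        if prediction is not None:  # skip instances with no prediction
            total += 1
            correct += int(prediction == y)
        learner.train(x, y)
    return correct / total if total else 0.0


# Toy stream (made up for illustration): two alternating, separable points.
stream = [((float(i % 2), 0.0), i % 2) for i in range(20)]
accuracy = prequential_accuracy(TinyLazyKNN(k=1), stream)
```

Because the learner is evaluated on each instance before seeing its label, the resulting accuracy reflects performance on unseen data at every step.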

random_seed: int

The random seed for reproducibility.

When implementing a classifier, ensure that its random number generators are seeded.

schema: Schema

The schema representing the instances.