SAMkNN

class capymoa.classifier.SAMkNN

Bases: MOAClassifier

Self Adjusted Memory k Nearest Neighbor.

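Self Adjusted Memory k Nearest Neighbor (SAMkNN) [1] is a lazy, instance-based classifier for data streams. It keeps recent instances in a short-term memory (STM) and older knowledge in a long-term memory (LTM), and classifies by k-nearest-neighbor voting over the stored instances.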

>>> from capymoa.classifier import SAMkNN
>>> from capymoa.datasets import ElectricityTiny
>>> from capymoa.evaluation import prequential_evaluation
>>>
>>> stream = ElectricityTiny()
>>> classifier = SAMkNN(stream.get_schema())
>>> results = prequential_evaluation(stream, classifier, max_instances=1000)
>>> print(f"{results['cumulative'].accuracy():.1f}")
78.6

__init__(
    schema: Schema,
    random_seed: int = 1,
    k: int = 5,
    limit: int = 5000,
    min_stm_size: int = 50,
    relative_ltm_size: float = 0.4,
    recalculate_stm_error: bool = False,
)

Self Adjusted Memory k Nearest Neighbor (SAMkNN) Classifier

Parameters:
  • schema – The schema of the stream.

  • random_seed – The random seed passed to the MOA learner.

  • k – The number of nearest neighbors.

  • limit – The maximum number of instances to store.

  • min_stm_size – The minimum number of instances in the short-term memory (STM).

  • relative_ltm_size – The allowed size of the long-term memory (LTM), as a fraction of limit.

  • recalculate_stm_error – If True, recalculates the error rate of the STM when adapting its size (a costly operation). If False, an approximation is used.
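
As a rough illustration of how the size parameters above relate, the following sketch computes the LTM budget implied by the documented defaults. The variable max_ltm_size and the truncating conversion are assumptions made for illustration; the underlying MOA learner may round differently.

```python
# Sketch only: relates the documented default values to each other.
# `max_ltm_size` is a hypothetical name, and int() truncation is an assumption.
limit = 5000               # maximum number of stored instances (default)
relative_ltm_size = 0.4    # allowed LTM share of `limit` (default)
min_stm_size = 50          # minimum number of instances in the STM (default)

# LTM budget implied by "allowed LTM size relative to the total limit".
max_ltm_size = int(relative_ltm_size * limit)

print(max_ltm_size)  # 2000
```

With the defaults, roughly 2000 of the 5000 stored instances may belong to the LTM, and the STM is never shrunk below 50 instances.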

cli_help()

predict(instance)

Predict the label of an instance.

The base implementation calls predict_proba() and returns the label with the highest probability.

Parameters:

instance – The instance to predict the label for.

Returns:

The predicted label or None if the classifier is unable to make a prediction.
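
The relationship between predict() and predict_proba() described above can be sketched in plain Python. The helper predict_from_proba is hypothetical and is not part of CapyMOA; it only mirrors the documented behavior of returning the most probable label, or None when no estimate is available.

```python
from typing import Optional, Sequence


def predict_from_proba(proba: Optional[Sequence[float]]) -> Optional[int]:
    """Return the index of the most probable label, or None if there is no estimate.

    Hypothetical helper for illustration; not the library's implementation.
    """
    if proba is None or len(proba) == 0:
        return None
    # Index of the largest probability (first one wins on ties).
    return max(range(len(proba)), key=lambda i: proba[i])


print(predict_from_proba([0.2, 0.7, 0.1]))  # 1
print(predict_from_proba(None))             # None
```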

predict_proba(instance)

Return probability estimates for each label.

Parameters:

instance – The instance to estimate the probabilities for.

Returns:

An array of probabilities for each label or None if the classifier is unable to make a prediction.
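
For intuition about what a nearest-neighbor probability estimate can look like, here is a minimal sketch of plain kNN voting over a small in-memory list. It is an assumption-laden toy: knn_proba and its arguments are hypothetical, and it omits SAMkNN's STM/LTM memory management entirely.

```python
import math
from collections import Counter


def knn_proba(memory, x, k, n_classes):
    """Estimate class probabilities as vote fractions among the k nearest neighbors.

    `memory` is a list of (features, label) pairs. Returns None if it is empty.
    Toy illustration of plain kNN voting, not the SAMkNN algorithm.
    """
    if not memory:
        return None
    # Sort stored instances by Euclidean distance to x and keep the k closest.
    nearest = sorted(memory, key=lambda item: math.dist(item[0], x))[:k]
    votes = Counter(label for _, label in nearest)
    return [votes[c] / len(nearest) for c in range(n_classes)]


memory = [
    ((0.0, 0.0), 0),
    ((0.1, 0.0), 0),
    ((1.0, 1.0), 1),
    ((0.9, 1.0), 1),
    ((1.0, 0.9), 1),
]
proba = knn_proba(memory, (0.9, 0.9), k=3, n_classes=2)
print(proba)  # [0.0, 1.0]
```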

train(instance)

Train the classifier with a labeled instance.

Parameters:

instance – The labeled instance to train the classifier with.
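
In a streaming setting, train() is typically called right after predicting on the same instance (test-then-train). The loop below sketches that protocol with a toy majority-class learner. MajorityClass and the tuple-based stream are hypothetical stand-ins: the toy train(x, y) takes features and label separately, unlike the train(instance) signature documented here.

```python
from collections import Counter


class MajorityClass:
    """Toy stand-in learner (hypothetical), not SAMkNN."""

    def __init__(self):
        self.counts = Counter()

    def train(self, x, y):
        # Update the label counts with one labeled example.
        self.counts[y] += 1

    def predict(self, x):
        # No prediction is possible before any training example was seen.
        if not self.counts:
            return None
        return self.counts.most_common(1)[0][0]


stream = [((0.1,), 0), ((0.2,), 0), ((0.9,), 1), ((0.3,), 0)]
learner = MajorityClass()
correct = 0
for x, y in stream:
    # Test first, then train on the same example.
    if learner.predict(x) == y:
        correct += 1
    learner.train(x, y)

accuracy = correct / len(stream)
print(accuracy)  # 0.5
```

The first prediction returns None because nothing has been learned yet, which is why a predict() result must be checked before use.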

random_seed: int

The random seed for reproducibility.

When implementing a classifier, ensure that its random number generators are seeded with this value.
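
One way to follow this advice in a custom learner is to own a dedicated generator seeded from random_seed, rather than relying on global state. The class SeededLearner below is a hypothetical sketch and does not derive from any CapyMOA base class.

```python
import random


class SeededLearner:
    """Hypothetical learner skeleton that seeds its own random number generator."""

    def __init__(self, random_seed: int = 1):
        self.random_seed = random_seed
        # A private generator keeps results reproducible and independent
        # of the global `random` module state.
        self._rng = random.Random(random_seed)

    def draw(self) -> float:
        return self._rng.random()


a = SeededLearner(random_seed=1)
b = SeededLearner(random_seed=1)
same = [a.draw() for _ in range(3)] == [b.draw() for _ in range(3)]
print(same)  # True
```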

schema: Schema

The schema representing the instances.