SAMkNN

class capymoa.classifier.SAMkNN

Bases: MOAClassifier

Self-Adjusting Memory k-Nearest Neighbor (SAMkNN) classifier.

Reference:

"KNN Classifier with Self Adjusting Memory for Heterogeneous Concept Drift". Viktor Losing, Barbara Hammer, and Heiko Wersing.
http://ieeexplore.ieee.org/document/7837853
PDF: https://pub.uni-bielefeld.de/download/2907622/2907623

BibTeX:

@INPROCEEDINGS{7837853,
  author={V. Losing and B. Hammer and H. Wersing},
  booktitle={2016 IEEE 16th International Conference on Data Mining (ICDM)},
  title={KNN Classifier with Self Adjusting Memory for Heterogeneous Concept Drift},
  year={2016},
  pages={291-300},
  keywords={data mining;optimisation;pattern classification;Big Data;Internet of Things;KNN classifier;SAM-kNN robustness;data mining;k nearest neighbor algorithm;metaparameter optimization;nonstationary data streams;performance evaluation;self adjusting memory model;Adaptation models;Benchmark testing;Biological system modeling;Data mining;Heuristic algorithms;Prediction algorithms;Predictive models;Data streams;concept drift;data mining;kNN},
  doi={10.1109/ICDM.2016.0040},
  month={Dec}
}

Example usage:

>>> from capymoa.datasets import ElectricityTiny
>>> from capymoa.classifier import SAMkNN
>>> from capymoa.evaluation import prequential_evaluation
>>> stream = ElectricityTiny()
>>> schema = stream.get_schema()
>>> learner = SAMkNN(schema)
>>> results = prequential_evaluation(stream, learner, max_instances=1000)
>>> results["cumulative"].accuracy()
78.60000000000001
__init__(
schema: Schema,
random_seed: int = 1,
k: int = 5,
limit: int = 5000,
min_stm_size: int = 50,
relative_ltm_size: float = 0.4,
recalculate_stm_error: bool = False,
)

Self-Adjusting Memory k-Nearest Neighbor (SAMkNN) classifier.

Parameters:
  • schema – The schema of the stream.

  • random_seed – The random seed passed to the MOA learner.

  • k – The number of nearest neighbors.

  • limit – The maximum number of instances to store.

  • min_stm_size – The minimum number of instances kept in the short-term memory (STM).

  • relative_ltm_size – The maximum size of the long-term memory (LTM), as a fraction of limit.

  • recalculate_stm_error – If True, recalculates the error rate of the STM when adapting its size (a costly operation). If False, an approximation is used.
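
The relationship between limit and relative_ltm_size can be illustrated with a little arithmetic. The sketch below assumes that the LTM budget is the truncated product of the two values and that the STM receives whatever remains; this split is an assumption made for illustration and is not stated in the parameter descriptions. memory_budgets is a hypothetical helper, not part of capymoa.

```python
from typing import Tuple


def memory_budgets(limit: int = 5000, relative_ltm_size: float = 0.4) -> Tuple[int, int]:
    """Return (max_stm_size, max_ltm_size) under the assumed split.

    Hypothetical helper for illustration only; not a capymoa function.
    """
    if limit <= 0:
        raise ValueError("limit must be positive")
    if not 0.0 <= relative_ltm_size <= 1.0:
        raise ValueError("relative_ltm_size must lie in [0, 1]")
    # Assumption: the LTM may hold at most this fraction of the total limit.
    max_ltm_size = int(relative_ltm_size * limit)
    # Assumption: the STM gets the rest of the budget.
    max_stm_size = limit - max_ltm_size
    return max_stm_size, max_ltm_size
```

Under this assumption, the default values (limit=5000, relative_ltm_size=0.4) give an LTM budget of 2000 instances and an STM budget of 3000.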

CLI_help()
predict(instance)
predict_proba(instance)
train(instance)
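
The predict and train methods can be combined into a manual test-then-train loop. The following is a minimal sketch, not the library's own evaluation code: run_test_then_train is a hypothetical helper, and it assumes that the stream offers has_more_instances() and next_instance(), that each instance exposes its label index as y_index, and that predict returns a label index (or None before any training).

```python
def run_test_then_train(stream, learner, max_instances=1000):
    """Predict on each instance before training on it; return accuracy in percent.

    Hypothetical helper; the stream and instance attribute names are assumptions.
    """
    correct = 0
    seen = 0
    while stream.has_more_instances() and seen < max_instances:
        instance = stream.next_instance()
        # Test first: the learner has not yet been trained on this instance.
        if learner.predict(instance) == instance.y_index:
            correct += 1
        # Then train on the same instance.
        learner.train(instance)
        seen += 1
    # Guard against an empty stream.
    return 100.0 * correct / seen if seen else 0.0
```

Passing an ElectricityTiny stream and a SAMkNN learner to this function is a hand-rolled analogue of the prequential_evaluation call in the example above; the result is not guaranteed to match it exactly.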