BatchClassifier

class capymoa.base.BatchClassifier

Bases: Classifier

Base class for batch-trained classifiers.

>>> import numpy as np
>>> from capymoa.base import BatchClassifier
>>> class MyBatchClassifier(BatchClassifier):
...     def __str__(self):
...         return "MyBatchClassifier"
...
...     def predict(self, instance):
...         return None
...
...     def predict_proba(self, instance):
...         return None
...
...     def batch_train(self, x, y):
...         with np.printoptions(precision=2):
...             print(x)
...             print(y)
...             print()
...
>>> from capymoa.datasets import ElectricityTiny
>>> stream = ElectricityTiny()
...
>>> learner = MyBatchClassifier(stream.schema, batch_size=2)
>>> for _ in range(4):
...     learner.train(stream.next_instance())
[[0.   0.06 0.44 0.   0.42 0.41]
 [0.02 0.05 0.42 0.   0.42 0.41]]
[1 1]

[[0.04 0.05 0.39 0.   0.42 0.41]
 [0.06 0.05 0.31 0.   0.42 0.41]]
[1 1]

__init__(schema: Schema, batch_size: int, random_seed: int = 1) → None

Initialize the batch classifier.

Parameters:
  • schema – A schema used to allocate memory for the batch.

  • batch_size – The size of the batch.

  • random_seed – The random seed for reproducibility.

train(instance: LabeledInstance) → None

Collate instances into a batch and call batch_train() once batch_size instances have been collected.
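The collation step can be pictured with a small stand-alone sketch. It does not use CapyMOA: `BatchBuffer` and `on_batch` are hypothetical names, plain lists stand in for the arrays, and the library's own implementation may differ (for example, by preallocating the batch from the schema).

```python
from typing import Callable, List, Sequence


class BatchBuffer:
    """Hypothetical helper: collects instances and hands over full batches."""

    def __init__(
        self,
        batch_size: int,
        on_batch: Callable[[List[Sequence[float]], List[int]], None],
    ) -> None:
        if batch_size < 1:
            raise ValueError("batch_size must be at least 1")
        self.batch_size = batch_size
        self.on_batch = on_batch
        self._x: List[Sequence[float]] = []
        self._y: List[int] = []

    def add(self, x: Sequence[float], y: int) -> None:
        # Store the instance; nothing is trained until the batch is full.
        self._x.append(x)
        self._y.append(y)
        if len(self._y) == self.batch_size:
            # Hand over the full batch, then start a fresh one.
            self.on_batch(self._x, self._y)
            self._x, self._y = [], []


batches = []
buffer = BatchBuffer(batch_size=2, on_batch=lambda x, y: batches.append((x, y)))
for features, label in [([0.0, 0.1], 1), ([0.2, 0.3], 0), ([0.4, 0.5], 1), ([0.6, 0.7], 1)]:
    buffer.add(features, label)
```

With a batch size of 2, the four calls to `add` trigger `on_batch` twice, mirroring the two printed batches in the class example.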

abstract batch_train(x: ndarray[Any, dtype[number]], y: ndarray[Any, dtype[integer]]) → None

Train the classifier with a batch of instances.

Parameters:
  • x – A real-valued matrix of shape (batch_size, num_attributes) containing a batch of feature vectors.

  • y – An integer array of shape (batch_size,) containing the label index of each instance. Missing labels are coded as -1 in the semi-supervised setting.
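Because missing labels arrive as -1, a batch_train() implementation usually needs to skip them before updating any per-class statistics. The following is a minimal sketch of that filtering step; `count_labels` and `MISSING_LABEL` are hypothetical names, and a plain list stands in for the label array.

```python
from collections import Counter
from typing import Iterable

# Per the parameter description above, missing labels are coded as -1.
MISSING_LABEL = -1


def count_labels(y: Iterable[int]) -> Counter:
    """Count label indices in a batch, skipping missing labels."""
    return Counter(int(label) for label in y if label != MISSING_LABEL)


counts = count_labels([1, -1, 0, 1])
```

Here the second instance is unlabeled, so only three labels contribute to the counts.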

predict(instance: Instance) → int | None

Predict the label of an instance.

The base implementation calls predict_proba() and returns the label with the highest probability.

Parameters:

instance – The instance to predict the label for.

Returns:

The predicted label or None if the classifier is unable to make a prediction.
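The relationship between predict() and predict_proba() described above can be sketched as an argmax with a None pass-through. This is an illustration only, not the library's code; `label_from_proba` is a hypothetical name and a plain sequence stands in for the probability array.

```python
from typing import Optional, Sequence


def label_from_proba(proba: Optional[Sequence[float]]) -> Optional[int]:
    """Return the index of the largest probability, or None without an estimate."""
    if proba is None or len(proba) == 0:
        return None
    # max() returns the first index on ties.
    return max(range(len(proba)), key=lambda i: proba[i])
```

For example, `label_from_proba([0.2, 0.7, 0.1])` selects index 1, while `label_from_proba(None)` returns None.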

abstract predict_proba(instance: Instance) → ndarray[Any, dtype[float64]] | None

Return probability estimates for each label.

Parameters:

instance – The instance to estimate the probabilities for.

Returns:

An array of probabilities for each label or None if the classifier is unable to make a prediction.
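One simple way to satisfy this contract is to normalize per-class counts and return None before any labeled data has been seen. The sketch below assumes such counts are maintained elsewhere; `proba_from_counts` is a hypothetical name, and it returns a list where an actual implementation would return a float64 array as the signature states.

```python
from typing import List, Optional, Sequence


def proba_from_counts(counts: Sequence[int]) -> Optional[List[float]]:
    """Turn per-class counts into probabilities, or None if nothing was seen."""
    total = sum(counts)
    if total == 0:
        # No labeled instances yet, so no prediction can be made.
        return None
    return [count / total for count in counts]
```

The result has one entry per label and sums to 1 whenever it is not None.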

random_seed: int#

The random seed for reproducibility.

When implementing a classifier, ensure that any random number generators it uses are seeded from this value.
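A minimal sketch of that advice, using only the standard library: the component keeps a private generator seeded from `random_seed` rather than relying on global random state. `SeededComponent` and `draw` are hypothetical names and are not part of CapyMOA.

```python
import random


class SeededComponent:
    """Hypothetical stand-in for a classifier that needs randomness."""

    def __init__(self, random_seed: int = 1) -> None:
        self.random_seed = random_seed
        # A private generator avoids depending on global random state.
        self._rng = random.Random(random_seed)

    def draw(self) -> float:
        """Return the next pseudo-random number in [0, 1)."""
        return self._rng.random()


a = SeededComponent(random_seed=1)
b = SeededComponent(random_seed=1)
```

Two components built with the same seed produce the same sequence of draws, which is what makes repeated runs comparable.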

schema: Schema#

The schema representing the instances.