BatchClassifier
- class capymoa.base.BatchClassifier
Bases: Classifier, Batch, ABC
Base class for classifiers that support mini-batches.
Supported by:
Evaluators that support batch classifiers will call the batch_train() and batch_predict_proba() methods instead of train() and predict_proba():
>>> import torch
>>> from capymoa.base import BatchClassifier
>>> from capymoa.datasets import ElectricityTiny
>>> from capymoa.evaluation import prequential_evaluation
>>>
>>> batch_size = 500
>>> class MyBatchClassifier(BatchClassifier):
...     def batch_train(self, x, y):
...         print(f"batch_train x: {x.shape} {x.dtype}")
...         print(f"batch_train y: {y.shape} {y.dtype}")
...
...     def batch_predict_proba(self, x):
...         print(f"batch_predict_proba x: {x.shape} {x.dtype}")
...         return torch.zeros((x.shape[0], self.schema.get_num_classes()))
...
>>> stream = ElectricityTiny()
>>> learner = MyBatchClassifier(stream.schema)
>>> _ = prequential_evaluation(
...     stream,
...     learner,
...     batch_size=batch_size,
...     max_instances=721
... )
batch_predict_proba x: torch.Size([500, 6]) torch.float32
batch_train x: torch.Size([500, 6]) torch.float32
batch_train y: torch.Size([500]) torch.int64
batch_predict_proba x: torch.Size([221, 6]) torch.float32
batch_train x: torch.Size([221, 6]) torch.float32
batch_train y: torch.Size([221]) torch.int64
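The call order in the output above (predict on a batch, then train on the same batch) can be illustrated with a minimal, library-free sketch of a test-then-train loop. The names `RecordingLearner` and `prequential_sketch` are hypothetical, and this is not how prequential_evaluation() is implemented; it only mirrors the observed call order and batch sizes, using plain lists instead of tensors.

```python
from typing import List, Sequence, Tuple


class RecordingLearner:
    """Hypothetical learner that only records which batch method was called."""

    def __init__(self) -> None:
        self.calls: List[Tuple[str, int]] = []

    def batch_predict_proba(self, x: Sequence[Sequence[float]]) -> List[List[float]]:
        self.calls.append(("batch_predict_proba", len(x)))
        return [[0.0, 0.0] for _ in x]  # placeholder probabilities

    def batch_train(self, x: Sequence[Sequence[float]], y: Sequence[int]) -> None:
        self.calls.append(("batch_train", len(x)))


def prequential_sketch(learner, xs, ys, batch_size: int) -> None:
    """Test-then-train: predict on each batch before training on it."""
    for start in range(0, len(xs), batch_size):
        x_batch = xs[start:start + batch_size]
        y_batch = ys[start:start + batch_size]
        learner.batch_predict_proba(x_batch)   # evaluate first
        learner.batch_train(x_batch, y_batch)  # then learn from the same batch


# 721 synthetic instances with 6 features each, matching the sizes above.
xs = [[0.0] * 6 for _ in range(721)]
ys = [0] * 721
recorder = RecordingLearner()
prequential_sketch(recorder, xs, ys, batch_size=500)
print(recorder.calls)
```

The last batch is smaller (221 instances) because 721 is not a multiple of the batch size, so implementations should not assume a fixed batch dimension.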
You can also collect batches of instances manually as a matrix, using the itertools.batched function (Python 3.12+) and np.stack:
>>> import numpy as np
>>> from capymoa._utils import batched  # itertools.batched is not available in Python < 3.12
>>> stream.restart()  # streams are stateful, so restart it
>>> for i, batch in enumerate(batched(stream, 100)):
...     x = np.stack([instance.x for instance in batch])
...     y = np.stack([instance.y_index for instance in batch])
...     x = torch.from_numpy(x).to(learner.device, learner.x_dtype)
...     y = torch.from_numpy(y).to(learner.device, learner.y_dtype)
...     learner.batch_train(x, y)
...     break
batch_train x: torch.Size([100, 6]) torch.float32
batch_train y: torch.Size([100]) torch.int64
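For readers on Python versions older than 3.12, the batching step can be sketched with itertools.islice. This is a minimal sketch that assumes the helper imported from capymoa._utils behaves like itertools.batched; its actual implementation is not shown here, and the name `batched_sketch` is hypothetical.

```python
from itertools import islice
from typing import Iterable, Iterator, Tuple, TypeVar

T = TypeVar("T")


def batched_sketch(iterable: Iterable[T], n: int) -> Iterator[Tuple[T, ...]]:
    """Yield successive tuples of up to n items; the last one may be shorter."""
    if n < 1:
        raise ValueError("n must be at least one")
    iterator = iter(iterable)
    # islice consumes up to n items per call; an empty tuple ends the loop.
    while batch := tuple(islice(iterator, n)):
        yield batch


# 721 items in batches of 500, as in the evaluation example.
batch_sizes = [len(batch) for batch in batched_sketch(range(721), 500)]
print(batch_sizes)
```

Because the helper consumes a single shared iterator, a stateful source is advanced exactly once per item, which is why the stream must be restarted before iterating over it again.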
The default implementations of train() and predict() call the batch variants with a batch of size 1. This is useful for parts of CapyMOA that expect a classifier to train and predict on single instances.
>>> instance = next(stream)
>>> learner.train(instance)
batch_train x: torch.Size([1, 6]) torch.float32
batch_train y: torch.Size([1]) torch.int64
>>> learner.predict(instance)
batch_predict_proba x: torch.Size([1, 6]) torch.float32
np.int64(0)
>>> learner.predict_proba(instance)
batch_predict_proba x: torch.Size([1, 6]) torch.float32
array([0., 0.], dtype=float32)
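The delegation described above can be sketched without CapyMOA or torch. The classes `SketchBatchClassifier` and `ClassFrequency` are hypothetical stand-ins that use plain lists; they only illustrate the wrap-in-a-batch-of-one pattern and are not the library's implementation.

```python
from abc import ABC, abstractmethod
from typing import List, Sequence


class SketchBatchClassifier(ABC):
    """Hypothetical stand-in for the pattern; not CapyMOA's BatchClassifier."""

    @abstractmethod
    def batch_train(self, x: List[Sequence[float]], y: List[int]) -> None:
        ...

    @abstractmethod
    def batch_predict_proba(self, x: List[Sequence[float]]) -> List[List[float]]:
        ...

    def train(self, features: Sequence[float], label: int) -> None:
        # Wrap the single instance in a batch of size 1.
        self.batch_train([features], [label])

    def predict_proba(self, features: Sequence[float]) -> List[float]:
        # Wrap, call the batch variant, then unwrap the single row.
        return self.batch_predict_proba([features])[0]


class ClassFrequency(SketchBatchClassifier):
    """Toy learner that predicts the class frequencies seen so far."""

    def __init__(self, n_classes: int) -> None:
        self.counts = [0] * n_classes

    def batch_train(self, x, y):
        for label in y:
            self.counts[label] += 1

    def batch_predict_proba(self, x):
        total = sum(self.counts)
        if total == 0:
            row = [1.0 / len(self.counts)] * len(self.counts)
        else:
            row = [count / total for count in self.counts]
        return [list(row) for _ in x]


toy = ClassFrequency(n_classes=2)
toy.train([0.1, 0.2], 1)
toy.train([0.3, 0.4], 1)
toy.train([0.5, 0.6], 0)
single = toy.predict_proba([0.7, 0.8])
print(single)
```

With this design a subclass only implements the two batch methods, and the single-instance entry points come for free.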
- abstract batch_predict_proba(x: Tensor) -> Tensor
Predict the probabilities of the classes for a batch of instances.
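The abstract signature does not prescribe how the probabilities are produced. One common choice, assumed here purely for illustration, is a softmax over per-class scores. The sketch below uses plain lists rather than tensors so that it runs without torch, and the name `softmax_rows` is hypothetical. The result has one row per instance and one column per class, matching the shape returned by the torch.zeros placeholder in the class example.

```python
import math
from typing import List, Sequence


def softmax_rows(scores: Sequence[Sequence[float]]) -> List[List[float]]:
    """Convert per-class scores into one probability row per instance."""
    probabilities = []
    for row in scores:
        shift = max(row)  # subtract the row maximum for numerical stability
        exps = [math.exp(value - shift) for value in row]
        total = sum(exps)
        probabilities.append([value / total for value in exps])
    return probabilities


# Three instances, two classes (hypothetical scores).
batch_scores = [[2.0, 0.0], [0.0, 0.0], [-1.0, 3.0]]
batch_proba = softmax_rows(batch_scores)
print(batch_proba)
```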
- predict(instance: Instance) -> int | None
Predict the label of an instance.
The base implementation calls predict_proba() and returns the label with the highest probability.
- Parameters:
instance – The instance to predict the label for.
- Returns:
The predicted label, or None if the classifier is unable to make a prediction.
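The rule "return the label with the highest probability" can be sketched as an argmax over the probability vector. The function name `predict_from_proba` is hypothetical, and treating a missing or empty probability vector as "unable to make a prediction" is an assumption made for this sketch, not a statement about the base implementation.

```python
from typing import Optional, Sequence


def predict_from_proba(proba: Optional[Sequence[float]]) -> Optional[int]:
    """Return the index of the largest probability, or None if unavailable."""
    # Assumption: no probabilities means no prediction can be made.
    if proba is None or len(proba) == 0:
        return None
    # max() keeps the first index on ties, so all-zero vectors map to class 0.
    return max(range(len(proba)), key=lambda index: proba[index])


print(predict_from_proba([0.2, 0.7, 0.1]))
print(predict_from_proba([0.0, 0.0]))
print(predict_from_proba(None))
```

The tie-breaking toward the first index is consistent with the earlier example, where an all-zero probability vector yields label 0.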
- predict_proba(instance: Instance)
Calls batch_predict_proba() with a batch of size 1.
- train(instance: LabeledInstance) -> None
Calls batch_train() with a batch of size 1.
- random_seed: int
The random seed for reproducibility.
When implementing a classifier, ensure that its random number generators are seeded.
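One way to follow this advice is to give each learner its own generator created from the seed, rather than relying on global random state. The sketch below uses only the standard library; the class `SeededSketch` and its `draw_weights` method are hypothetical and stand in for whatever randomness a real classifier needs.

```python
import random
from typing import List


class SeededSketch:
    """Hypothetical learner that owns a private, seeded generator."""

    def __init__(self, random_seed: int = 1) -> None:
        self.random_seed = random_seed
        # A private generator avoids depending on global random state.
        self._rng = random.Random(random_seed)

    def draw_weights(self, n: int) -> List[float]:
        """Draw n pseudo-random initial weights in [-1, 1]."""
        return [self._rng.uniform(-1.0, 1.0) for _ in range(n)]


first = SeededSketch(random_seed=42).draw_weights(3)
second = SeededSketch(random_seed=42).draw_weights(3)
other = SeededSketch(random_seed=7).draw_weights(3)
print(first == second)
```

A classifier built on torch or NumPy would presumably seed those libraries' generators in the same spirit, for example with torch.manual_seed or np.random.default_rng.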