BatchRegressor

class capymoa.base.BatchRegressor

Bases: Regressor, Batch, ABC

Base class for regressors that support mini-batches.

Supported by:

Evaluators that support batch regressors call the batch_train() and batch_predict() methods instead of train() and predict():

>>> import numpy as np
>>> import torch
>>> from capymoa.base import BatchRegressor
>>> from capymoa.datasets import FriedTiny
>>> from capymoa.evaluation import prequential_evaluation
>>>
>>> batch_size = 500
>>> class MyBatchRegressor(BatchRegressor):
...     def batch_train(self, x, y):
...         print(f"batch_train x: {x.shape} {x.dtype}")
...         print(f"batch_train y: {y.shape} {y.dtype}")
...
...     def batch_predict(self, x):
...         print(f"batch_predict x: {x.shape} {x.dtype}")
...         return np.zeros((x.shape[0],))
...
>>> stream = FriedTiny()
>>> learner = MyBatchRegressor(stream.schema)
>>> _ = prequential_evaluation(
...     stream,
...     learner,
...     batch_size=batch_size,
...     max_instances=721
... )
batch_predict x: torch.Size([500, 10]) torch.float32
batch_train x: torch.Size([500, 10]) torch.float32
batch_train y: torch.Size([500]) torch.float32
batch_predict x: torch.Size([221, 10]) torch.float32
batch_train x: torch.Size([221, 10]) torch.float32
batch_train y: torch.Size([221]) torch.float32

You can also collect batches manually: use a batched function (itertools.batched is available from Python 3.12) to group instances, and np.stack to assemble each group into a matrix:

>>> from capymoa._utils import batched  # itertools.batched is not available in Python < 3.12
>>> for i, batch in enumerate(batched(stream, 100)):
...     x = np.stack([instance.x for instance in batch])
...     y = np.stack([instance.y_value for instance in batch])
...     x = torch.from_numpy(x).to(dtype=learner.x_dtype, device=learner.device)
...     y = torch.from_numpy(y).to(dtype=learner.y_dtype, device=learner.device)
...     learner.batch_train(x, y)
...     break
batch_train x: torch.Size([100, 10]) torch.float32
batch_train y: torch.Size([100]) torch.float32
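For readers on Python versions older than 3.12, the grouping step can be illustrated with a few lines of standard-library code. The following is a minimal sketch modelled on the recipe in the itertools documentation; the name batched_sketch is hypothetical, and capymoa._utils.batched may be implemented differently.

```python
from itertools import islice


def batched_sketch(iterable, n):
    """Yield successive tuples of up to ``n`` items from ``iterable``."""
    if n < 1:
        raise ValueError("n must be at least one")
    iterator = iter(iterable)
    # islice pulls at most n items per pass; an empty tuple ends the loop.
    while batch := tuple(islice(iterator, n)):
        yield batch


# 721 items in batches of 500 give one full batch and one partial batch,
# matching the batch sizes printed in the evaluation example above.
sizes = [len(batch) for batch in batched_sketch(range(721), 500)]
print(sizes)  # [500, 221]
```

The last batch is simply shorter than the others, which is why a learner's batch_train() and batch_predict() should not assume a fixed batch size.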

The default implementations of train() and predict() call the batch variants with a batch of size 1. This is useful for parts of CapyMOA that expect a regressor to train and predict on single instances.

>>> instance = next(stream)
>>> learner.train(instance)
batch_train x: torch.Size([1, 10]) torch.float32
batch_train y: torch.Size([]) torch.float32
>>> learner.predict(instance)
batch_predict x: torch.Size([1, 10]) torch.float64
np.float64(0.0)
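The delegation from the single-instance methods to the batch methods can be pictured with a simplified stand-in. The sketch below is not CapyMOA's implementation: BatchRegressorSketch and MeanRegressorSketch are hypothetical names, and plain Python lists take the place of tensors so that the example runs without any dependencies.

```python
from abc import ABC, abstractmethod


class BatchRegressorSketch(ABC):
    """Hypothetical stand-in for a batch regressor; uses lists, not tensors."""

    @abstractmethod
    def batch_train(self, x, y):
        """Train on a batch: x is a list of feature vectors, y a list of targets."""

    @abstractmethod
    def batch_predict(self, x):
        """Return one prediction per feature vector in x."""

    def train(self, features, target):
        # Wrap the single instance in a batch of size 1.
        self.batch_train([features], [target])

    def predict(self, features):
        # Predict on a batch of size 1 and unwrap the single result.
        return self.batch_predict([features])[0]


class MeanRegressorSketch(BatchRegressorSketch):
    """Predicts the running mean of all targets seen so far."""

    def __init__(self):
        self.total = 0.0
        self.count = 0

    def batch_train(self, x, y):
        self.total += sum(y)
        self.count += len(y)

    def batch_predict(self, x):
        mean = self.total / self.count if self.count else 0.0
        return [mean] * len(x)


model = MeanRegressorSketch()
model.batch_train([[0.0, 1.0], [1.0, 0.0]], [2.0, 4.0])  # batch of two
model.train([1.0, 1.0], 6.0)  # single instance, routed through batch_train
print(model.predict([0.5, 0.5]))  # 4.0
```

With this arrangement a subclass only has to implement the two batch methods, and code that works one instance at a time still functions.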
__init__(schema=None, random_seed=1)

abstract batch_predict(x: Tensor) → Tensor

Return the predicted target value for each instance in a batch.

Parameters:

x – Batch of x_dtype valued feature vectors (batch_size, num_features)

Returns:

Predicted batch of y_dtype valued targets (batch_size,).

abstract batch_train(x: Tensor, y: Tensor) → None

Train the regressor with a batch of instances.

Parameters:
  • x – Batch of x_dtype valued feature vectors (batch_size, num_features)

  • y – Batch of y_dtype valued targets (batch_size,).

predict(instance: RegressionInstance) → float64

Calls batch_predict() with a batch of size 1.

train(instance: RegressionInstance) → None

Calls batch_train() with a batch of size 1.

device: device = device(type='cpu')

Device on which the batch will be processed.

x_dtype: dtype = torch.float32

Data type for the input features.

y_dtype: dtype = torch.float32

Data type for the target values.