TorchClassifyStream

class capymoa.stream.TorchClassifyStream

Bases: Stream[LabeledInstance]

TorchClassifyStream turns a PyTorch dataset into a classification stream.

>>> from capymoa.evaluation import ClassificationEvaluator
...
>>> from capymoa.datasets import get_download_dir
>>> from capymoa.stream import TorchClassifyStream
>>> from torchvision import datasets
>>> from torchvision.transforms import ToTensor
>>> print("Using PyTorch Dataset"); pytorchDataset = datasets.FashionMNIST( 
...     root=get_download_dir(),
...     train=True,
...     download=True,
...     transform=ToTensor()
... )
Using PyTorch Dataset...
>>> pytorch_stream = TorchClassifyStream(pytorchDataset, 10, class_names=pytorchDataset.classes)
>>> pytorch_stream.get_schema()
@relation PytorchDataset

@attribute attrib_0 numeric
@attribute attrib_1 numeric
...
@attribute attrib_783 numeric
@attribute class {T-shirt/top,Trouser,Pullover,Dress,Coat,Sandal,Shirt,Sneaker,Bag,'Ankle boot'}

@data
>>> pytorch_stream.next_instance()
LabeledInstance(
    Schema(PytorchDataset),
    x=[0. 0. 0. ... 0. 0. 0.],
    y_index=9,
    y_label='Ankle boot'
)

You can read the dataset in a random order by passing shuffle=True to the constructor. The order is controlled by shuffle_seed:

>>> import torch
>>> from torch.utils.data import TensorDataset
>>> dataset = TensorDataset(
...     torch.tensor([[1], [2], [3]]), torch.tensor([0, 1, 2])
... )
>>> pytorch_stream = TorchClassifyStream(dataset=dataset, num_classes=3, shuffle=True)
>>> for instance in pytorch_stream:
...     print(instance.x)
[3]
[1]
[2]

Importantly, restarting the stream lets you iterate over the dataset again in the same order:

>>> pytorch_stream.restart()
>>> for instance in pytorch_stream:
...     print(instance.x)
[3]
[1]
[2]
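
The repeatable order shown above is what one would expect if the shuffle order is derived from shuffle_seed each time the stream starts. The following is a minimal pure-Python sketch of that idea, not capymoa's implementation: ToyShuffledStream is a hypothetical stand-in, and reseeding inside restart() is an assumption made for illustration.

```python
import random


class ToyShuffledStream:
    """Hypothetical stand-in that replays one seeded permutation per pass."""

    def __init__(self, items, shuffle_seed=0):
        self._items = list(items)
        self._shuffle_seed = shuffle_seed
        self.restart()

    def restart(self):
        # Assumption: rebuilding the generator from the same seed yields
        # the same permutation, so every pass visits items in the same order.
        rng = random.Random(self._shuffle_seed)
        order = list(range(len(self._items)))
        rng.shuffle(order)
        self._order = iter(order)

    def __iter__(self):
        return self

    def __next__(self):
        # StopIteration from the inner iterator ends the pass.
        return self._items[next(self._order)]


stream = ToyShuffledStream([1, 2, 3], shuffle_seed=0)
first_pass = list(stream)
stream.restart()
second_pass = list(stream)
```

Here first_pass and second_pass hold the same permutation of the items. A different shuffle_seed would generally produce a different order.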

CLI_help() → str: Return a help message.

__init__(dataset: Dataset[Tuple[Tensor, LongTensor]], num_classes: int, shuffle: bool = False, shuffle_seed: int = 0, class_names: Sequence[str] | None = None, dataset_name: str = 'PytorchDataset')

Create a stream from a PyTorch dataset.

Parameters:

dataset – A PyTorch dataset
num_classes – The number of classes in the dataset
shuffle – Randomly sample with replacement, defaults to False
shuffle_seed – Seed for shuffling, defaults to 0
class_names – The names of the classes, defaults to None
dataset_name – The name of the dataset, defaults to “PytorchDataset”

__iter__() → Iterator[_AnyInstance]

Get an iterator over the stream.

This will NOT restart the stream if it has already been iterated over. Please use the restart() method to restart the stream.

Yield: An iterator over the stream.
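
Because iterating does not restart the stream, two consecutive loops over the same stream share one position. A small sketch of that behaviour follows, using a hypothetical ToyStream stand-in (not a capymoa class) whose __iter__ returns the object itself:

```python
from itertools import islice


class ToyStream:
    """Hypothetical stand-in: an iterator with an explicit restart()."""

    def __init__(self, items):
        self._items = list(items)
        self._position = 0

    def __iter__(self):
        # Returning self means a new for-loop resumes; it does not rewind.
        return self

    def __next__(self):
        if self._position >= len(self._items):
            raise StopIteration
        item = self._items[self._position]
        self._position += 1
        return item

    def restart(self):
        self._position = 0


stream = ToyStream(["a", "b", "c", "d"])
head = list(islice(stream, 2))  # consumes the first two items
rest = list(stream)             # continues where the first read stopped
stream.restart()
again = list(stream)            # a full pass is only available after restart()
```

With a real stream the same pattern applies: call restart() before any loop that is meant to see the data from the beginning.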

__next__() → _AnyInstance

Get the next instance in the stream.

Returns: The next instance in the stream.

get_moa_stream(): Get the MOA stream object if it exists.

get_schema(): Return the schema of the stream.

has_more_instances(): Return True if the stream has more instances to read.

next_instance()

Return the next instance in the stream.

Raises: ValueError – If the machine learning task is neither a regression nor a classification task.
Returns: A labeled instance or a regression instance, depending on the schema.
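
As an alternative to a for-loop, the stream can be read explicitly with has_more_instances() and next_instance(). The sketch below counts class labels this way. count_labels, FakeInstance and FakeStream are hypothetical helpers written for this example. The only things assumed from the library are the two methods above and the y_index field shown in the LabeledInstance output earlier.

```python
from collections import Counter, namedtuple


def count_labels(stream, max_instances=None):
    """Count y_index values read from any object with the stream methods."""
    counts = Counter()
    seen = 0
    while stream.has_more_instances():
        if max_instances is not None and seen >= max_instances:
            break
        instance = stream.next_instance()
        counts[instance.y_index] += 1
        seen += 1
    return counts


# Hypothetical stand-ins so the sketch runs without PyTorch or capymoa.
FakeInstance = namedtuple("FakeInstance", ["x", "y_index"])


class FakeStream:
    def __init__(self, labels):
        self._instances = [
            FakeInstance(x=[i], y_index=y) for i, y in enumerate(labels)
        ]
        self._position = 0

    def has_more_instances(self):
        return self._position < len(self._instances)

    def next_instance(self):
        instance = self._instances[self._position]
        self._position += 1
        return instance


label_counts = count_labels(FakeStream([0, 1, 1, 2, 1]))
```

Since count_labels relies only on the documented method names, it should in principle accept a TorchClassifyStream in place of FakeStream, though that has not been verified here.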

restart(): Restart the stream to read instances from the beginning.
