TorchClassifyStream
- class capymoa.stream.TorchClassifyStream
Bases: Stream[LabeledInstance]

TorchClassifyStream turns a PyTorch dataset into a classification stream.
>>> from capymoa.evaluation import ClassificationEvaluator
>>> from capymoa.datasets import get_download_dir
>>> from capymoa.stream import TorchClassifyStream
>>> from torchvision import datasets
>>> from torchvision.transforms import ToTensor
>>> print("Using PyTorch Dataset"); pytorchDataset = datasets.FashionMNIST(
...     root=get_download_dir(),
...     train=True,
...     download=True,
...     transform=ToTensor()
... )
Using PyTorch Dataset...
>>> pytorch_stream = TorchClassifyStream(pytorchDataset, 10, class_names=pytorchDataset.classes)
>>> pytorch_stream.get_schema()
@relation PytorchDataset
@attribute attrib_0 numeric
@attribute attrib_1 numeric
...
@attribute attrib_783 numeric
@attribute class {T-shirt/top,Trouser,Pullover,Dress,Coat,Sandal,Shirt,Sneaker,Bag,'Ankle boot'}
@data
>>> pytorch_stream.next_instance()
LabeledInstance(
    Schema(PytorchDataset),
    x=[0. 0. 0. ... 0. 0. 0.],
    y_index=9,
    y_label='Ankle boot'
)
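The schema above lists 784 numeric attributes because each 28×28 FashionMNIST image ends up as a single flat feature vector. The following is a minimal pure-Python sketch of that mapping, assuming row-major flattening. `flatten_image` and `attribute_names` are hypothetical helpers written for illustration and are not part of CapyMOA.

```python
from typing import List


def flatten_image(image: List[List[float]]) -> List[float]:
    """Flatten a 2-D image (a list of rows) into one feature vector, row by row."""
    # Assumption: row-major order; the library's actual ordering is not stated in the docs.
    return [pixel for row in image for pixel in row]


def attribute_names(num_features: int) -> List[str]:
    """Generate names following the attrib_<i> pattern seen in the schema output."""
    return [f"attrib_{i}" for i in range(num_features)]


# A blank 28x28 image stands in for one FashionMNIST sample.
image = [[0.0] * 28 for _ in range(28)]
features = flatten_image(image)
names = attribute_names(len(features))

print(len(features), names[0], names[-1])  # 784 attrib_0 attrib_783
```

The attribute count is simply height times width, so a dataset with differently sized inputs would produce a schema with a different number of `attrib_<i>` entries.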
You can construct a TorchClassifyStream that visits the dataset in random order by passing shuffle=True to the constructor. The order is controlled by shuffle_seed:

>>> import torch
>>> from torch.utils.data import TensorDataset
>>> dataset = TensorDataset(
...     torch.tensor([[1], [2], [3]]), torch.tensor([0, 1, 2])
... )
>>> pytorch_stream = TorchClassifyStream(dataset=dataset, num_classes=3, shuffle=True)
>>> for instance in pytorch_stream:
...     print(instance.x)
[3]
[1]
[2]
Importantly, you can restart the stream to iterate over the dataset in the same order again:

>>> pytorch_stream.restart()
>>> for instance in pytorch_stream:
...     print(instance.x)
[3]
[1]
[2]
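The reproducible order after a restart can be understood as re-seeding the random generator before drawing the permutation. The sketch below illustrates that idea with the standard library only. `SeededShuffleStream` is a hypothetical stand-in, not CapyMOA's implementation, and it assumes each item is visited exactly once per pass.

```python
import random
from typing import Generic, List, Sequence, TypeVar

T = TypeVar("T")


class SeededShuffleStream(Generic[T]):
    """Hypothetical stand-in showing restartable, seeded shuffling (not CapyMOA code)."""

    def __init__(self, items: Sequence[T], seed: int = 0) -> None:
        self._items: List[T] = list(items)
        self._seed = seed
        self.restart()

    def restart(self) -> None:
        # Re-seeding makes the permutation identical after every restart.
        rng = random.Random(self._seed)
        self._order = list(range(len(self._items)))
        rng.shuffle(self._order)
        self._position = 0

    def __iter__(self) -> "SeededShuffleStream[T]":
        return self

    def __next__(self) -> T:
        if self._position >= len(self._order):
            raise StopIteration
        item = self._items[self._order[self._position]]
        self._position += 1
        return item


stream = SeededShuffleStream([1, 2, 3], seed=0)
first_pass = list(stream)
stream.restart()
second_pass = list(stream)

print(first_pass == second_pass)  # True
```

Storing the seed rather than the generator state keeps the restart logic trivial: every pass starts from the same generator state, so experiments that replay the stream see the instances in the same order.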
- __init__(
      dataset: Dataset[Tuple[Tensor, LongTensor]],
      num_classes: int,
      shuffle: bool = False,
      shuffle_seed: int = 0,
      class_names: Sequence[str] | None = None,
      dataset_name: str = 'PytorchDataset',
  )
Create a stream from a PyTorch dataset.
- Parameters:
dataset – A PyTorch dataset
num_classes – The number of classes in the dataset
shuffle – Randomly sample with replacement, defaults to False
shuffle_seed – Seed for shuffling, defaults to 0
class_names – The names of the classes, defaults to None
dataset_name – The name of the dataset, defaults to “PytorchDataset”
- next_instance()
Return the next instance in the stream.
- Raises:
ValueError – If the machine learning task is neither a regression nor a classification task.
- Returns:
A labeled instance or a regression instance, depending on the schema.