DomainCIFAR100#
- class capymoa.ocl.datasets.DomainCIFAR100[source]#
Bases:
_TorchVisionDownload,_BuiltInCIScenarioDomain incremental CIFAR-100 variant with 20 classes per task.
This dataset has exactly 5 tasks. Each task contains one fine-grained class from each CIFAR-100 superclass (20 classes per task), while labels are remapped to the 20 superclass IDs. For example, the “flowers” superclass contains various types of flowers.
Note that the groupings are subjective based on the original CIFAR-100’s coarse labels.
References:
Krizhevsky, A. (2009). Learning Multiple Layers of Features from Tiny Images.
- __init__(
- shuffle_data: bool = True,
- seed: int = 0,
- directory: Path = get_download_dir(),
- auto_download: bool = True,
- train_transform: Callable[[Any], Tensor] | None = None,
- test_transform: Callable[[Any], Tensor] | None = None,
- normalize_features: bool = False,
Create the DomainCIFAR100 scenario.
This scenario always uses 5 tasks and 20 superclass labels. Each task contains one fine-grained class from each superclass.
- Parameters:
shuffle_data – If True, shuffles class order within each superclass before forming tasks, and shuffles samples within each task for training.
seed – Random seed for reproducible shuffling.
directory – Directory where CIFAR-100 is stored/downloaded.
auto_download – If True, downloads CIFAR-100 when missing.
train_transform – Optional transform applied to training images.
test_transform – Optional transform applied to test images.
normalize_features – If True, applies dataset normalization after the provided transforms.
- test_loaders(
- batch_size: int,
- **kwargs: Any,
Get the training streams for the scenario.
- Parameters:
batch_size – Collects vectors in batches of this size.
kwargs – Additional keyword arguments to pass to the DataLoader.
- Returns:
A data loader for each task.
- train_loaders(
- batch_size: int,
- shuffle: bool = False,
- **kwargs: Any,
Get the training streams for the scenario.
The order of the tasks is fixed and does not change between iterations. The datasets themselves are shuffled in
__init__()if shuffle_data is set to True. This is because the order of data is important in online learning since the learner can only see each example once.
- Parameters:
batch_size – Collects vectors in batches of this size.
kwargs – Additional keyword arguments to pass to the DataLoader.
- Returns:
A data loader for each task.
- classes = ['aquatic_mammals', 'fish', 'flowers', 'food_containers', 'fruit_and_vegetables', 'household_electrical_devices', 'household_furniture', 'insects', 'large_carnivores', 'large_man-made_outdoor_things', 'large_natural_outdoor_scenes', 'large_omnivores_and_herbivores', 'medium_mammals', 'non-insect_invertebrates', 'people', 'reptiles', 'small_mammals', 'trees', 'vehicles_1', 'vehicles_2']#
The 20 superclasses of CIFAR-100, which are used as the labels in this scenario.
- default_task_count: int = 5#
The default number of tasks in the dataset.
- default_test_transform: Callable[[Any], Tensor] | None = ToTensor()#
The default transform to apply to the dataset.
- default_train_transform: Callable[[Any], Tensor] | None = ToTensor()#
The default transform to apply to the dataset.
- mean: Sequence[float] | None = [0.507, 0.487, 0.441]#
The mean of the features in the dataset used for normalization.
- num_classes: int = 20#
The number of classes in the dataset.
- shape: Sequence[int] = [3, 32, 32]#
The shape of each input example.
- std: Sequence[float] | None = [0.267, 0.256, 0.276]#
The standard deviation of the features in the dataset used for normalization.
- stream: Stream[LabeledInstance]#
Stream containing each task in sequence.
- task_mask: Tensor#
A mask for the output for each task of shape (num_tasks, num_classes)
- task_schedule: Sequence[Set[int]]#
A sequence of sets containing the classes for each task.
In online continual learning your learner may not have access to this attribute. It is provided for evaluation and debugging.