Hyper100k
- class capymoa.datasets.Hyper100k
Bases: _DownloadableARFF
Hyper100k is a classification problem based on the moving hyperplane generator.
Number of instances: 100,000
Number of attributes: 10
Number of classes: 2
References:
Hulten, Geoff, Laurie Spencer, and Pedro Domingos. “Mining time-changing data streams.” Proceedings of the seventh ACM SIGKDD international conference on Knowledge discovery and data mining. 2001.
- __init__(directory: str | Path = get_download_dir(), auto_download: bool = True)
Set up a stream from an ARFF file and optionally download it if missing.
- Parameters:
directory – Where downloads are stored. Defaults to capymoa.datasets.get_download_dir().
auto_download – Download the dataset if it is missing.
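The two constructor parameters can be illustrated with a minimal sketch, assuming CapyMOA is installed. `load_hyper100k` is a hypothetical helper name used only for this example and is not part of the library. The import is deferred into the function so the module itself can be loaded without CapyMOA present.

```python
from pathlib import Path


def load_hyper100k(directory: Path, auto_download: bool = True):
    """Build the Hyper100k stream, keeping downloaded files in `directory`.

    Hypothetical helper for illustration; only the `directory` and
    `auto_download` keyword arguments come from the documented signature.
    """
    # Deferred import: requires CapyMOA only when the function is called.
    from capymoa.datasets import Hyper100k

    # Make sure the target directory exists before handing it to the stream.
    directory.mkdir(parents=True, exist_ok=True)
    return Hyper100k(directory=directory, auto_download=auto_download)
```

Passing `auto_download=False` is useful when the ARFF file is already in `directory` and network access should be avoided.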
- __iter__() → Iterator[_AnyInstance]
Get an iterator over the stream.
This will NOT restart the stream if it has already been iterated over. Please use the restart() method to restart the stream.
- Yield:
An iterator over the stream.
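Because iterating does not rewind the stream, a second loop continues from wherever the first one stopped. The sketch below shows one way to read a prefix, call `restart()`, and read the same prefix again. `take` and `two_passes` are hypothetical helper names; the only stream features assumed are the iterator protocol and `restart()` described here.

```python
from itertools import islice
from typing import Any, Iterable, List, Tuple


def take(stream: Iterable[Any], n: int) -> List[Any]:
    """Consume at most `n` instances from the stream's current position."""
    return list(islice(stream, n))


def two_passes(stream: Any, n: int) -> Tuple[List[Any], List[Any]]:
    """Read `n` instances, restart the stream, then read `n` instances again."""
    first = take(stream, n)
    stream.restart()  # rewind to the beginning, as the note above recommends
    second = take(stream, n)
    return first, second
```

With a stream such as `Hyper100k()`, `two_passes(stream, 100)` would be expected to return the first 100 instances twice, whereas calling `take(stream, 100)` twice without `restart()` would return consecutive chunks.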
- __next__() → _AnyInstance
Get the next instance in the stream.
- Returns:
The next instance in the stream.
- next_instance() → _AnyInstance
Return the next instance in the stream.
- Raises:
ValueError – If the machine learning task is neither a regression nor a classification task.
- Returns:
A labeled instance or a regression instance, depending on the schema.
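As a rough usage sketch, `next_instance()` can be called in a loop to tally class labels over the first `n` instances. `label_counts` and its `label_of` callback are hypothetical names introduced for this example. How a label is read from an instance is not stated on this page, so the caller supplies that logic.

```python
from collections import Counter
from typing import Any, Callable


def label_counts(stream: Any, n: int, label_of: Callable[[Any], Any]) -> Counter:
    """Pull `n` instances with next_instance() and count their labels.

    `label_of` extracts the label from one instance; it is supplied by the
    caller because the instance attributes are not documented here.
    """
    counts: Counter = Counter()
    for _ in range(n):
        instance = stream.next_instance()
        counts[label_of(instance)] += 1
    return counts
```

For Hyper100k, `n` should not exceed the 100,000 instances in the dataset. With CapyMOA, a callback along the lines of `lambda inst: inst.y_index` is a plausible choice, but that attribute name is an assumption and should be checked against the instance documentation.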