Hyper100k

class capymoa.datasets.Hyper100k

Bases: _DownloadableARFF

Hyper100k is a classification problem based on the moving hyperplane generator.

References:

Hulten, Geoff, Laurie Spencer, and Pedro Domingos. “Mining time-changing data streams.” Proceedings of the seventh ACM SIGKDD international conference on Knowledge discovery and data mining. 2001.

__init__(directory: str | Path = get_download_dir(), auto_download: bool = True)

Set up a stream from an ARFF file and optionally download it if missing.

Parameters:

directory – Where downloads are stored. Defaults to capymoa.datasets.get_download_dir().
auto_download – Download the dataset if it is missing.
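
A minimal sketch of constructing the stream with a custom download location, assuming CapyMOA is installed. The helper name `open_hyper100k` is hypothetical and not part of CapyMOA; only `Hyper100k`, `directory`, and `auto_download` come from the signature above. The import sits inside the function so that defining the helper does not itself import CapyMOA.

```python
from pathlib import Path


def open_hyper100k(directory: Path, auto_download: bool = True):
    """Construct a Hyper100k stream that stores its files in `directory`.

    Hypothetical convenience wrapper; it only forwards the two documented
    constructor parameters.
    """
    # Deferred import: CapyMOA is only needed when the helper is called.
    from capymoa.datasets import Hyper100k

    return Hyper100k(directory=directory, auto_download=auto_download)
```

With `auto_download=False`, the dataset is expected to be present in `directory` already, since nothing will be fetched.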

__iter__() → Iterator[_AnyInstance]

Get an iterator over the stream.

This will NOT restart the stream if it has already been iterated over. Please use the restart() method to restart the stream.
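
The non-restarting behaviour described above can be sketched as follows. The helpers `take` and `first_n_twice` are hypothetical names used for illustration; they rely only on the stream being iterable and exposing `restart()`, so any stream object with that interface (for example a `Hyper100k` instance) could be passed in.

```python
from itertools import islice


def take(stream, n):
    """Return up to `n` instances, starting at the stream's current position."""
    # islice calls iter(stream), which does not rewind the stream.
    return list(islice(stream, n))


def first_n_twice(stream, n):
    """Read `n` instances, rewind with restart(), then read `n` instances again."""
    first = take(stream, n)
    # Without this call, the second take() would continue after instance n.
    stream.restart()
    second = take(stream, n)
    return first, second
```

Calling `take(stream, 3)` twice in a row yields six consecutive instances in total, whereas `first_n_twice(stream, 3)` reads from the same starting position both times.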

__next__() → _AnyInstance

Get the next instance in the stream.

get_moa_stream() → InstanceStream | None

Get the MOA stream object if it exists.

has_more_instances() → bool

Return True if the stream has more instances to read.

next_instance() → _AnyInstance

Return the next instance in the stream.

Raises:

ValueError – If the machine learning task is neither a regression nor a classification task.

Returns:

A labeled instance or a regression instance, depending on the schema.
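
As an alternative to the iterator protocol, the stream can be consumed with an explicit loop over `has_more_instances()` and `next_instance()`. The sketch below assumes only those two documented methods; `count_instances` and its `limit` parameter are hypothetical and not part of CapyMOA.

```python
def count_instances(stream, limit=None):
    """Consume the stream from its current position and count the instances read.

    Stops early once `limit` instances have been read, if a limit is given.
    """
    count = 0
    while stream.has_more_instances() and (limit is None or count < limit):
        stream.next_instance()  # the instance itself is discarded in this sketch
        count += 1
    return count
```

Because the loop advances the stream, a later pass over the same object needs a call to `restart()` first.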

restart()

Restart the stream to read instances from the beginning.

classmethod to_stream(path: Path) → InstanceStream

Convert the downloaded and unpacked dataset into a data stream.