DownloadARFFGzip

class capymoa.datasets.downloader.DownloadARFFGzip

__init__(directory: str = get_download_dir(), auto_download: bool = True, CLI: str | None = None, schema: str | None = None)

Construct a Stream from a MOA stream object.

Usually, you will want to construct a Stream using the capymoa.stream.stream_from_file() function.

Parameters:

moa_stream – The MOA stream object to read instances from. This is None if the stream is created from a numpy array.
schema – The schema of the stream. If None, the schema is inferred from the moa_stream.
CLI – Additional command line arguments to pass to the MOA stream.

Raises:

__iter__() → Iterator[_AnyInstance]

Get an iterator over the stream.

This will NOT restart the stream if it has already been iterated over. Please use the restart() method to restart the stream.
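The iteration behavior described above can be illustrated with a small stand-in class. `ToyStream` is a hypothetical name used only for this sketch; it mimics the documented contract (iterating does not rewind, `restart()` does) and is not the CapyMOA implementation.

```python
from typing import Iterator, List


class ToyStream:
    """Hypothetical stand-in that mimics the documented iterator contract."""

    def __init__(self, items: List[int]) -> None:
        self._items = items
        self._pos = 0

    def __iter__(self) -> Iterator[int]:
        # Return self without resetting the read position.
        return self

    def __next__(self) -> int:
        if self._pos >= len(self._items):
            raise StopIteration
        item = self._items[self._pos]
        self._pos += 1
        return item

    def restart(self) -> None:
        # Rewind to the first instance.
        self._pos = 0


stream = ToyStream([1, 2, 3])
first_pass = list(stream)   # consumes the stream
second_pass = list(stream)  # already exhausted, so nothing is yielded
stream.restart()
third_pass = list(stream)   # reads from the beginning again
```

In this sketch, a second pass over the same object yields nothing until `restart()` is called, which is the pitfall the note above warns about.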

__next__() → _AnyInstance

Get the next instance in the stream.

download(working_directory: Path) → Path

Download the dataset and return the path to the downloaded dataset within the working directory.

Parameters: working_directory – The directory to download the dataset to.
Returns: The path to the downloaded dataset within the working directory.

extract(stream_archive: Path) → Path

Extract the dataset from the archive and return the path to the extracted dataset.

Parameters: stream_archive – The path to the archive containing the dataset.
Returns: The path to the extracted dataset.
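As a rough illustration of what an extraction step for a gzip-compressed ARFF file could look like, the sketch below uses only the Python standard library. The function name `extract_gzip` and the choice to write the output next to the archive are assumptions made for this example; the actual method may behave differently.

```python
import gzip
import shutil
import tempfile
from pathlib import Path


def extract_gzip(stream_archive: Path) -> Path:
    """Decompress `name.arff.gz` next to itself and return the path `name.arff`."""
    extracted = stream_archive.with_suffix("")  # drops the trailing ".gz"
    with gzip.open(stream_archive, "rb") as src, open(extracted, "wb") as dst:
        shutil.copyfileobj(src, dst)  # stream the bytes instead of loading them all
    return extracted


# Usage with a tiny archive created on the fly in a temporary directory.
workdir = Path(tempfile.mkdtemp())
archive = workdir / "toy.arff.gz"
with gzip.open(archive, "wt") as f:
    f.write("@relation toy\n@attribute x numeric\n@data\n1.0\n")

extracted_path = extract_gzip(archive)
```

Copying with `shutil.copyfileobj` keeps memory use flat for large datasets, which is why it is used here instead of reading the whole archive at once.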

get_moa_stream() → InstanceStream | None

Get the MOA stream object if it exists.

has_more_instances() → bool

Return True if the stream has more instances to read.

next_instance() → _AnyInstance

Return the next instance in the stream.

Raises: ValueError – If the machine learning task is neither a regression nor a classification task.
Returns: A labeled instance or a regression instance, depending on the schema.

restart()

Restart the stream to read instances from the beginning.
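The three methods above are typically combined in an explicit read loop. The sketch below shows that pattern on a hypothetical `ToyStream` stand-in with a hypothetical helper `take`; neither is part of CapyMOA, and the instances are plain floats purely for illustration.

```python
from typing import List


class ToyStream:
    """Hypothetical stand-in exposing the three methods documented above."""

    def __init__(self, items: List[float]) -> None:
        self._items = items
        self._pos = 0

    def has_more_instances(self) -> bool:
        return self._pos < len(self._items)

    def next_instance(self) -> float:
        item = self._items[self._pos]
        self._pos += 1
        return item

    def restart(self) -> None:
        self._pos = 0


def take(stream: ToyStream, max_instances: int) -> List[float]:
    """Read at most `max_instances` instances, stopping early if the stream ends."""
    seen: List[float] = []
    while stream.has_more_instances() and len(seen) < max_instances:
        seen.append(stream.next_instance())
    return seen


stream = ToyStream([0.1, 0.2, 0.3])
head = take(stream, 2)    # first two instances
rest = take(stream, 10)   # continues where the previous read stopped
stream.restart()
again = take(stream, 2)   # same instances as `head` after the restart
```

Checking `has_more_instances()` before each `next_instance()` call lets the loop stop cleanly at the end of the stream, and capping the count is a common way to limit an experiment to a prefix of the data.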

to_stream(stream: Path) → Any

Convert the dataset to a MOA stream.