datasets#

CapyMOA comes with some datasets ‘out of the box’. Simply import a dataset and start using it; the data is downloaded automatically if it is not already present in the download directory. You can configure where datasets are downloaded by setting an environment variable (see capymoa.env).

>>> from capymoa.datasets import ElectricityTiny
>>> stream = ElectricityTiny()
>>> stream.next_instance().x
array([0.      , 0.056443, 0.439155, 0.003467, 0.422915, 0.414912])

Alternatively, you may download the datasets all at once with the command line interface provided by capymoa.datasets:

python -m capymoa.datasets --help

Modules#

downloader

Classes#

`Bike`	Bike is a regression dataset built from bike share information.
`CovtFD`	CovtFD is an adaptation from the classic `Covtype` classification problem with added feature drifts.
`Covtype`	The classic covertype (covtype) classification problem.
`CovtypeNorm`	A normalized version of the classic `Covtype` classification problem.
`CovtypeTiny`	A truncated version of the classic `Covtype` classification problem.
`Electricity`	Electricity is a classification problem based on the Australian New South Wales Electricity Market.
`ElectricityTiny`	A truncated version of the Electricity dataset with 1000 instances.
`Fried`	Fried is a regression problem based on the Friedman dataset.
`FriedTiny`	A truncated version of the Friedman regression problem with 1000 instances.
`Hyper100k`	Hyper100k is a classification problem based on the moving hyperplane generator.
`RBFm_100k`	RBFm_100k is a synthetic classification problem based on the Radial Basis Function generator.
`RTG_2abrupt`	RTG_2abrupt is a synthetic classification problem based on the Random Tree generator with 2 abrupt drifts.
`Sensor`	Sensor stream is a classification problem based on indoor sensor data.

Functions#

capymoa.datasets.get_download_dir(download_dir: str | None = None) → Path

Get a directory where datasets should be downloaded to.

The download directory is determined by the following steps:

1. If the download_dir parameter is provided, use that.
2. If the CAPYMOA_DATASETS_DIR environment variable is set, use that.
3. Otherwise, use the default download directory: ./data.

Parameters: download_dir – Override the download directory.
Returns: The download directory.
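
The resolution order described above can be sketched in a few lines of standard-library Python. This is only an illustration of the documented steps, not CapyMOA's actual implementation: the function name `resolve_download_dir` is hypothetical, and treating an empty environment variable as "set" is an assumption.

```python
import os
from pathlib import Path
from typing import Optional


def resolve_download_dir(download_dir: Optional[str] = None) -> Path:
    """Hypothetical sketch of the documented lookup order (not the library code)."""
    # Step 1: an explicit argument takes priority.
    if download_dir is not None:
        return Path(download_dir)
    # Step 2: fall back to the CAPYMOA_DATASETS_DIR environment variable.
    env_dir = os.environ.get("CAPYMOA_DATASETS_DIR")
    if env_dir is not None:
        return Path(env_dir)
    # Step 3: otherwise use the documented default, ./data.
    return Path("./data")
```

With this sketch, an explicit argument overrides the environment variable, which in turn overrides the `./data` default, matching the order listed in the documentation.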
