Electricity

class capymoa.datasets.Electricity

Bases: DownloadARFFGzip

Electricity is a classification problem based on the Australian New South Wales Electricity Market.

  • Number of instances: 45,312

  • Number of attributes: 8

  • Number of classes: 2 (UP, DOWN)

The Electricity data set was collected from the Australian New South Wales Electricity Market, where prices are not fixed: they are affected by the demand and supply of the market itself and are set every five minutes. The data set was described by M. Harries and analysed by Gama. It contains 45,312 instances, and each class label identifies the change of the price (2 possible classes: up or down) relative to a moving average of the last 24 hours. An important aspect of this data set is that it exhibits temporal dependencies. This version of the data set has been normalised (also known as elecNormNew) and is the one most commonly used in benchmarks.
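The labelling rule described above can be illustrated with a minimal sketch. It is not the procedure used to build the data set: the function name `label_price_changes`, the toy prices, the window of 3 standing in for "the last 24 hours", and the choice to label ties as DOWN are all assumptions made for illustration.

```python
from collections import deque
from typing import Iterable, List


def label_price_changes(prices: Iterable[float], window: int) -> List[str]:
    """Label each price UP or DOWN relative to the mean of the previous `window` prices.

    Hypothetical helper: prices seen before the window is full get no label,
    and a price equal to the moving average is labelled DOWN (an assumption).
    """
    history = deque(maxlen=window)  # keeps only the most recent `window` prices
    labels = []
    for price in prices:
        if len(history) == window:
            moving_average = sum(history) / window
            labels.append("UP" if price > moving_average else "DOWN")
        history.append(price)
    return labels


# Toy prices; a window of 3 stands in for "the last 24 hours".
toy_prices = [10.0, 12.0, 11.0, 13.0, 9.0, 12.0]
toy_labels = label_price_changes(toy_prices, window=3)
print(toy_labels)
```

Because each label depends on the preceding prices, consecutive instances are not independent, which is one way to see where the temporal dependencies come from.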

References:

  1. https://sourceforge.net/projects/moa-datastream/files/Datasets/Classification/elecNormNew.arff.zip/download/

CLI_help() -> str

Return the CLI help string for the stream.

__init__(
    directory: str = PosixPath('data'),
    auto_download: bool = True,
    CLI: str | None = None,
    schema: str | None = None,
)

download(working_directory: Path) -> Path

Download the dataset and return the path to the downloaded dataset within the working directory.

Parameters:

working_directory – The directory to download the dataset to.

Returns:

The path to the downloaded dataset within the working directory.

extract(stream_archive: Path) -> Path

Extract the dataset from the archive and return the path to the extracted dataset.

Parameters:

stream_archive – The path to the archive containing the dataset.

Returns:

The path to the extracted dataset.
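As a rough illustration of what an extract step of this shape might do, the sketch below decompresses a gzip archive with the standard library. The base class name `DownloadARFFGzip` suggests a gzip archive, but that is an assumption; `extract_gzip` is a hypothetical helper and not CapyMOA's implementation.

```python
import gzip
import shutil
import tempfile
from pathlib import Path


def extract_gzip(archive: Path) -> Path:
    """Decompress `archive` (e.g. toy.arff.gz) next to itself and return the new path."""
    target = archive.with_suffix("")  # drops the trailing ".gz"
    with gzip.open(archive, "rb") as src, open(target, "wb") as dst:
        shutil.copyfileobj(src, dst)
    return target


# Round trip on a throwaway file so the sketch runs without network access.
with tempfile.TemporaryDirectory() as tmp:
    archive_path = Path(tmp) / "toy.arff.gz"
    with gzip.open(archive_path, "wb") as f:
        f.write(b"@relation toy\n")
    extracted_path = extract_gzip(archive_path)
    extracted_name = extracted_path.name
    extracted_text = extracted_path.read_text()
```

Like the documented method, the helper takes the archive path and returns the path of the extracted file.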

get_moa_stream() -> InstanceStream | None

Get the MOA stream object if it exists.

get_path()

get_schema() -> Schema

Return the schema of the stream.

has_more_instances() -> bool

Return True if the stream has more instances to read.

next_instance() -> LabeledInstance | RegressionInstance

Return the next instance in the stream.

Raises:

ValueError – If the machine learning task is neither a regression nor a classification task.

Returns:

A labeled instance or a regression instance, depending on the schema.

restart()

Restart the stream to read instances from the beginning.
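The three methods `has_more_instances()`, `next_instance()` and `restart()` together form a simple iteration protocol. The sketch below shows the usual loop using a hypothetical stand-in class, `ToyStream`, so that it runs without CapyMOA or a download. Representing instances as `(features, label)` tuples is an assumption for illustration only; the real class returns instance objects.

```python
from typing import Dict, List, Tuple

Row = Tuple[List[float], str]


class ToyStream:
    """Hypothetical stand-in exposing the three method names documented above."""

    def __init__(self, rows: List[Row]) -> None:
        self._rows = rows
        self._position = 0

    def has_more_instances(self) -> bool:
        return self._position < len(self._rows)

    def next_instance(self) -> Row:
        row = self._rows[self._position]
        self._position += 1
        return row

    def restart(self) -> None:
        self._position = 0


def count_labels(stream: ToyStream) -> Dict[str, int]:
    """Consume the stream once and count how often each label occurs."""
    counts: Dict[str, int] = {}
    while stream.has_more_instances():
        _, label = stream.next_instance()
        counts[label] = counts.get(label, 0) + 1
    return counts


stream = ToyStream([([0.1], "UP"), ([0.2], "DOWN"), ([0.3], "UP")])
first_pass = count_labels(stream)
exhausted = not stream.has_more_instances()  # nothing left after one pass
stream.restart()  # rewind so the same instances can be read again
second_pass = count_labels(stream)
```

Without the call to `restart()`, the second pass would see an exhausted stream and return an empty count.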

to_stream(stream: Path) -> Any

Convert the dataset to a MOA stream.

Parameters:

stream – The path to the dataset.

Returns:

A MOA stream.