RBFm_100k
- class capymoa.datasets.RBFm_100k
Bases: _DownloadableARFF
RBFm_100k is a synthetic classification problem based on the Radial Basis Function generator.
Number of instances: 100,000
Number of attributes: 10
generators.RandomRBFGeneratorDrift -s 1.0E-4 -c 5
This is a 100,000-instance snapshot of the synthetic Radial Basis Function (RBF) generator, which works as follows. A fixed number of random centroids is generated. Each centroid has a random position, a single standard deviation, a class label, and a weight. A new example is generated by selecting a centroid at random, weighted so that centroids with a higher weight are more likely to be chosen. A random direction is then chosen to offset the attribute values from the centroid. The length of the displacement is drawn from a Gaussian distribution whose standard deviation is determined by the chosen centroid, which also determines the class label of the example. The result is a normally distributed hypersphere of examples around each centroid, with varying densities. Only numeric attributes are generated.
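The sampling procedure described above can be sketched in a few lines of plain Python. This is an illustrative reading of the prose, not the MOA or CapyMOA implementation: the names `Centroid`, `make_centroids`, and `sample_instance` are hypothetical, the centroid count of 50 and the class count of 5 in the usage lines are assumptions, and the drift behaviour of the generator is not modelled.

```python
import math
import random
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Centroid:
    center: List[float]  # random position in attribute space
    std_dev: float       # single standard deviation for this centroid
    label: int           # class label assigned to examples drawn from it
    weight: float        # relative chance of being selected


def make_centroids(n_centroids: int, n_attributes: int, n_classes: int,
                   rng: random.Random) -> List[Centroid]:
    """Create a fixed set of random centroids."""
    return [
        Centroid(
            center=[rng.random() for _ in range(n_attributes)],
            std_dev=rng.random(),
            label=rng.randrange(n_classes),
            weight=rng.random(),
        )
        for _ in range(n_centroids)
    ]


def sample_instance(centroids: List[Centroid],
                    rng: random.Random) -> Tuple[List[float], int]:
    """Draw one example: pick a centroid by weight, then offset from it."""
    chosen = rng.choices(centroids, weights=[c.weight for c in centroids], k=1)[0]
    # Random direction: a Gaussian vector scaled to unit length.
    direction = [rng.gauss(0.0, 1.0) for _ in chosen.center]
    norm = math.sqrt(sum(d * d for d in direction)) or 1.0
    # Displacement length drawn from a Gaussian with the centroid's std dev.
    length = rng.gauss(0.0, chosen.std_dev)
    features = [c + (d / norm) * length for c, d in zip(chosen.center, direction)]
    return features, chosen.label


rng = random.Random(42)
centroids = make_centroids(n_centroids=50, n_attributes=10, n_classes=5, rng=rng)
features, label = sample_instance(centroids, rng)
```

Each call to `sample_instance` yields one numeric feature vector with 10 attributes and the class label of the centroid it was drawn from.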
- __init__(directory: str | Path = get_download_dir(), auto_download: bool = True)
Set up a stream from an ARFF file and optionally download it if missing.
- Parameters:
directory – Where downloads are stored. Defaults to capymoa.datasets.get_download_dir().
auto_download – Download the dataset if it is missing.
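As a usage sketch, the constructor can be wrapped in a small helper. The helper names `open_rbfm_100k` and `peek` are hypothetical; only the `directory` and `auto_download` parameters and the `next_instance()` method come from this page. The sketch assumes capymoa and a Java runtime are installed when `open_rbfm_100k` is actually called.

```python
from pathlib import Path


def open_rbfm_100k(directory: Path, auto_download: bool = True):
    """Build the RBFm_100k stream, storing the ARFF file under `directory`."""
    # Imported lazily so that defining this helper does not require
    # capymoa (and its Java dependency) to be installed.
    from capymoa.datasets import RBFm_100k

    return RBFm_100k(directory=directory, auto_download=auto_download)


def peek(stream, n: int = 3):
    """Collect the first `n` instances using next_instance()."""
    return [stream.next_instance() for _ in range(n)]
```

For example, `peek(open_rbfm_100k(Path("data")))` would download the dataset into a `data` directory (a placeholder path) if it is missing and return the first three instances.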
- __iter__() → Iterator[_AnyInstance]
Get an iterator over the stream.
This will NOT restart the stream if it has already been iterated over. Please use the restart() method to restart the stream.
- Yield:
An iterator over the stream.
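The non-restarting behaviour can be illustrated with a small stand-in class. `ToyStream` is a hypothetical pure-Python object that mimics the contract described here (iteration resumes where it left off, and `restart()` rewinds); it is not CapyMOA's implementation.

```python
from itertools import islice


class ToyStream:
    """Hypothetical stand-in that mimics the documented iteration contract."""

    def __init__(self, items):
        self._items = list(items)
        self._pos = 0

    def __iter__(self):
        # Returning self without resetting the position means a second
        # loop continues from where the first one stopped.
        return self

    def __next__(self):
        if self._pos >= len(self._items):
            raise StopIteration
        item = self._items[self._pos]
        self._pos += 1
        return item

    def restart(self):
        # Rewind to the first instance.
        self._pos = 0


stream = ToyStream(range(5))
first_two = list(islice(stream, 2))  # consumes two instances
remainder = list(stream)             # continues; does not start over
stream.restart()
full_pass = list(stream)             # full pass again after restart()
```

With a real stream the same pattern applies: call `restart()` before any loop that is meant to see the data from the beginning.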
- __next__() → _AnyInstance
Get the next instance in the stream.
- Returns:
The next instance in the stream.
- next_instance() → _AnyInstance
Return the next instance in the stream.
- Raises:
ValueError – If the machine learning task is neither a regression nor a classification task.
- Returns:
A labeled instance or a regression instance, depending on the schema.