Schema
- class capymoa.stream.Schema
Bases: object
Schema describes the structure of a stream.
It contains the attribute names, their data types, and the possible values of nominal attributes. A learner relies on the schema to interpret instances correctly.
When working with datasets built into CapyMOA (see capymoa.datasets) and ARFF files, the schema is created automatically. However, in some cases you might want to create a schema manually. This can be done using the from_custom() method.
- __init__(moa_header: InstancesHeader)
Construct a schema by wrapping an InstancesHeader.
To create a schema without an InstancesHeader, use the from_custom() method.
- Parameters:
moa_header – A Java MOA header object.
- get_value_for_index(y_index: int | None) -> str | None
Return the value for the class label index y_index.
- get_moa_header() -> InstancesHeader
Get the Java MOA header.
This is needed for advanced operations that the Python wrappers do not support yet.
- is_y_index_in_range(y_index: int) -> bool
Return True if the y_index is in the range of the class label indexes.
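The two index helpers above can be combined to turn a predicted class index into its label safely. The following is a minimal sketch, not part of CapyMOA: `label_for_index` is a hypothetical helper, and `_StubSchema` is a hypothetical stand-in that only mimics the two documented methods so the example runs without a JVM. With a real Schema, pass the schema object instead of the stub.

```python
from typing import Optional


def label_for_index(schema, y_index: Optional[int]) -> Optional[str]:
    """Map a class index to its label, or None when no valid index is given."""
    # is_y_index_in_range() is documented as taking an int, so None is handled first.
    if y_index is None or not schema.is_y_index_in_range(y_index):
        return None
    return schema.get_value_for_index(y_index)


class _StubSchema:
    """Hypothetical stand-in, NOT part of CapyMOA; it only mimics the two methods."""

    def __init__(self, labels):
        self._labels = list(labels)

    def is_y_index_in_range(self, y_index: int) -> bool:
        return 0 <= y_index < len(self._labels)

    def get_value_for_index(self, y_index):
        return None if y_index is None else self._labels[y_index]


stub = _StubSchema(["yes", "no"])
print(label_for_index(stub, 1))  # no
print(label_for_index(stub, 5))  # None
```

Checking the range first avoids relying on how get_value_for_index() behaves for an out-of-range index, which this page does not specify.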
- property dataset_name: str
Returns the name of the dataset.
- static from_custom(
    feature_names: Sequence[str],
    values_for_nominal_features: Dict[str, Sequence[str]] = {},
    values_for_class_label: Sequence[str] = None,
    dataset_name='No_Name',
    target_attribute_name=None,
    target_type=None,
  )
Create a CapyMOA Schema that defines each attribute in the stream.
The following example shows how to use this method to create a classification schema:
>>> from capymoa.stream import Schema
...
>>> Schema.from_custom(
...     feature_names=["attrib_1", "attrib_2"],
...     dataset_name="MyClassification",
...     target_attribute_name="class",
...     values_for_class_label=["yes", "no"])
@relation MyClassification
@attribute attrib_1 numeric
@attribute attrib_2 numeric
@attribute class {yes,no}
@data
The following example shows how to use this method to create a regression schema:
>>> Schema.from_custom(
...     feature_names=["attrib_1", "attrib_2"],
...     values_for_nominal_features={"attrib_1": ["a", "b"]},
...     dataset_name="MyRegression",
...     target_attribute_name="target",
...     target_type='numeric')
@relation MyRegression
@attribute attrib_1 {a,b}
@attribute attrib_2 numeric
@attribute target numeric
@data
Sample code to get relevant information from two NumPy arrays: X[rows][features] and y[rows]
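One way to derive the from_custom() arguments from such arrays is sketched below. This is an illustration under stated assumptions, not the library's own sample: `schema_kwargs_from_arrays` and `build_schema` are hypothetical helpers, the generated names `attrib_0`, `attrib_1`, ... and the dataset name `FromArrays` are made up, and sorting the class labels is an arbitrary choice that fixes their index order. Nested lists keep the sketch dependency-free; NumPy arrays of the same shape should work the same way, since only indexing and iteration are used.

```python
from typing import Any, Dict, Sequence


def schema_kwargs_from_arrays(
    X: Sequence[Sequence[Any]],
    y: Sequence[Any],
    dataset_name: str = "FromArrays",  # hypothetical default name
) -> Dict[str, Any]:
    """Collect keyword arguments for Schema.from_custom() from X[rows][features], y[rows]."""
    n_features = len(X[0])  # number of columns in the first row
    # Generated feature names are an assumption; use real column names if available.
    feature_names = [f"attrib_{i}" for i in range(n_features)]
    # Labels are converted to strings and sorted so the class order is deterministic.
    class_labels = sorted({str(label) for label in y})
    return {
        "feature_names": feature_names,
        "values_for_class_label": class_labels,
        "dataset_name": dataset_name,
        "target_attribute_name": "class",
    }


def build_schema(X, y):
    """Create the schema; imported lazily because CapyMOA needs a Java runtime."""
    from capymoa.stream import Schema

    return Schema.from_custom(**schema_kwargs_from_arrays(X, y))


X = [[0.1, 1.0], [0.4, 0.2], [0.9, 0.7]]
y = ["yes", "no", "yes"]
kwargs = schema_kwargs_from_arrays(X, y)
print(kwargs["feature_names"])           # ['attrib_0', 'attrib_1']
print(kwargs["values_for_class_label"])  # ['no', 'yes']
```

With CapyMOA installed, `build_schema(X, y)` passes these arguments to Schema.from_custom(). All features are treated as numeric here because values_for_nominal_features is left at its default.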
- Parameters:
feature_names – A list containing the names of the features. If None, default names are set.
values_for_nominal_features – Possible values of each nominal feature.
values_for_class_label – Possible values for class label. Values are turned into strings.
dataset_name – Name of the dataset. Default is “No_Name”.
target_attribute_name – Name of the target/class attribute. Default is None.
target_type – The target type, either ‘categorical’ or ‘numeric’; None to detect it automatically.
- Return CapyMOA Schema:
Initialized CapyMOA Schema which contains all necessary attribute information for all features and the class label.