Schema
- class capymoa.stream.Schema
Bases: object
Schema describes the structure of a stream.
It contains the attribute names, their data types, and the possible values of nominal attributes. A learner needs the schema to interpret instances correctly.
When working with datasets built into CapyMOA (see capymoa.datasets) or with ARFF files, the schema is created automatically. However, in some cases you might want to create a schema manually, which can be done with the from_custom() method.
- __init__(moa_header: InstancesHeader)
Construct a schema by wrapping an InstancesHeader. To create a schema without an InstancesHeader, use the from_custom() method.
- Parameters:
moa_header – A Java MOA header object.
- static from_custom(
    feature_names: Sequence[str],
    values_for_nominal_features: Dict[str, Sequence[str]] = {},
    values_for_class_label: Sequence[str] = None,
    dataset_name='No_Name',
    target_attribute_name=None,
    target_type=None,
  )
Create a CapyMOA Schema that defines each attribute in the stream.
The following example shows how to use this method to create a classification schema:
>>> from capymoa.stream import Schema
...
>>> Schema.from_custom(
...     feature_names=["attrib_1", "attrib_2"],
...     dataset_name="MyClassification",
...     target_attribute_name="class",
...     values_for_class_label=["yes", "no"])
@relation MyClassification
@attribute attrib_1 numeric
@attribute attrib_2 numeric
@attribute class {yes,no}
@data
The following example shows how to use this method to create a regression schema:
>>> Schema.from_custom(
...     feature_names=["attrib_1", "attrib_2"],
...     values_for_nominal_features={"attrib_1": ["a", "b"]},
...     dataset_name="MyRegression",
...     target_attribute_name="target",
...     target_type='numeric')
@relation MyRegression
@attribute attrib_1 {a,b}
@attribute attrib_2 numeric
@attribute target numeric
@data
The required information can be derived from two NumPy arrays, X[rows][features] and y[rows].
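A minimal sketch of how the arguments for from_custom() might be derived from such data, assuming X can be indexed as X[rows][features] and y as y[rows]. The helper name schema_args_from_arrays and the attrib_N naming scheme are hypothetical and not part of CapyMOA. The sketch uses only plain Python, so it accepts nested lists as well as NumPy arrays.

```python
from typing import Any, Dict, Optional, Sequence


def schema_args_from_arrays(
    X: Sequence[Sequence[Any]],
    y: Sequence[Any],
    dataset_name: str = "No_Name",
    target_attribute_name: Optional[str] = None,
) -> Dict[str, Any]:
    """Hypothetical helper: collect keyword arguments for Schema.from_custom()."""
    if len(X) == 0:
        raise ValueError("X must contain at least one row")
    if len(X) != len(y):
        raise ValueError("X and y must have the same number of rows")

    # One name per column; the naming scheme is an assumption for illustration.
    n_features = len(X[0])
    feature_names = [f"attrib_{i + 1}" for i in range(n_features)]

    # Class-label values are documented as being turned into strings,
    # so convert them here and sort for a deterministic order.
    class_labels = sorted({str(label) for label in y})

    return {
        "feature_names": feature_names,
        "values_for_class_label": class_labels,
        "dataset_name": dataset_name,
        "target_attribute_name": target_attribute_name,
    }


kwargs = schema_args_from_arrays(
    X=[[0.1, 1.5], [0.3, 2.0], [0.2, 1.1]],
    y=["yes", "no", "yes"],
    dataset_name="MyClassification",
    target_attribute_name="class",
)
```

With CapyMOA installed, the resulting dictionary could then be passed on as Schema.from_custom(**kwargs). Sorting the labels is a design choice of this sketch; pass the labels explicitly if a specific index order matters.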
- Parameters:
feature_names – A list containing the names of the features. If None, a default name is set.
values_for_nominal_features – Possible values of each nominal feature.
values_for_class_label – Possible values for class label. Values are turned into strings.
dataset_name – Name of the dataset. Default is “No_Name”.
target_attribute_name – Name of the target/class attribute. Default is None.
target_type – Set the target type to ‘categorical’ or ‘numeric’, or None to detect it automatically.
- Return CapyMOA Schema:
Initialized CapyMOA Schema which contains all necessary attribute information for all features and the class label.
- get_moa_header() → InstancesHeader
Get the Java MOA header. Useful for advanced users.
This is needed for advanced operations that are not supported by the Python wrappers (yet).
- get_value_for_index(y_index: int | None) → str | None
Return the value for the class label index y_index.
- is_y_index_in_range(y_index: int) → bool
Return True if the y_index is in the range of the class label indexes.
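The relationship between class label indexes and values can be sketched with a small stand-in class. This is not the CapyMOA implementation: the class name LabelIndex is hypothetical, and the sketch assumes that valid indexes run from 0 to the number of labels minus one and that a None index maps to None, as the signatures above suggest.

```python
from typing import List, Optional, Sequence


class LabelIndex:
    """Illustrative stand-in, NOT the CapyMOA Schema class."""

    def __init__(self, values_for_class_label: Sequence[str]) -> None:
        # Store the labels as strings, in the order they were given.
        self._labels: List[str] = [str(v) for v in values_for_class_label]

    def is_y_index_in_range(self, y_index: int) -> bool:
        # Assumption: valid indexes are 0 .. len(labels) - 1.
        return 0 <= y_index < len(self._labels)

    def get_value_for_index(self, y_index: Optional[int]) -> Optional[str]:
        # Assumption: a missing index (None) yields a missing value (None).
        if y_index is None:
            return None
        return self._labels[y_index]


labels = LabelIndex(["yes", "no"])
first_label = labels.get_value_for_index(0)
```

Checking is_y_index_in_range() before the lookup avoids an IndexError in this sketch when a learner emits an index outside the known labels.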
- property dataset_name: str
Returns the name of the dataset.
- property shape: Sequence[int]
The shape of the input x of each instance.
Usually capymoa.instance.Instance.x is a vector, but some learners need to know the shape of the input. For example, a CNN needs to know the height and width of an image.