PassiveAggressiveClassifier

class capymoa.classifier.PassiveAggressiveClassifier

Bases: SKClassifier

Streaming Passive Aggressive Classifier

This wraps sklearn.linear_model.PassiveAggressiveClassifier for use in a streaming context. Options that are not relevant to streaming are omitted.

Online Passive-Aggressive Algorithms. K. Crammer, O. Dekel, J. Keshet, S. Shalev-Shwartz, Y. Singer. JMLR (2006).

>>> from capymoa.datasets import ElectricityTiny
>>> from capymoa.classifier import PassiveAggressiveClassifier
>>> from capymoa.evaluation import prequential_evaluation
>>> stream = ElectricityTiny()
>>> schema = stream.get_schema()
>>> learner = PassiveAggressiveClassifier(schema)
>>> results = prequential_evaluation(stream, learner, max_instances=1000)
>>> results["cumulative"].accuracy()
84.3

__init__(schema: Schema, max_step_size: float = 1.0, fit_intercept: bool = True, loss: str = 'hinge', n_jobs: int | None = None, class_weight: Dict[int, float] | None | Literal['balanced'] = None, average: bool = False, random_seed=1)

Construct a passive aggressive classifier.

Parameters:

schema – Stream schema
max_step_size – Maximum step size (regularization).
fit_intercept – Whether the intercept should be estimated or not. If False, the data is assumed to be already centered.
loss – The loss function to be used:
hinge: equivalent to PA-I in the reference paper.
squared_hinge: equivalent to PA-II in the reference paper.
n_jobs – The number of CPUs to use to do the OVA (One Versus All, for multi-class problems) computation. None means 1 unless in a joblib.parallel_backend context. -1 means using all processors.
class_weight – Preset for the sklearner.class_weight fit parameter.

Weights associated with classes. If not given, all classes are assumed to have weight one.

The “balanced” mode uses the values of y to automatically adjust weights inversely proportional to class frequencies in the input data as n_samples / (n_classes * np.bincount(y)).
average – When set to True, computes the averaged SGD weights and stores the result in the sklearner.coef_ attribute. If set to an int greater than 1, averaging will begin once the total number of samples seen reaches average. So average=10 will begin averaging after seeing 10 samples.
random_seed – Seed for the random number generator.
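
The max_step_size and loss parameters correspond to quantities in the update rule of the reference paper. The following is a minimal sketch of a single binary PA-I / PA-II update in plain Python. It assumes labels in {-1, +1}, no intercept, and that max_step_size plays the role of the paper's aggressiveness parameter C. The names pa_step_size and pa_update are illustrative and are not part of capymoa or scikit-learn; the wrapped scikit-learn estimator may differ in detail.

```python
from typing import List, Sequence


def pa_step_size(hinge_loss: float, sq_norm: float,
                 max_step_size: float = 1.0, loss: str = "hinge") -> float:
    """Step size tau for one passive-aggressive update (illustrative)."""
    if loss not in ("hinge", "squared_hinge"):
        raise ValueError(f"unknown loss: {loss!r}")
    if hinge_loss == 0.0 or sq_norm == 0.0:
        # "Passive" case: the margin is already satisfied, so nothing changes.
        return 0.0
    if loss == "hinge":
        # PA-I: the step is capped at max_step_size.
        return min(max_step_size, hinge_loss / sq_norm)
    # PA-II: the step is damped smoothly instead of being capped.
    return hinge_loss / (sq_norm + 1.0 / (2.0 * max_step_size))


def pa_update(weights: Sequence[float], x: Sequence[float], y: int,
              max_step_size: float = 1.0, loss: str = "hinge") -> List[float]:
    """Return the weights after one update on (x, y), with y in {-1, +1}."""
    margin = y * sum(w * xi for w, xi in zip(weights, x))
    hinge_loss = max(0.0, 1.0 - margin)
    sq_norm = sum(xi * xi for xi in x)
    tau = pa_step_size(hinge_loss, sq_norm, max_step_size, loss)
    # "Aggressive" case: move just far enough along y * x to reduce the loss.
    return [w + tau * y * xi for w, xi in zip(weights, x)]


if __name__ == "__main__":
    print(pa_update([0.0, 0.0], [1.0, 2.0], 1))                      # PA-I
    print(pa_update([0.0, 0.0], [1.0, 2.0], 1, max_step_size=0.1))   # capped
    print(pa_update([0.0, 0.0], [1.0, 2.0], 1, loss="squared_hinge"))  # PA-II
```

A smaller max_step_size limits how far a single instance can move the weights, which is why the parameter acts as regularization.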

predict(instance: Instance)

Predict the label of an instance.

The base implementation calls predict_proba() and returns the label with the highest probability.

Parameters: instance – The instance to predict the label for.
Returns: The predicted label or None if the classifier is unable to make a prediction.
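
The relationship between predict() and predict_proba() described above can be sketched in plain Python. This is an illustration of the stated behaviour, not the library's code: the helper name label_from_probabilities is hypothetical, and the handling of empty input and ties (first maximum wins) is an assumption.

```python
from typing import Optional, Sequence


def label_from_probabilities(
    probabilities: Optional[Sequence[float]],
) -> Optional[int]:
    """Pick the index of the most probable label, or None if unavailable."""
    if probabilities is None or len(probabilities) == 0:
        # Mirrors the documented "unable to make a prediction" case.
        return None
    # Index of the largest probability; ties resolve to the first maximum.
    return max(range(len(probabilities)), key=lambda i: probabilities[i])


if __name__ == "__main__":
    print(label_from_probabilities([0.1, 0.7, 0.2]))
    print(label_from_probabilities(None))
```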

predict_proba(instance: Instance)

Return probability estimates for each label.

Parameters: instance – The instance to estimate the probabilities for.
Returns: An array of probabilities for each label or None if the classifier is unable to make a prediction.

train(instance: LabeledInstance)

Train the classifier with a labeled instance.

Parameters: instance – The labeled instance to train the classifier with.
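
In a streaming setting, predict() and train() are typically interleaved: each instance is first used for testing and only then for training, which is the idea behind prequential evaluation. The sketch below shows that loop in plain Python. Everything in it is a stand-in chosen for illustration: the Labeled tuple, the MajorityClassStandIn learner, and prequential_accuracy are hypothetical and are not capymoa classes or functions; they only mimic the predict/train method names documented here.

```python
from collections import Counter, namedtuple
from typing import Iterable, Optional

# Hypothetical labeled instance: a feature vector x and a class index y.
Labeled = namedtuple("Labeled", ["x", "y"])


class MajorityClassStandIn:
    """Toy learner with predict/train methods (not a capymoa class)."""

    def __init__(self) -> None:
        self.counts: Counter = Counter()

    def predict(self, instance: Labeled) -> Optional[int]:
        # No prediction is possible before any training instance is seen.
        if not self.counts:
            return None
        return self.counts.most_common(1)[0][0]

    def train(self, instance: Labeled) -> None:
        self.counts[instance.y] += 1


def prequential_accuracy(learner, stream: Iterable[Labeled]) -> float:
    """Test-then-train loop; returns accuracy as a percentage."""
    correct = 0
    total = 0
    for instance in stream:
        # Test first, so the learner never sees the label in advance.
        if learner.predict(instance) == instance.y:
            correct += 1
        # Then train on the same instance.
        learner.train(instance)
        total += 1
    return 100.0 * correct / total if total else 0.0


if __name__ == "__main__":
    stream = [Labeled([0.0], 0), Labeled([0.1], 0),
              Labeled([0.9], 1), Labeled([0.2], 0)]
    print(prequential_accuracy(MajorityClassStandIn(), stream))
```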

random_seed: int

The random seed for reproducibility.

When implementing a classifier, ensure that random number generators are seeded.

schema: Schema

The schema representing the instances.

sklearner: PassiveAggressiveClassifier

The underlying scikit-learn object. See: sklearn.linear_model.PassiveAggressiveClassifier
