SGDClassifier
- class capymoa.classifier.SGDClassifier[source]
Bases: SKClassifier
Streaming stochastic gradient descent classifier.
This wraps scikit-learn's SGDClassifier for ease of use in the streaming context. Some options are missing because they are not relevant in the streaming context. Furthermore, the learning rate is constant.
>>> from capymoa.datasets import ElectricityTiny
>>> from capymoa.classifier import SGDClassifier
>>> from capymoa.evaluation import prequential_evaluation
>>> stream = ElectricityTiny()
>>> schema = stream.get_schema()
>>> learner = SGDClassifier(schema)
>>> results = prequential_evaluation(stream, learner, max_instances=1000)
>>> results["cumulative"].accuracy()
84.2
- __init__(
- schema: Schema,
- loss: Literal['hinge', 'log_loss', 'modified_huber', 'squared_hinge', 'perceptron', 'squared_error', 'huber', 'epsilon_insensitive', 'squared_epsilon_insensitive'] = 'hinge',
- penalty: Literal['l2', 'l1', 'elasticnet'] = 'l2',
- alpha: float = 0.0001,
- l1_ratio: float = 0.15,
- fit_intercept: bool = True,
- epsilon: float = 0.1,
- n_jobs: int | None = None,
- learning_rate: Literal['constant', 'optimal', 'invscaling'] = 'optimal',
- eta0: float = 0.0,
- random_seed: int | None = None,
)
Construct stochastic gradient descent classifier.
- Parameters:
schema – Describes the datastream’s structure.
loss – The loss function to be used.
penalty – The penalty (aka regularization term) to be used.
alpha – Constant that multiplies the regularization term.
l1_ratio – The Elastic Net mixing parameter, with 0 <= l1_ratio <= 1. l1_ratio=0 corresponds to L2 penalty, l1_ratio=1 to L1. Only used if penalty is 'elasticnet'. Values must be in the range [0.0, 1.0].
fit_intercept – Whether the intercept (bias) should be estimated or not. If False, the data is assumed to be already centered.
epsilon – Epsilon in the epsilon-insensitive loss functions; only if loss is 'huber', 'epsilon_insensitive', or 'squared_epsilon_insensitive'. For 'huber', determines the threshold at which it becomes less important to get the prediction exactly right. For epsilon-insensitive, any differences between the current prediction and the correct label are ignored if they are less than this threshold.
n_jobs – The number of CPUs to use to do the OVA (One Versus All, for multi-class problems) computation. Defaults to 1.
learning_rate – The learning rate schedule: 'constant', 'optimal', or 'invscaling'.
eta0 – The initial learning rate for the 'constant', 'invscaling' or 'adaptive' schedules. The default value is 0.0 as eta0 is not used by the default schedule 'optimal'.
class_weight – Weights associated with classes. If not given, all classes are supposed to have weight one. The "balanced" mode uses the values of y to automatically adjust weights inversely proportional to class frequencies in the input data as n_samples / (n_classes * np.bincount(y)).
random_seed – Seed for reproducibility.
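For example, a minimal construction sketch with several hyperparameters set explicitly (the values below are illustrative, not recommended defaults):
>>> from capymoa.datasets import ElectricityTiny
>>> from capymoa.classifier import SGDClassifier
>>> stream = ElectricityTiny()
>>> learner = SGDClassifier(
...     stream.get_schema(),
...     loss="log_loss",
...     penalty="elasticnet",
...     l1_ratio=0.5,
...     learning_rate="constant",
...     eta0=0.01,
...     random_seed=1,
... )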
- predict(instance: Instance)[source]
Predict the label of an instance.
The base implementation calls predict_proba() and returns the label with the highest probability.
- Parameters:
instance – The instance to predict the label for.
- Returns:
The predicted label or None if the classifier is unable to make a prediction.
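A minimal usage sketch (the stream helper next_instance() and the train-before-predict ordering follow the class example above; predict may return None until the model has seen at least one labeled instance):
>>> from capymoa.datasets import ElectricityTiny
>>> from capymoa.classifier import SGDClassifier
>>> stream = ElectricityTiny()
>>> learner = SGDClassifier(stream.get_schema())
>>> learner.train(stream.next_instance())
>>> prediction = learner.predict(stream.next_instance())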
- predict_proba(instance: Instance)[source]
Return probability estimates for each label.
- Parameters:
instance – The instance to estimate the probabilities for.
- Returns:
An array of probabilities for each label or None if the classifier is unable to make a prediction.
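A sketch of obtaining probability estimates. The 'log_loss' loss is chosen here because the wrapped scikit-learn estimator only exposes predict_proba for probabilistic losses such as 'log_loss' and 'modified_huber' (an assumption carried over from scikit-learn, not stated on this page):
>>> from capymoa.datasets import ElectricityTiny
>>> from capymoa.classifier import SGDClassifier
>>> stream = ElectricityTiny()
>>> learner = SGDClassifier(stream.get_schema(), loss="log_loss")
>>> learner.train(stream.next_instance())
>>> proba = learner.predict_proba(stream.next_instance())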
- train(instance: LabeledInstance)[source]
Train the classifier with a labeled instance.
- Parameters:
instance – The labeled instance to train the classifier with.
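A manual test-then-train loop sketch built from train() and predict() (instance.y_index and stream.has_more_instances() are the usual CapyMOA accessors; the accuracy bookkeeping is illustrative):
>>> from capymoa.datasets import ElectricityTiny
>>> from capymoa.classifier import SGDClassifier
>>> stream = ElectricityTiny()
>>> learner = SGDClassifier(stream.get_schema())
>>> correct = seen = 0
>>> while stream.has_more_instances() and seen < 1000:
...     instance = stream.next_instance()
...     if learner.predict(instance) == instance.y_index:
...         correct += 1  # evaluate on the instance before training on it
...     learner.train(instance)  # then learn from the same instance
...     seen += 1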
- random_seed: int
The random seed for reproducibility.
When implementing a classifier, ensure random number generators are seeded.
- sklearner: SGDClassifier
The underlying scikit-learn object.
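Because the fitted estimator is exposed, its learned state can be inspected after some training. A sketch (coef_ and intercept_ are scikit-learn attributes of the wrapped model, assumed here for illustration):
>>> from capymoa.datasets import ElectricityTiny
>>> from capymoa.classifier import SGDClassifier
>>> stream = ElectricityTiny()
>>> learner = SGDClassifier(stream.get_schema())
>>> for _ in range(100):
...     learner.train(stream.next_instance())
>>> weights = learner.sklearner.coef_  # learned weight vector(s)
>>> bias = learner.sklearner.intercept_  # learned intercept term(s)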