SGDClassifier#
- class capymoa.classifier.SGDClassifier[source]#
Bases: SKClassifier
Streaming stochastic gradient descent classifier.
This wraps sklearn.linear_model.SGDClassifier for ease of use in the streaming context. Some options are missing because they are not relevant in the streaming context.
>>> from capymoa.datasets import ElectricityTiny
>>> from capymoa.classifier import SGDClassifier
>>> from capymoa.evaluation import prequential_evaluation
>>> stream = ElectricityTiny()
>>> schema = stream.get_schema()
>>> learner = SGDClassifier(schema)
>>> results = prequential_evaluation(stream, learner, max_instances=1000)
>>> results["cumulative"].accuracy()
84.2
- sklearner: SGDClassifier#
The underlying scikit-learn object
- __init__(
- schema: Schema,
- loss: Literal['hinge', 'log_loss', 'modified_huber', 'squared_hinge', 'perceptron', 'squared_error', 'huber', 'epsilon_insensitive', 'squared_epsilon_insensitive'] = 'hinge',
- penalty: Literal['l2', 'l1', 'elasticnet'] = 'l2',
- alpha: float = 0.0001,
- l1_ratio: float = 0.15,
- fit_intercept: bool = True,
- epsilon: float = 0.1,
- n_jobs: int | None = None,
- learning_rate: Literal['constant', 'optimal', 'invscaling'] = 'optimal',
- eta0: float = 0.0,
- random_seed: int | None = None,
Construct a stochastic gradient descent classifier.
- Parameters:
schema – Describes the datastream’s structure.
loss – The loss function to be used.
penalty – The penalty (aka regularization term) to be used.
alpha – Constant that multiplies the regularization term.
l1_ratio – The Elastic Net mixing parameter, with 0 <= l1_ratio <= 1. l1_ratio=0 corresponds to L2 penalty, l1_ratio=1 to L1. Only used if penalty is ‘elasticnet’. Values must be in the range [0.0, 1.0].
fit_intercept – Whether the intercept (bias) should be estimated or not. If False, the data is assumed to be already centered.
epsilon – Epsilon in the epsilon-insensitive loss functions; only if loss is ‘huber’, ‘epsilon_insensitive’, or ‘squared_epsilon_insensitive’. For ‘huber’, determines the threshold at which it becomes less important to get the prediction exactly right. For epsilon-insensitive, any differences between the current prediction and the correct label are ignored if they are less than this threshold.
n_jobs – The number of CPUs to use to do the OVA (One Versus All, for multi-class problems) computation. Defaults to 1.
learning_rate – The learning rate schedule.
eta0 – The initial learning rate for the ‘constant’ or ‘invscaling’ schedules. The default value is 0.0, as eta0 is not used by the default schedule ‘optimal’.
class_weight – Weights associated with classes. If not given, all classes are supposed to have weight one. The “balanced” mode uses the values of y to automatically adjust weights inversely proportional to class frequencies in the input data as n_samples / (n_classes * np.bincount(y)).
random_seed – Seed for reproducibility.
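To illustrate how these options interact, here is a hedged configuration sketch (parameter values are illustrative, not recommendations): a ‘constant’ schedule needs a positive eta0, since the default eta0=0.0 only makes sense under ‘optimal’.
>>> from capymoa.datasets import ElectricityTiny
>>> from capymoa.classifier import SGDClassifier
>>> stream = ElectricityTiny()
>>> learner = SGDClassifier(
...     stream.get_schema(),
...     loss='log_loss',            # probabilistic loss instead of the default hinge
...     penalty='elasticnet',
...     l1_ratio=0.5,               # even mix of L1 and L2 regularization
...     learning_rate='constant',
...     eta0=0.01,                  # step size; must be non-zero for 'constant'
...     random_seed=1,
... )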
- train(instance: LabeledInstance)[source]#
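Where prequential_evaluation is too coarse, train can be driven manually in a test-then-train loop. A minimal sketch, assuming (as with other CapyMOA classifiers) that predict returns the predicted label index and that labeled instances expose y_index:
>>> from capymoa.datasets import ElectricityTiny
>>> from capymoa.classifier import SGDClassifier
>>> stream = ElectricityTiny()
>>> learner = SGDClassifier(stream.get_schema())
>>> correct, seen = 0, 0
>>> while stream.has_more_instances() and seen < 1000:
...     instance = stream.next_instance()
...     if seen > 0:                      # test on the instance before training on it
...         correct += learner.predict(instance) == instance.y_index
...     learner.train(instance)
...     seen += 1
>>> accuracy = correct / (seen - 1)       # fraction of correct test-then-train predictions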