SGDRegressor

class capymoa.regressor.SGDRegressor

Bases: SKRegressor

Streaming stochastic gradient descent regressor.

This wraps sklearn.linear_model.SGDRegressor for ease of use in the streaming context. Options that are not relevant to streaming are omitted. Furthermore, the learning rate is constant.

Example Usage:

>>> from capymoa.datasets import Fried
>>> from capymoa.regressor import SGDRegressor
>>> from capymoa.evaluation import prequential_evaluation
>>> stream = Fried()
>>> schema = stream.get_schema()
>>> learner = SGDRegressor(schema)
>>> results = prequential_evaluation(stream, learner, max_instances=1000)
>>> results["cumulative"].rmse()
4.62...

sklearner: SGDRegressor

The underlying scikit-learn object

__init__(
schema: Schema,
loss: Literal['squared_error', 'huber', 'epsilon_insensitive', 'squared_epsilon_insensitive'] = 'squared_error',
penalty: Literal['l2', 'l1', 'elasticnet'] | None = 'l2',
alpha: float = 0.0001,
l1_ratio: float = 0.15,
fit_intercept: bool = True,
epsilon: float = 0.1,
learning_rate: str = 'invscaling',
eta0: float = 0.01,
random_seed: int | None = None,
)

Construct a stochastic gradient descent regressor.

Parameters:
  • schema – Describes the data stream’s structure.

  • loss – The loss function to be used.

  • penalty – The penalty (aka regularization term) to be used.

  • alpha – Constant that multiplies the regularization term.

  • l1_ratio – The Elastic Net mixing parameter, in the range [0.0, 1.0]. l1_ratio=0 corresponds to the L2 penalty, l1_ratio=1 to the L1 penalty. Only used if penalty is ‘elasticnet’.

  • fit_intercept – Whether the intercept (bias) should be estimated or not. If False, the data is assumed to be already centered.

  • epsilon – Epsilon in the epsilon-insensitive loss functions; only used if loss is ‘huber’, ‘epsilon_insensitive’, or ‘squared_epsilon_insensitive’. For ‘huber’, it determines the threshold at which it becomes less important to get the prediction exactly right. For the epsilon-insensitive losses, any difference between the current prediction and the correct label is ignored if it is less than this threshold.

  • learning_rate – The learning rate schedule, which determines how the size of the gradient step changes over time. The default is ‘invscaling’.

  • eta0 – The initial learning rate for the ‘constant’, ‘invscaling’ or ‘adaptive’ schedules. The default value is 0.01.

  • random_seed – Seed for reproducibility.
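
As a rough illustration of what loss, epsilon, penalty, alpha, l1_ratio and eta0 control, the sketch below re-implements the corresponding formulas in plain Python, following the definitions given in scikit-learn's SGD documentation. The helpers loss_value, penalty_value and invscaling_eta are hypothetical names used only for this illustration; they are not part of CapyMOA or scikit-learn. The exponent power_t=0.25 is an assumption taken from scikit-learn's default, since this wrapper does not expose it.

```python
def loss_value(loss, y_true, y_pred, epsilon=0.1):
    """Per-instance loss for each `loss` option (hypothetical helper)."""
    r = y_pred - y_true
    if loss == "squared_error":
        return 0.5 * r * r
    if loss == "huber":
        # Quadratic for small residuals, linear once |r| exceeds epsilon.
        if abs(r) <= epsilon:
            return 0.5 * r * r
        return epsilon * abs(r) - 0.5 * epsilon * epsilon
    if loss == "epsilon_insensitive":
        # Residuals smaller than epsilon are ignored entirely.
        return max(0.0, abs(r) - epsilon)
    if loss == "squared_epsilon_insensitive":
        return max(0.0, abs(r) - epsilon) ** 2
    raise ValueError(f"unknown loss: {loss!r}")


def penalty_value(weights, penalty="l2", alpha=0.0001, l1_ratio=0.15):
    """Regularization term added to the loss (hypothetical helper)."""
    l1 = sum(abs(w) for w in weights)
    l2 = 0.5 * sum(w * w for w in weights)
    if penalty is None:
        return 0.0
    if penalty == "l2":
        return alpha * l2
    if penalty == "l1":
        return alpha * l1
    if penalty == "elasticnet":
        # l1_ratio=0 reduces to pure L2, l1_ratio=1 to pure L1.
        return alpha * ((1.0 - l1_ratio) * l2 + l1_ratio * l1)
    raise ValueError(f"unknown penalty: {penalty!r}")


def invscaling_eta(t, eta0=0.01, power_t=0.25):
    """Step size at update t >= 1 under the 'invscaling' schedule.

    power_t=0.25 is assumed from scikit-learn's default.
    """
    return eta0 / (t ** power_t)


if __name__ == "__main__":
    for name in ("squared_error", "huber", "epsilon_insensitive",
                 "squared_epsilon_insensitive"):
        print(name, loss_value(name, y_true=0.0, y_pred=1.0))
    print("elasticnet penalty:", penalty_value([1.0, -2.0], "elasticnet", alpha=1.0))
    print("eta at t=16:", invscaling_eta(16))
```

With a larger epsilon, the ‘huber’ loss stays quadratic over a wider range of residuals, and the epsilon-insensitive losses ignore more of the small errors.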

predict(instance: Instance) → float

train(instance: RegressionInstance)