EpsilonGreedy#

class capymoa.automl.EpsilonGreedy[source]#

Bases: object

Epsilon-Greedy bandit policy for model selection.

This policy selects the best model with probability 1 - epsilon and explores other models with probability epsilon. During the burn-in period, it always explores to gather initial information about all models.

>>> from capymoa.automl import EpsilonGreedy
>>> policy = EpsilonGreedy(epsilon=0.1, burn_in=50)
>>> policy.epsilon
0.1

See also

BanditClassifier
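The selection rule described above can be sketched in plain Python. This is a simplified illustration of the epsilon-greedy logic, not capymoa's implementation; the standalone `select_arm` function, its list-based bookkeeping, and the `rng` parameter are assumptions for the sketch:

```python
import random

def select_arm(arm_rewards, arm_counts, total_pulls,
               epsilon=0.1, burn_in=100, rng=random.Random(42)):
    """Epsilon-greedy selection: explore during the burn-in period or with
    probability epsilon, otherwise exploit the arm with the best mean reward."""
    n_arms = len(arm_counts)
    if total_pulls < burn_in or rng.random() < epsilon:
        # Explore: pick a uniformly random arm.
        return rng.randrange(n_arms)
    # Exploit: pick the arm with the highest average reward so far.
    means = [r / c if c > 0 else 0.0 for r, c in zip(arm_rewards, arm_counts)]
    return max(range(n_arms), key=means.__getitem__)
```

With `epsilon=0.0` and the burn-in over, the sketch always returns the arm with the highest mean reward; during burn-in it ignores the statistics entirely and samples at random.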

__init__(epsilon: float = 0.1, burn_in: int = 100)[source]#

Construct a new Epsilon-Greedy policy.

Parameters:
  • epsilon – Probability of exploring a random model (default: 0.1).

  • burn_in – Number of initial rounds dedicated to exploration (default: 100).

get_arm_stats()[source]#

Get statistics about each arm’s performance.

get_best_arm_idx(available_arms)[source]#

Return the index of the best-performing arm among the available arms.

initialize(n_arms)[source]#

Initialize the policy with a given number of arms.

pull(available_arms)[source]#

Select which arms to pull based on the epsilon-greedy policy.

update(arm, reward)[source]#

Update the policy with the observed reward for the pulled arm.
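The bookkeeping this update implies can be sketched as follows. This is a simplified stand-in, not capymoa's implementation; it assumes the cumulative rewards and pull counts are stored in plain lists (mirroring the `arm_rewards` and `arm_counts` attributes documented below):

```python
def update(arm_rewards, arm_counts, arm, reward):
    """Record the observed reward for the pulled arm (in place)."""
    arm_rewards[arm] += reward   # cumulative reward for this arm
    arm_counts[arm] += 1         # number of times this arm was pulled

rewards, counts = [0.0, 0.0], [0, 0]
update(rewards, counts, arm=1, reward=1.0)
update(rewards, counts, arm=1, reward=0.5)
# Mean reward of arm 1 is now rewards[1] / counts[1] == 0.75
```

Keeping cumulative sums and counts (rather than running means) lets the mean reward of any arm be recomputed on demand as `arm_rewards[i] / arm_counts[i]`.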

arm_counts#

Number of times each arm has been pulled.

arm_rewards#

Cumulative reward values for each model (arm).

burn_in#

Number of initial rounds during which every model is explored to gather baseline statistics.

epsilon#

Probability of exploring a random model.

n_arms#

Number of available models (arms).

total_pulls#

Total number of model selections performed.