EpsilonGreedy#

class capymoa.automl.EpsilonGreedy[source]#

Bases: object

Epsilon-Greedy bandit policy for model selection.

This policy selects the best model with probability 1 - epsilon and explores other models with probability epsilon. During the burn-in period, it always explores to gather initial information about all models.

>>> from capymoa.automl import EpsilonGreedy
>>> policy = EpsilonGreedy(epsilon=0.1, burn_in=50)
>>> policy.epsilon
0.1

See also

BanditClassifier
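The selection rule described above can be sketched in plain Python. This is a simplified illustration of the epsilon-greedy logic, not capymoa's implementation; the standalone `select_arm` function, its list-based bookkeeping, and the `rng` parameter are assumptions for the sketch:

```python
import random

def select_arm(arm_rewards, arm_counts, total_pulls,
               epsilon=0.1, burn_in=100, rng=random.Random(42)):
    """Epsilon-greedy selection: explore during the burn-in period or with
    probability epsilon, otherwise exploit the arm with the best mean reward."""
    n_arms = len(arm_counts)
    if total_pulls < burn_in or rng.random() < epsilon:
        # Explore: pick a uniformly random arm.
        return rng.randrange(n_arms)
    # Exploit: pick the arm with the highest average reward so far.
    means = [r / c if c > 0 else 0.0 for r, c in zip(arm_rewards, arm_counts)]
    return max(range(n_arms), key=means.__getitem__)
```

With `epsilon=0.0` and the burn-in over, the sketch always returns the arm with the highest mean reward; during burn-in it ignores the statistics entirely and samples at random.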

__init__(epsilon: float = 0.1, burn_in: int = 100)[source]#

Construct a new Epsilon-Greedy policy.

Parameters:
  • epsilon – Probability of exploring a random model (default: 0.1).

  • burn_in – Number of initial rounds dedicated to exploration (default: 100).

get_arm_stats()[source]#

Get statistics about each arm’s performance.

get_best_arm_idx(available_arms)[source]#

Return the index of the best-performing arm among the available arms.

initialize(n_arms)[source]#

Initialize the policy with a given number of arms.

pull(available_arms)[source]#

Select which arms to pull based on the epsilon-greedy policy.

update(arm, reward)[source]#

Update the policy with the observed reward for the pulled arm.
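The bookkeeping this update implies can be sketched as follows. This is a simplified stand-in, not capymoa's implementation; it assumes the cumulative rewards and pull counts are stored in plain lists (mirroring the `arm_rewards` and `arm_counts` attributes documented below):

```python
def update(arm_rewards, arm_counts, arm, reward):
    """Record the observed reward for the pulled arm (in place)."""
    arm_rewards[arm] += reward   # cumulative reward for this arm
    arm_counts[arm] += 1         # number of times this arm was pulled

rewards, counts = [0.0, 0.0], [0, 0]
update(rewards, counts, arm=1, reward=1.0)
update(rewards, counts, arm=1, reward=0.5)
# Mean reward of arm 1 is now rewards[1] / counts[1] == 0.75
```

Keeping cumulative sums and counts (rather than running means) lets the mean reward of any arm be recomputed on demand as `arm_rewards[i] / arm_counts[i]`.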

arm_counts#

Number of times each arm has been pulled.

arm_rewards#

Cumulative reward values for each model (arm).

burn_in#

Number of initial rounds during which every model is explored to gather baseline statistics.

epsilon#

Probability of exploring a random model.

n_arms#

Number of available models (arms).

total_pulls#

Total number of model selections performed.