L2P#
- class capymoa.ocl.strategy.l2p.L2P[source]#
Bases: BatchClassifier, TrainTaskAware
Learning to Prompt (L2P) [1] is a continual learning strategy that leverages a pool of learnable prompts to adapt a pre-trained vision transformer (ViT) to new tasks. For each input, the most relevant prompts are selected from the pool based on the similarity between the input’s embedding and the prompt keys. The selected prompts are then used to condition the ViT, allowing it to effectively learn new tasks while mitigating catastrophic forgetting.
L2P relies on knowledge of the tasks during training to select task-specific prompts but does not require task information during inference.
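The selection step described above can be sketched with a minimal key–query matching routine. This is an illustrative sketch, not capymoa's internal implementation: `select_prompts`, the shapes, and the use of cosine similarity over a toy pool are assumptions for demonstration.

```python
import torch

def select_prompts(query: torch.Tensor, keys: torch.Tensor, top_k: int = 3):
    """Pick the top-k prompts whose keys best match the query embedding.

    ``query`` is a (d,) input embedding (e.g. a ViT [CLS] token) and
    ``keys`` is a (num_prompts, d) matrix of learnable prompt keys.
    Names and shapes are illustrative, not capymoa internals.
    """
    # Cosine similarity between the query and every prompt key.
    sim = torch.nn.functional.cosine_similarity(query.unsqueeze(0), keys, dim=-1)
    top = sim.topk(top_k)  # values are sorted in descending similarity
    return top.indices, top.values

# Toy pool: 10 prompt keys of dimension 8.
keys = torch.randn(10, 8)
query = torch.randn(8)
indices, scores = select_prompts(query, keys, top_k=3)
```

The selected prompts would then be prepended to the ViT's input sequence so the frozen backbone is conditioned on them.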
```python
# Please note this code block is not regularly tested.
from capymoa.ocl.strategy.l2p import L2P
from capymoa.ocl.datasets import SplitCIFAR100
from capymoa.ocl.evaluation import ocl_train_eval_loop

scenario = SplitCIFAR100()
learner = L2P(scenario.schema, scenario.task_mask, device="cuda")
results = ocl_train_eval_loop(
    learner,
    scenario.train_loaders(32),
    scenario.test_loaders(32),
    progress_bar=True,
)
print(f"{results.accuracy_final*100:.1f}%")
```
- __init__(
    schema: Schema,
    task_mask: Tensor,
    vit: L2PViT | str = 'facebook/dinov2-small',
    prompts_per_task: int = 5,
    prompt_length: int = 1,
    top_k: int = 3,
    pull_constraint_coeff: float = 0.1,
    optimizer: Callable[[Any], Optimizer] = lambda params: ...,
    device: str = 'cpu',
    random_seed: int = 1,
)

Construct an L2P learner.
- Parameters:
schema – Schema describing the datastream.
task_mask – A boolean tensor of shape (num_tasks, num_classes) indicating which classes belong to each task.
vit – Vision transformer backbone or the name of a pretrained model from HuggingFace Transformers. Requires transformers to be installed.
prompts_per_task – Number of prompts per task in the prompt pool.
prompt_length – Length of each prompt (number of tokens/patches).
top_k – Number of top prompts to retrieve per query.
pull_constraint_coeff – Coefficient for the pull constraint loss term.
optimizer – Function that takes model parameters and returns an optimizer instance.
device – Device to run the model on, e.g., “cpu” or “cuda”.
random_seed – Random seed for reproducibility.
logger – Optional logger for tracking training metrics.
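The optimizer parameter expects a factory: a callable that receives the model's trainable parameters and returns a torch Optimizer. A minimal sketch of that contract follows; the AdamW hyperparameters and the stand-in module are illustrative assumptions, not capymoa's defaults.

```python
import torch

# Factory matching the ``optimizer`` parameter's contract: it takes model
# parameters and returns a torch.optim.Optimizer. The AdamW settings here
# are illustrative, not capymoa's defaults.
def make_optimizer(params):
    return torch.optim.AdamW(params, lr=1e-3, weight_decay=0.01)

# Demonstrate the contract on a tiny stand-in module.
model = torch.nn.Linear(4, 2)
opt = make_optimizer(model.parameters())
loss = model(torch.randn(1, 4)).sum()
loss.backward()
opt.step()  # one optimization step, as the learner would perform per batch
```

The factory would then be passed to the constructor, e.g. `L2P(scenario.schema, scenario.task_mask, optimizer=make_optimizer)`.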
- batch_predict_proba(x: Tensor) → Tensor [source]#
Predict the probabilities of the classes for a batch of instances.
- predict(instance: Instance) → int | None [source]#
Predict the label of an instance.
The base implementation calls predict_proba() and returns the label with the highest probability.
- Parameters:
instance – The instance to predict the label for.
- Returns:
The predicted label, or None if the classifier is unable to make a prediction.
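The base-class behaviour described above reduces to an argmax over the class probabilities. The helper below is a hypothetical sketch of that reduction, not part of capymoa's API.

```python
import numpy as np

def predict_from_proba(proba):
    """Return the index of the most probable class, or None.

    Hypothetical helper illustrating the base predict() contract:
    argmax over predict_proba()'s output, propagating None.
    """
    if proba is None:
        return None  # the classifier could not produce probabilities
    return int(np.argmax(proba))

# Example: class 1 has the highest probability.
label = predict_from_proba(np.array([0.1, 0.7, 0.2]))
```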
- predict_proba(instance: Instance)
Calls batch_predict_proba() with a batch of size 1.
- train(instance: LabeledInstance) → None [source]#
Calls batch_train() with a batch of size 1.
- random_seed: int#
The random seed for reproducibility.
When implementing a classifier, ensure random number generators are seeded.