OCLMetrics#

class capymoa.ocl.evaluation.OCLMetrics#

Bases: object

A collection of metrics evaluating an online continual learner.

We define some metrics in terms of a matrix $R\in\mathbb{R}^{T \times T}$ (accuracy_matrix) where each element $R_{i,j}$ contains the test accuracy on task $j$ after sequentially training on tasks $1$ through $i$.

Online learners make predictions continuously during training, so we also provide “anytime” versions of the metrics. These metrics are collected periodically during training, specifically $H$ times per task (n_continual_evaluations). The results of this evaluation are stored in a tensor $A\in\mathbb{R}^{T \times H \times T}$ (anytime_accuracy_matrix) where each element $A_{i,h,j}$ contains the test accuracy on task $j$ after sequentially training on tasks $1$ through $i-1$ and up to step $h$ of task $i$.

__init__(
    anytime_accuracy_all: ndarray,
    anytime_accuracy_all_avg: float,
    anytime_accuracy_seen: ndarray,
    anytime_accuracy_seen_avg: float,
    anytime_task_index: ndarray,
    accuracy_all: ndarray,
    accuracy_all_avg: float,
    accuracy_seen: ndarray,
    accuracy_seen_avg: float,
    accuracy_final: float,
    task_index: ndarray,
    forward_transfer: float,
    backward_transfer: float,
    accuracy_matrix: ndarray,
    class_cm: ndarray,
    anytime_accuracy_matrix: ndarray,
    n_classes: int,
    n_tasks: int,
    n_continual_evaluations: int,
    ttt: PrequentialResults,
    boundaries: ndarray,
    ttt_windowed_task_index: ndarray,
) → None#

accuracy_all: ndarray#

The accuracy on all tasks after training on each task.

Is a ndarray of shape (n_tasks,), dtype=np.float32.

\[a_\text{all}(t) = \frac{1}{T} \sum_{i=1}^{T} R_{t,i}\]

Use task_index to get the corresponding task index for plotting.

accuracy_all_avg: float#: The average of accuracy_all over all tasks.

\[\bar{a}_\text{all} = \frac{1}{T}\sum_{t=1}^T a_\text{all}(t)\]

accuracy_final: float#: The accuracy on all tasks after training on the final task.

\[a_\text{final} = a_\text{all}(T)\]
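As a rough illustration of the three formulas above, the following NumPy sketch computes them from a made-up accuracy matrix. The values in `R` are invented for the example, indices are zero-based (the formulas are one-based), and the code is not taken from CapyMOA's implementation.

```python
import numpy as np

# Hypothetical accuracy matrix for T = 3 tasks (values invented).
# R[i, j] is the accuracy on task j after training on tasks 0..i.
R = np.array(
    [
        [0.9, 0.2, 0.1],
        [0.7, 0.8, 0.3],
        [0.6, 0.7, 0.9],
    ],
    dtype=np.float32,
)

# a_all(t): mean over every task column, including tasks not yet trained on.
accuracy_all = R.mean(axis=1)

# Average of a_all(t) over all training stages t.
accuracy_all_avg = float(accuracy_all.mean())

# a_final = a_all(T): the last row's mean.
accuracy_final = float(accuracy_all[-1])
```

Because unseen tasks are included in the mean, the early entries of `accuracy_all` are typically low even when the learner performs well on the tasks it has already seen.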

accuracy_matrix: ndarray#

A matrix measuring the accuracy on each task after training on each task.

Is a ndarray of shape (n_tasks, n_tasks), dtype=np.float32.

R[i, j] is the accuracy on task $j$ after training on tasks $1$ through $i$.

accuracy_seen: ndarray#

The accuracy on seen tasks after training on each task.

Is a ndarray of shape (n_tasks,), dtype=np.float32.

\[a_\text{seen}(t) = \frac{1}{t}\sum^t_{i=1} R_{t,i}\]

Use task_index to get the corresponding task index for plotting.

accuracy_seen_avg: float#: The average of accuracy_seen over all tasks.

\[\bar{a}_\text{seen} = \frac{1}{T}\sum_{t=1}^T a_\text{seen}(t)\]
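The seen-task accuracy and its average can be sketched in NumPy as follows. This is only an illustration under assumptions: the values in `R` are invented, indices are zero-based, and the masking approach is one possible way to restrict each row to the tasks seen so far, not necessarily how CapyMOA computes it.

```python
import numpy as np

# Hypothetical accuracy matrix for T = 3 tasks (values invented).
R = np.array(
    [
        [0.9, 0.2, 0.1],
        [0.7, 0.8, 0.3],
        [0.6, 0.7, 0.9],
    ],
    dtype=np.float32,
)
n_tasks = R.shape[0]

# Lower-triangular mask (diagonal included) keeps R[t, i] only for i <= t,
# i.e. the tasks the learner has been trained on after stage t.
seen_mask = np.tril(np.ones((n_tasks, n_tasks), dtype=bool))

# a_seen(t): sum of the seen entries in row t divided by the number of seen tasks.
n_seen = np.arange(1, n_tasks + 1)
accuracy_seen = np.where(seen_mask, R, 0.0).sum(axis=1) / n_seen

# Average of a_seen(t) over all training stages t.
accuracy_seen_avg = float(accuracy_seen.mean())
```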

anytime_accuracy_all: ndarray#

The accuracy on all tasks after training on each step in each task.

Is a ndarray of shape (n_tasks * n_continual_evaluations,), dtype=np.float32.

\[a_\text{any all}(t, h) = \frac{1}{T}\sum^T_{i=1} A_{t,h,i}\]

We flatten the $t,h$ dimensions to a 1D array. Use anytime_task_index to get the corresponding task index for plotting.

anytime_accuracy_all_avg: float#: The average of anytime_accuracy_all over all tasks.

\[\bar{a}_\text{any all} = \frac{1}{T}\sum_{t=1}^T \frac{1}{H}\sum_{h=1}^H a_\text{any all}(t, h)\]
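A small NumPy sketch of the anytime all-task accuracy is shown below. The tensor `A` holds invented values, and the task-major (row-major) flattening order is an assumption made for this example.

```python
import numpy as np

# Hypothetical anytime tensor with T = 2 tasks and H = 2 evaluations per task.
# A[t, h, j] is the accuracy on task j at evaluation step h while training task t.
A = np.array(
    [
        [[0.5, 0.1], [0.9, 0.1]],
        [[0.8, 0.4], [0.7, 0.8]],
    ],
    dtype=np.float32,
)

# a_any_all(t, h): mean over all test tasks (the last axis), shape (T, H).
anytime_all_2d = A.mean(axis=2)

# Flatten the (t, h) dimensions into one axis of length T * H.
# Assumption: task-major order, so entries for task 0 come first.
anytime_accuracy_all = anytime_all_2d.reshape(-1)

# With the same H for every task, the nested mean over t and h
# equals the plain mean of the flattened array.
anytime_accuracy_all_avg = float(anytime_accuracy_all.mean())
```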

anytime_accuracy_matrix: ndarray#

A matrix measuring the accuracy on each task after training on each task and step.

Is a ndarray of shape (n_tasks * n_continual_evaluations, n_tasks), dtype=np.float32.

This matrix is $A$ with the first two dimensions flattened to a 2D array.
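To make the flattening concrete, the sketch below reshapes a placeholder tensor and looks up one row. It assumes a task-major (row-major) layout, which this page does not state explicitly, and the tensor contains index-like placeholder values rather than real accuracies.

```python
import numpy as np

n_tasks, n_continual_evaluations = 3, 4

# Placeholder tensor of shape (T, H, T); the values are just running
# integers so that each entry is easy to identify, not accuracies.
A = np.arange(
    n_tasks * n_continual_evaluations * n_tasks, dtype=np.float32
).reshape(n_tasks, n_continual_evaluations, n_tasks)

# Merge the first two dimensions into one: shape (T * H, T).
A_flat = A.reshape(n_tasks * n_continual_evaluations, n_tasks)

# Under the assumed layout, row t * H + h holds A[t, h, :].
t, h = 1, 2
row = t * n_continual_evaluations + h
accuracies_at_step = A_flat[row]
```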

anytime_accuracy_seen: ndarray#

The accuracy on seen tasks after training on each step in each task.

\[a_\text{any seen}(t, h) = \frac{1}{t}\sum^t_{i=1} A_{t,h,i}\]

We flatten the $t,h$ dimensions to a 1D array. Use anytime_task_index to get the corresponding task index for plotting.

anytime_accuracy_seen_avg: float#: The average of anytime_accuracy_seen over all tasks.

\[\bar{a}_\text{any seen} = \frac{1}{T}\sum_{t=1}^T \frac{1}{H}\sum_{h=1}^H a_\text{any seen}(t, h)\]
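The anytime seen-task accuracy can be sketched the same way. The tensor values are invented, the masking via broadcasting is just one possible approach, and the task-major flattening order is an assumption of this example.

```python
import numpy as np

# Hypothetical anytime tensor with T = 2 tasks and H = 2 evaluations per task.
# A[t, h, j] is the accuracy on task j at evaluation step h while training task t.
A = np.array(
    [
        [[0.5, 0.1], [0.9, 0.1]],
        [[0.8, 0.4], [0.7, 0.8]],
    ],
    dtype=np.float32,
)
n_tasks = A.shape[0]

# seen_mask[t, j] is True when task j has been seen while training task t (j <= t).
seen_mask = np.tril(np.ones((n_tasks, n_tasks), dtype=bool))

# Broadcast the mask over the evaluation-step axis and zero out unseen tasks.
masked = np.where(seen_mask[:, None, :], A, 0.0)

# a_any_seen(t, h): sum over seen tasks divided by the number of seen tasks.
n_seen = np.arange(1, n_tasks + 1)[:, None]
anytime_seen_2d = masked.sum(axis=2) / n_seen

# Flatten (t, h) to 1D (assumed task-major order) and average.
anytime_accuracy_seen = anytime_seen_2d.reshape(-1)
anytime_accuracy_seen_avg = float(anytime_accuracy_seen.mean())
```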

anytime_task_index: ndarray#

The position in each task where the anytime accuracy was measured.

Is a ndarray of shape (n_tasks * n_continual_evaluations,), dtype=np.integer.

backward_transfer: float#: A scalar measuring the impact that learning later tasks had on the accuracy of earlier tasks. Negative values indicate forgetting.

\[r_\text{BWT} = \frac{2}{T(T-1)} \sum_{i=2}^{T} \sum_{j=1}^{i-1} (R_{i,j} - R_{j,j})\]
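The backward transfer formula above can be evaluated directly with NumPy, as in the sketch below. The matrix values are invented and indices are zero-based; the code illustrates the formula rather than CapyMOA's own implementation.

```python
import numpy as np

# Hypothetical accuracy matrix for T = 3 tasks (values invented).
R = np.array(
    [
        [0.9, 0.2, 0.1],
        [0.7, 0.8, 0.3],
        [0.6, 0.7, 0.9],
    ],
    dtype=np.float32,
)
T = R.shape[0]

# All index pairs (i, j) strictly below the diagonal, i.e. j < i:
# task j was learned before training stage i.
i_idx, j_idx = np.tril_indices(T, k=-1)

# R[j, j] is the accuracy on task j right after it was learned.
just_learned = np.diag(R)

# Average change in accuracy on past tasks over the T(T-1)/2 pairs.
backward_transfer = float(
    2.0 / (T * (T - 1)) * np.sum(R[i_idx, j_idx] - just_learned[j_idx])
)
```

In this made-up example every past task loses accuracy after later training, so the result is negative.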

boundaries: ndarray#

The instance indices of the task boundaries.

Used to map online evaluation to specific tasks.

Is a ndarray of shape (n_tasks + 1,), dtype=np.integer.
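One way such boundaries could be used to map an instance index to a task is sketched below. It assumes that `boundaries[k]` is the index of the first instance of task `k` and that the last entry is the total number of instances; this page does not confirm that convention, and the numbers are invented.

```python
import numpy as np

# Hypothetical boundaries for 3 tasks: task 0 covers instances [0, 100),
# task 1 covers [100, 250), and task 2 covers [250, 400).
boundaries = np.array([0, 100, 250, 400])

# Instance indices at which some online evaluation happened (invented).
instance_index = np.array([0, 99, 100, 249, 250, 399])

# searchsorted with side="right" counts how many boundaries are <= each index;
# subtracting 1 gives the zero-based task that contains the instance.
task_of_instance = np.searchsorted(boundaries, instance_index, side="right") - 1
```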

class_cm: ndarray#: A confusion matrix of shape (task, true_class, predicted_class).

forward_transfer: float#: A scalar measuring the impact that learning earlier tasks had on the accuracy of future tasks, that is, tasks not yet trained on.

\[r_\text{FWT} = \frac{2}{T(T-1)}\sum_{i=1}^{T} \sum_{j=i+1}^{T} R_{i,j}\]
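The forward transfer formula above averages the entries of $R$ strictly above the diagonal. A minimal NumPy sketch, using invented values and zero-based indices, might look like this; it illustrates the formula and is not taken from CapyMOA's source.

```python
import numpy as np

# Hypothetical accuracy matrix for T = 3 tasks (values invented).
R = np.array(
    [
        [0.9, 0.2, 0.1],
        [0.7, 0.8, 0.3],
        [0.6, 0.7, 0.9],
    ],
    dtype=np.float32,
)
T = R.shape[0]

# All index pairs (i, j) strictly above the diagonal, i.e. j > i:
# task j has not been trained on yet at training stage i.
i_idx, j_idx = np.triu_indices(T, k=1)

# Average accuracy on future tasks over the T(T-1)/2 pairs.
forward_transfer = float(2.0 / (T * (T - 1)) * R[i_idx, j_idx].sum())
```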

n_classes: int#: The number of classes $C$.

n_continual_evaluations: int#: The number of continual evaluations per task $H$.

n_tasks: int#: The number of tasks $T$.

task_index: ndarray#: The position of each task in the metrics.

ttt: PrequentialResults#: Test-then-train/prequential results.

ttt_windowed_task_index: ndarray#

The position of each window within each task.

Useful as the x axis for capymoa.evaluation.results.PrequentialResults.windowed.
