metrax.RougeN

class metrax.RougeN(total_precision: Array, total_recall: Array, total_f1: Array, num_examples: Array, order: int)

Bases: RougeBase

Computes macro-averaged ROUGE-N recall, precision, and F1-score.

This metric first calculates ROUGE-N precision, recall, and F1-score for each individual prediction compared against its single corresponding reference. ROUGE-N scores are based on the number of overlapping n-grams (sequences of n words) between the prediction and the reference text. These per-instance precision, recall, and F1-scores are then averaged across all instances in the dataset/batch.

How ROUGE-N scores are calculated for each individual prediction-reference pair:

\[\text{Precision} = \frac{N_o}{N_p}\]

\[\text{Recall} = \frac{N_o}{N_r}\]

\[\text{F1} = 2 \times \frac{\text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}}\]

where:

- \(N_o\) is the number of n-grams that overlap between the prediction and the reference.
- \(N_p\) is the total number of n-grams in the prediction.
- \(N_r\) is the total number of n-grams in the reference.
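The per-pair formulas above can be sketched in plain Python. This is a minimal illustration and not metrax's implementation: whitespace tokenization, clipped overlap counting (each n-gram counts at most as often as it appears in both texts), and returning 0.0 when a denominator is zero are all assumptions, and the helper names `ngram_counts` and `rouge_n_pair` are hypothetical.

```python
from collections import Counter


def ngram_counts(tokens, n):
    """Count each n-gram (as a tuple of tokens) in a list of tokens."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n_pair(prediction, reference, n=2):
    """Return (precision, recall, f1) for one prediction-reference pair.

    Assumes whitespace tokenization; a real implementation may tokenize differently.
    """
    pred = ngram_counts(prediction.split(), n)
    ref = ngram_counts(reference.split(), n)

    # N_o: clipped overlap, the multiset intersection of the two n-gram counts.
    n_o = sum((pred & ref).values())
    n_p = sum(pred.values())  # N_p: n-grams in the prediction
    n_r = sum(ref.values())   # N_r: n-grams in the reference

    # Assumption: a score with a zero denominator is reported as 0.0.
    precision = n_o / n_p if n_p else 0.0
    recall = n_o / n_r if n_r else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


scores = rouge_n_pair("the cat sat on the mat", "the cat lay on the mat", n=2)
```

In this example each text has five bigrams and three of them overlap, so precision, recall, and F1 all come out to 3/5.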

Final Macro-Averaged Metrics:

\[\text{MacroAvgPrecision} = \frac{\text{total_precision}}{\text{num_examples}}\]

\[\text{MacroAvgRecall} = \frac{\text{total_recall}}{\text{num_examples}}\]

\[\text{MacroAvgF1} = \frac{\text{total_f1}}{\text{num_examples}}\]
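The macro-averaging step can be sketched as a small accumulator that stores running sums and divides by the example count at the end. The class below is a hypothetical stand-in written for illustration only. Its field names follow the attributes documented on this page, but the `add` method, the use of Python floats instead of `jax.Array`, the dictionary return type, and the zero-example behavior are assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MacroRougeSums:
    """Hypothetical accumulator of per-instance ROUGE scores (not the metrax class)."""

    total_precision: float = 0.0
    total_recall: float = 0.0
    total_f1: float = 0.0
    num_examples: int = 0

    def add(self, precision, recall, f1):
        """Return a new accumulator that includes one more instance."""
        return MacroRougeSums(
            self.total_precision + precision,
            self.total_recall + recall,
            self.total_f1 + f1,
            self.num_examples + 1,
        )

    def merge(self, other):
        """Combine two accumulators by summing every field."""
        return MacroRougeSums(
            self.total_precision + other.total_precision,
            self.total_recall + other.total_recall,
            self.total_f1 + other.total_f1,
            self.num_examples + other.num_examples,
        )

    def compute(self):
        """Divide each sum by num_examples to get the macro averages."""
        if self.num_examples == 0:
            # Assumption: report zeros when nothing has been accumulated.
            return {"precision": 0.0, "recall": 0.0, "f1": 0.0}
        n = self.num_examples
        return {
            "precision": self.total_precision / n,
            "recall": self.total_recall / n,
            "f1": self.total_f1 / n,
        }


# Two batches of one instance each, merged before the final division.
batch_a = MacroRougeSums().add(1.0, 0.5, 0.75)
batch_b = MacroRougeSums().add(0.0, 0.0, 0.0)
averages = batch_a.merge(batch_b).compute()
```

Because only sums and a count are stored, merging batches in any order gives the same averages as processing every instance in one pass.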

order

The specific ‘N’ in ROUGE-N (e.g., 1 for ROUGE-1, 2 for ROUGE-2).

Type: int

total_precision

Accumulated sum of precision scores from each instance.

Type: jax.Array

total_recall

Accumulated sum of recall scores from each instance.

Type: jax.Array

total_f1

Accumulated sum of F1 scores from each instance.

Type: jax.Array

num_examples

The number of instances (prediction-reference pairs) processed.

Type: jax.Array

__init__(total_precision: Array, total_recall: Array, total_f1: Array, num_examples: Array, order: int) → None

Methods

`__init__`(total_precision, total_recall, ...)
`compute`()	Computes macro-averaged recall, precision, and F1-score.
`compute_value`()	Wraps compute() and returns a values.Value.
`empty`([order])	Creates an empty Rouge metric.
`from_fun`(fun)	Calls cls.from_model_output with the return value from fun.
`from_model_output`(predictions, references, ...)	Computes sums of per-instance ROUGE scores for a batch.
`from_output`(name)	Calls cls.from_model_output with model output named name.
`merge`(other)	Merges this Rouge metric with another.
`reduce`()	Reduces the metric along its first axis by calling _reduce_merge().
`replace`(**updates)	Returns a new object replacing the specified fields with new values.

Attributes

`order`
`total_precision`
`total_recall`
`total_f1`
`num_examples`

order: int

classmethod empty(order: int = 2) → RougeN: Creates an empty Rouge metric. Implemented by subclasses.

replace(**updates): Returns a new object replacing the specified fields with new values.
