langfair.metrics.counterfactual.counterfactual.CounterfactualMetrics#
- class langfair.metrics.counterfactual.counterfactual.CounterfactualMetrics(metrics=['Cosine', 'Rougel', 'Bleu', 'Sentiment Bias'], neutralize_tokens=True, sentiment_classifier='vader', device='cpu')#
Bases:
object
- __init__(metrics=['Cosine', 'Rougel', 'Bleu', 'Sentiment Bias'], neutralize_tokens=True, sentiment_classifier='vader', device='cpu')#
This class computes one or more of the counterfactual metrics supported by LangFair. For more information on these metrics, see Huang et al. (2020) [1] and Bouchard (2024) [2].
- Parameters:
metrics (list of strings/objects, default=["Cosine", "Rougel", "Bleu", "Sentiment Bias"]) – A list containing the names or class objects of the metrics to compute.
neutralize_tokens (boolean, default=True) – Indicates whether to use masking for the computation of the Bleu and RougeL metrics. If True, counterfactual responses are masked using the CounterfactualGenerator.neutralize_tokens method before computing these metrics.
sentiment_classifier ({'vader','roberta'}, default='vader') – The sentiment classifier used to calculate counterfactual sentiment bias.
device (str or torch.device object, default="cpu") – Specifies the device that classifiers use for prediction. Set to "cuda" to allow classifiers to leverage the GPU. This parameter is only used by the SentimentBias class with the 'roberta' sentiment classifier.
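A minimal construction sketch follows, using the full module path shown in the heading above; the argument values simply spell out the documented defaults.

# Minimal sketch: constructing CounterfactualMetrics with the documented defaults.
from langfair.metrics.counterfactual.counterfactual import CounterfactualMetrics

counterfactual_metrics = CounterfactualMetrics(
    metrics=["Cosine", "Rougel", "Bleu", "Sentiment Bias"],  # metrics to compute
    neutralize_tokens=True,        # mask protected-attribute tokens before Bleu/RougeL
    sentiment_classifier="vader",  # or "roberta"
    device="cpu",                  # "cuda" lets the roberta classifier use a GPU
)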
Methods
__init__([metrics, neutralize_tokens, ...])
Computes one or more of the counterfactual metrics supported by LangFair.
evaluate(texts1, texts2[, attribute, ...])
Evaluates the counterfactual metric values for the provided pair of texts.
- evaluate(texts1, texts2, attribute=None, return_data=False)#
This method evaluates the counterfactual metric values for the provided pair of texts.
- Parameters:
texts1 (list of strings) – A list of generated outputs from a language model, each containing a mention of the same protected attribute group.
texts2 (list of strings) – A list, analogous to texts1, of counterfactually generated outputs from a language model, each containing a mention of the same protected attribute group. The mentioned group must be a different group within the same protected attribute as that mentioned in texts1.
attribute ({'gender', 'race'}, default='gender') – Specifies whether to use race or gender for neutralization.
return_data (bool, default=False) – Indicates whether to include response-level counterfactual scores in results dictionary returned by this method.
- Returns:
Dictionary containing values of counterfactual metrics
- Return type:
dict
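Continuing the construction sketch above, a hedged example of calling evaluate on a pair of counterfactual response lists; the toy responses and variable names are illustrative assumptions, not LangFair sample data.

# Illustrative sketch: the response texts below are made-up examples.
male_responses = [
    "He is a talented engineer who leads his team well.",
    "The man was praised for his clear communication.",
]
female_responses = [
    "She is a talented engineer who leads her team well.",
    "The woman was praised for her clear communication.",
]

results = counterfactual_metrics.evaluate(
    texts1=male_responses,
    texts2=female_responses,
    attribute="gender",   # used for token neutralization when neutralize_tokens=True
    return_data=False,    # set True to also include response-level scores
)
print(results)  # dictionary containing values of counterfactual metrics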
References