langfair.metrics.stereotype.stereotype.StereotypeMetrics

class langfair.metrics.stereotype.stereotype.StereotypeMetrics(metrics=['Stereotype Association', 'Cooccurrence Bias', 'Stereotype Classifier'])

Bases: object

__init__(metrics=['Stereotype Association', 'Cooccurrence Bias', 'Stereotype Classifier'])

This class computes some or all of the stereotype metrics supported by langfair. For more information on these metrics, see Liang et al. (2023) [1], Bordia & Bowman (2019) [2] and Zekun et al. (2023) [3].

Parameters:

metrics (list of strings or objects, default=["Stereotype Association", "Cooccurrence Bias", "Stereotype Classifier"]) – A list of metric names or metric class objects to compute.

Methods

`__init__`([metrics])	This class computes some or all of the stereotype metrics supported by langfair.
`evaluate`(responses[, prompts, return_data, ...])	This method evaluates the stereotype metrics for the provided responses and, optionally, their prompts.

evaluate(responses, prompts=None, return_data=False, categories=['gender', 'race'])

This method evaluates the stereotype metrics for the provided responses and, optionally, their prompts.

Parameters:

responses (list of strings) – A list of outputs generated by an LLM.
prompts (list of strings, default=None) – A list of the prompts from which the responses were generated. If provided, metrics are calculated by prompt and averaged across prompts (at least 25 responses per prompt are recommended for the Expected maximum and Probability metrics). Otherwise, metrics are applied as a single calculation over all responses (only the stereotype fraction is calculated).
return_data (bool, default=False) – Specifies whether to include a dictionary containing response-level stereotype scores in the returned result.
categories (list, subset of ['gender', 'race'], default=['gender', 'race']) – Specifies the attributes for the stereotype classifier metrics. Includes both race and gender by default.
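The difference between the two aggregation modes described for `prompts` can be illustrated with a small standalone sketch. It does not use langfair and is not its implementation: the helper names, the 0.5 threshold, and the scores below are all hypothetical, chosen only to show how a pooled calculation over all responses can differ from a by-prompt calculation averaged across prompts.

```python
from collections import defaultdict
from statistics import mean


def stereotype_fraction(scores, threshold=0.5):
    """Share of responses whose score meets or exceeds the threshold.

    The 0.5 threshold is an assumption made for this illustration.
    """
    return sum(s >= threshold for s in scores) / len(scores)


def by_prompt_average(prompts, scores, threshold=0.5):
    """Compute the fraction separately per prompt, then average across prompts."""
    grouped = defaultdict(list)
    for prompt, score in zip(prompts, scores):
        grouped[prompt].append(score)
    return mean(stereotype_fraction(g, threshold) for g in grouped.values())


# Made-up response-level scores: three responses for prompt "p1", one for "p2".
prompts = ["p1", "p1", "p1", "p2"]
scores = [0.9, 0.8, 0.1, 0.2]

pooled = stereotype_fraction(scores)           # single calculation over all responses
averaged = by_prompt_average(prompts, scores)  # per-prompt values, then the mean
```

With these numbers the pooled value is 2/4, while the by-prompt average is the mean of 2/3 and 0, so prompts with few responses weigh as much as prompts with many. This is one reason to keep the number of responses per prompt reasonably large and balanced.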

Returns:

A dictionary with up to two keys: ‘metrics’, containing all metric values, and ‘data’, containing response-level stereotype scores (included when return_data is True).

Return type:

dict
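A minimal usage sketch based on the signatures documented above. The wrapper function `evaluate_stereotypes` and the sample texts are hypothetical. Restricting `metrics` to the two non-classifier names is an assumption made here to keep the example light, and the import is deferred into the function so the snippet loads even where langfair is not installed.

```python
def evaluate_stereotypes(responses, prompts=None):
    """Hypothetical wrapper: run two of the documented metrics and unpack the result."""
    # Deferred import: langfair is only needed when the function is actually called.
    from langfair.metrics.stereotype.stereotype import StereotypeMetrics

    # Assumption: passing a subset of the documented metric names limits
    # the computation to those metrics.
    sm = StereotypeMetrics(metrics=["Stereotype Association", "Cooccurrence Bias"])

    # return_data=True asks for response-level scores alongside the metric values.
    result = sm.evaluate(responses=responses, prompts=prompts, return_data=True)

    # 'metrics' holds the metric values; 'data' may be absent, so use .get().
    return result["metrics"], result.get("data")


# Made-up inputs for illustration; each response is paired with its prompt.
prompts = [
    "Describe a typical nurse.",
    "Describe a typical nurse.",
    "Describe a typical engineer.",
    "Describe a typical engineer.",
]
responses = [
    "She works long shifts and cares for her patients.",
    "They coordinate treatment with doctors and families.",
    "He spends the day designing bridges.",
    "They review plans and test prototypes.",
]

# Example call (requires langfair to be installed):
# metric_values, response_level = evaluate_stereotypes(responses, prompts)
```

Real evaluations would use far more responses per prompt than this sample, in line with the recommendation given for the `prompts` parameter.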

References

[1] Percy Liang, Rishi Bommasani, Tony Lee, Dimitris Tsipras, Dilara Soylu, Michihiro Yasunaga, Yian Zhang, Deepak Narayanan, Yuhuai Wu, Ananya Kumar, Benjamin Newman, Binhang Yuan, Bobby Yan, Ce Zhang, Christian Cosgrove, Christopher D. Manning, Christopher Ré, Diana Acosta-Navas, Drew A. Hudson, Eric Zelikman, Esin Durmus, Faisal Ladhak, Frieda Rong, Hongyu Ren, Huaxiu Yao, Jue Wang, Keshav Santhanam, Laurel Orr, Lucia Zheng, Mert Yuksekgonul, Mirac Suzgun, Nathan Kim, Neel Guha, Niladri Chatterji, Omar Khattab, Peter Henderson, Qian Huang, Ryan Chi, Sang Michael Xie, Shibani Santurkar, Surya Ganguli, Tatsunori Hashimoto, Thomas Icard, Tianyi Zhang, Vishrav Chaudhary, William Wang, Xuechen Li, Yifan Mai, Yuhui Zhang, and Yuta Koreeda. Holistic evaluation of language models. 2023. URL: https://arxiv.org/abs/2211.09110, arXiv:2211.09110.
