langfair.metrics.stereotype.metrics.cooccurrence.CooccurrenceBiasMetric

class langfair.metrics.stereotype.metrics.cooccurrence.CooccurrenceBiasMetric(target_category='adjective', demographic_group_word_lists=None, stereotype_word_list=None, beta=0.95, how='mean')

Bases: object

__init__(target_category='adjective', demographic_group_word_lists=None, stereotype_word_list=None, beta=0.95, how='mean')

Class for computing the co-occurrence bias score (COBS). Scores are defined as conditional probability ratios computed over infinite context windows. The code is based on research by Bordia & Bowman (2019) [1]: https://arxiv.org/abs/1904.03035.

Parameters:

target_category ({'adjective', 'profession'}, default = 'adjective') – The target category used to measure the COBS score with the default target word list. Not used if stereotype_word_list is provided.
demographic_group_word_lists (Dict[str, List[str]], default = None) – A dictionary with values that are demographic word lists. Must have exactly two keys. Each value must be a list of strings. If None, default gender word lists are used.
stereotype_word_list (List[str], default = None) – A list of target (stereotype) words for computing COBS score. If None, a default word list is used based on selected target_category. If specified, this parameter takes precedence over target_category.
beta (float, default = 0.95) – The weighting factor for the infinite context window used when calculating the co-occurrence bias score.
how (str, default = 'mean') – If 'mean', the evaluate method returns the average COBS score. If 'word_level', the method returns a dictionary with COBS(w) for each word w.
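To make the role of beta concrete, the following is a minimal sketch and not langfair's implementation. It assumes that each pair of a target word and a demographic group word in a text contributes beta**distance to a weighted co-occurrence count, where distance is the token offset between the two words, and that the per-word score is the log of the ratio of the two conditional probabilities. The helper names weighted_cooccurrence and log_ratio, the distance convention, and the sample sentence are hypothetical.

```python
import math
from typing import Sequence


def weighted_cooccurrence(tokens: Sequence[str], target: str,
                          group_words: Sequence[str], beta: float = 0.95) -> float:
    """Sum beta**distance over every (target, group word) pair in one text.

    The window is unbounded: every pair counts, but distant pairs count less.
    """
    group = set(group_words)
    target_positions = [i for i, tok in enumerate(tokens) if tok == target]
    group_positions = [j for j, tok in enumerate(tokens) if tok in group]
    return sum(beta ** abs(i - j)
               for i in target_positions
               for j in group_positions)


def log_ratio(count_a: float, total_a: float,
              count_b: float, total_b: float) -> float:
    """Return log(P(w | group A) / P(w | group B)) from weighted counts.

    count_x is the weighted co-occurrence of the word w with group x;
    total_x is the weighted co-occurrence of all words with group x.
    """
    return math.log((count_a / total_a) / (count_b / total_b))


# Made-up sentence, already tokenized and lowercased for simplicity.
tokens = "she is a brilliant doctor and he is brilliant".split()
female = weighted_cooccurrence(tokens, "brilliant", ["she"], beta=0.95)
male = weighted_cooccurrence(tokens, "brilliant", ["he"], beta=0.95)
```

Under these assumptions, setting beta to 1.0 reduces the weighted count to a plain count of pairs, while smaller values discount words that sit far from the demographic term.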

Methods

`__init__`([target_category, ...])	Class for computing Co-occurrence bias score.
`evaluate`(responses)	Compute the relative co-occurrence rates of target words with protected attribute words.

evaluate(responses)

Compute the relative co-occurrence rates of target words with protected attribute words.

Parameters: responses (list of strings) – A list of generated outputs from a language model on which the co-occurrence bias score will be calculated.
Returns: The co-occurrence bias score: the average COBS score if how='mean', or a dictionary with COBS(w) for each word if how='word_level'.
Return type: float or dict
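A short usage sketch, built only from the constructor and evaluate signatures documented on this page. The word lists and the wrapper function score_responses are made up for illustration; the import path is the module path shown in the class name above. The import is placed inside the function so the snippet can be loaded in an environment where langfair is not installed.

```python
from typing import Dict, List

# Illustrative inputs only; these word lists are not langfair defaults.
GROUP_WORDS: Dict[str, List[str]] = {
    "male": ["he", "him", "his", "man"],
    "female": ["she", "her", "hers", "woman"],
}
STEREOTYPE_WORDS: List[str] = ["brilliant", "emotional", "ambitious"]


def score_responses(responses: List[str], how: str = "mean"):
    """Build the metric with custom word lists and score the responses."""
    # Imported lazily; requires the langfair package at call time.
    from langfair.metrics.stereotype.metrics.cooccurrence import (
        CooccurrenceBiasMetric,
    )

    metric = CooccurrenceBiasMetric(
        demographic_group_word_lists=GROUP_WORDS,  # must have exactly two keys
        stereotype_word_list=STEREOTYPE_WORDS,     # overrides target_category
        beta=0.95,
        how=how,
    )
    return metric.evaluate(responses=responses)
```

Calling score_responses with a list of model outputs should return the average score, and passing how='word_level' should return the per-word dictionary, as described for the how parameter. The score is only informative when the responses contain enough text for the stereotype words and group words to appear together.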

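References

[1] Bordia, S. & Bowman, S. R. (2019). Identifying and Reducing Gender Bias in Word-Level Language Models. https://arxiv.org/abs/1904.03035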
