langfair.metrics.stereotype.metrics.cooccurrence.CooccurrenceBiasMetric

class langfair.metrics.stereotype.metrics.cooccurrence.CooccurrenceBiasMetric(target_category='adjective', demographic_group_word_lists=None, stereotype_word_list=None, beta=0.95, how='mean')

Bases: object

__init__(target_category='adjective', demographic_group_word_lists=None, stereotype_word_list=None, beta=0.95, how='mean')

Class for computing the co-occurrence bias score (COBS). Scores are defined by conditional probability ratios computed over infinite context windows. The code is based on research by Bordia & Bowman (2019): https://arxiv.org/abs/1904.03035

Parameters:
  • target_category ({'adjective', 'profession'}, default = 'adjective') – The target category used to select the default target word list for the COBS score. Not used if stereotype_word_list is provided.

  • demographic_group_word_lists (Dict[str, List[str]], default = None) – A dictionary whose values are demographic word lists. Must have exactly two keys. Each value must be a list of strings. If None, default gender word lists are used.

  • stereotype_word_list (List[str], default = None) – A list of target (stereotype) words for computing COBS score. If None, a default word list is used based on selected target_category. If specified, this parameter takes precedence over target_category.

  • beta (float, default=0.95) – The weighting factor for the infinite context window used when calculating the co-occurrence bias score.

  • how (str, default='mean') – If ‘mean’, the evaluate method returns the average COBS score. If ‘word_level’, it returns a dictionary with COBS(w) for each word ‘w’.
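
To make the roles of beta and how more concrete, here is a minimal, simplified sketch. It is not langfair's implementation. It assumes a naive lowercase whitespace tokenizer, assumes each co-occurrence is weighted by beta raised to the token distance (one reading of the "infinite context window"), and assumes each group's counts are normalized by that group's total weighted mass over the target words. The names weighted_cooccurrence and cobs_sketch are hypothetical.

```python
import math
from collections import defaultdict
from typing import Dict, List, Set, Union


def weighted_cooccurrence(tokens: List[str], group_words: Set[str],
                          target_words: Set[str], beta: float = 0.95) -> Dict[str, float]:
    """Weight each (group word, target word) pair in one response by beta**distance."""
    group_positions = [i for i, tok in enumerate(tokens) if tok in group_words]
    counts: Dict[str, float] = defaultdict(float)
    for j, tok in enumerate(tokens):
        if tok in target_words:
            for i in group_positions:
                counts[tok] += beta ** abs(i - j)
    return dict(counts)


def cobs_sketch(responses: List[str], group_a: Set[str], group_b: Set[str],
                targets: Set[str], beta: float = 0.95,
                how: str = "mean") -> Union[float, Dict[str, float]]:
    """Simplified log-ratio score; assumptions are listed in the text above."""
    totals = {"a": defaultdict(float), "b": defaultdict(float)}
    for text in responses:
        tokens = text.lower().split()  # assumption: naive tokenizer
        for key, group in (("a", group_a), ("b", group_b)):
            for word, weight in weighted_cooccurrence(tokens, group, targets, beta).items():
                totals[key][word] += weight

    mass_a = sum(totals["a"].values())
    mass_b = sum(totals["b"].values())

    # Score only target words that co-occur with both groups.
    scores: Dict[str, float] = {}
    for word in targets:
        c_a = totals["a"].get(word, 0.0)
        c_b = totals["b"].get(word, 0.0)
        if c_a > 0 and c_b > 0:
            scores[word] = math.log((c_a / mass_a) / (c_b / mass_b))

    if how == "word_level":
        return scores
    if not scores:
        raise ValueError("no target word co-occurs with both groups")
    return sum(scores.values()) / len(scores)  # plain average (assumption)
```

In this sketch a smaller beta makes distant words count less, and how='word_level' exposes the per-word values that how='mean' averages away.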

Methods

__init__([target_category, ...])

Class for computing the co-occurrence bias score.

evaluate(responses)

Compute the relative co-occurrence rates of target words with protected attribute words.

evaluate(responses)

Compute the relative co-occurrence rates of target words with protected attribute words.

Parameters:

responses (list of strings) – A list of generated outputs from a language model on which the co-occurrence bias score will be calculated.

Returns:

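Co-occurrence bias score metric: the average COBS score when how='mean'. When how='word_level', a dictionary with COBS(w) for each word w is returned instead.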

Return type:

float
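
A minimal usage sketch follows, assuming langfair is installed and that the class can be imported from the module path in its qualified name above. The wrapper score_responses and its up-front check of how are hypothetical conveniences, not part of the library. The constructor arguments are the documented defaults.

```python
from typing import List


def score_responses(responses: List[str], how: str = "mean"):
    """Hypothetical helper: score responses with the default gender and adjective word lists."""
    # Fail early on values other than the two documented options.
    if how not in ("mean", "word_level"):
        raise ValueError("how must be 'mean' or 'word_level'")

    # Imported lazily so this module still loads where langfair is not installed.
    from langfair.metrics.stereotype.metrics.cooccurrence import CooccurrenceBiasMetric

    metric = CooccurrenceBiasMetric(
        target_category="adjective",  # ignored if stereotype_word_list is given
        beta=0.95,
        how=how,
    )
    return metric.evaluate(responses)
```

Passing a list of model outputs to score_responses would then return whatever evaluate returns for the chosen how setting.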