uqlm.white_box.sampled_logprobs.SampledLogprobsScorer#

class uqlm.white_box.sampled_logprobs.SampledLogprobsScorer(scorers=['semantic_negentropy', 'semantic_density', 'monte_carlo_probability', 'consistency_and_confidence'], llm=None, nli_model_name='microsoft/deberta-large-mnli', max_length=2000, prompts_in_nli=True, length_normalize=True, device=None)#

Bases: LogprobsScorer

__init__(scorers=['semantic_negentropy', 'semantic_density', 'monte_carlo_probability', 'consistency_and_confidence'], llm=None, nli_model_name='microsoft/deberta-large-mnli', max_length=2000, prompts_in_nli=True, length_normalize=True, device=None)#

Initialize the SampledLogprobsScorer.

Parameters:
  • scorers (List[str], default=SAMPLED_LOGPROBS_SCORER_NAMES) – Specifies which scorers to compute. Must be a subset of ["semantic_negentropy", "semantic_density", "monte_carlo_probability", "consistency_and_confidence"].

  • llm (BaseChatModel, default=None) – Specifies the LLM to use. Must be a BaseChatModel.

  • nli_model_name (str, default="microsoft/deberta-large-mnli") – Specifies which NLI model to use. Must be acceptable input to AutoTokenizer.from_pretrained() and AutoModelForSequenceClassification.from_pretrained().

  • max_length (int, default=2000) – Specifies the maximum allowed string length. Responses longer than this value will be truncated to avoid OutOfMemoryError.

  • prompts_in_nli (bool, default=True) – Specifies whether to use the prompts in the NLI inputs for semantic entropy and semantic density scorers.

  • length_normalize (bool, default=True) – Specifies whether to length-normalize the logprobs. This setting affects the response probability computation for three scorers: semantic_negentropy, semantic_density, and monte_carlo_probability.

  • device (str or torch.device, default="cpu") – Specifies the device the NLI model uses for prediction. Only applies to the 'semantic_negentropy' and 'semantic_density' scorers. Pass a torch.device to leverage GPU.
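For intuition, the effect of length_normalize on the response probability used by the semantic_negentropy, semantic_density, and monte_carlo_probability scorers can be sketched as follows. This is an illustrative sketch of the standard technique, not uqlm's exact implementation:

```python
import math

def response_probability(logprobs, length_normalize=True):
    """Aggregate per-token log probabilities into one response probability.

    Illustrative sketch only; not uqlm's exact implementation.
    """
    if length_normalize:
        # Mean of logprobs = log of the geometric mean of token probabilities,
        # so longer responses are not penalized simply for having more tokens.
        return math.exp(sum(logprobs) / len(logprobs))
    # Unnormalized: joint probability of the whole token sequence.
    return math.exp(sum(logprobs))

# Two tokens, each with probability 0.5:
tokens = [math.log(0.5), math.log(0.5)]
normalized = response_probability(tokens, length_normalize=True)     # ≈ 0.5
unnormalized = response_probability(tokens, length_normalize=False)  # ≈ 0.25
```

With normalization, a long fluent response and a short one are compared on a per-token basis, which is why it is the default.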

Methods

__init__([scorers, llm, nli_model_name, ...])
    Initialize the SampledLogprobsScorer.

compute_consistency_confidence(responses, ...)

compute_semantic_density(responses, ...[, ...])

compute_semantic_negentropy(responses, ...)

evaluate(responses, sampled_responses, ...)

extract_logprobs(single_response_logprobs)
    Extract log probabilities from token data.

extract_probs(single_response_logprobs)
    Extract probabilities from token data.

extract_top_logprobs(single_response_logprobs)
    Extract top log probabilities for each token.

monte_carlo_probability(responses, ...)

static extract_logprobs(single_response_logprobs)#

Extract log probabilities from token data.

Return type:

ndarray
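A minimal sketch of this kind of extraction, assuming OpenAI-style token data in which each per-token entry is a dict with a "logprob" key (the real method's input format may differ):

```python
import numpy as np

def extract_logprobs_sketch(single_response_logprobs):
    # Assumed token-data shape: a list of per-token dicts, each carrying a
    # "logprob" entry (OpenAI-style). This format is an assumption, not
    # necessarily what uqlm receives internally.
    return np.array([token["logprob"] for token in single_response_logprobs])

token_data = [
    {"token": "Paris", "logprob": -0.1},
    {"token": ".", "logprob": -0.4},
]
logprobs = extract_logprobs_sketch(token_data)  # array([-0.1, -0.4])
```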

extract_probs(single_response_logprobs)#

Extract probabilities from token data.

Return type:

ndarray
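The probabilities are simply the exponentiated log probabilities. A sketch under the same assumed OpenAI-style token-data shape (per-token dicts with a "logprob" key; this is an illustrative assumption):

```python
import numpy as np

def extract_probs_sketch(single_response_logprobs):
    # Assumed shape: per-token dicts with a "logprob" key.
    logprobs = np.array([token["logprob"] for token in single_response_logprobs])
    # Token probabilities are recovered by exponentiating the log probabilities.
    return np.exp(logprobs)

probs = extract_probs_sketch([{"logprob": 0.0}, {"logprob": -0.4}])
# probs[0] == 1.0, since log(1.0) == 0.0
```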

static extract_top_logprobs(single_response_logprobs)#

Extract top log probabilities for each token.

Return type:

List[ndarray]
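Since the return type is a list of arrays (one array of candidate log probabilities per generated token), the extraction can be sketched as below, again assuming an OpenAI-style shape in which each per-token dict carries a "top_logprobs" list of candidate-token dicts (an assumption for illustration):

```python
import numpy as np

def extract_top_logprobs_sketch(single_response_logprobs):
    # Assumed OpenAI-style shape: each per-token dict has a "top_logprobs"
    # list of candidate-token dicts, each with its own "logprob" entry.
    return [
        np.array([alt["logprob"] for alt in token["top_logprobs"]])
        for token in single_response_logprobs
    ]

token_data = [
    {"token": "Paris",
     "top_logprobs": [{"token": "Paris", "logprob": -0.1},
                      {"token": "London", "logprob": -2.5}]},
]
top = extract_top_logprobs_sketch(token_data)  # one array per token
```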
