Mean Token Negentropy
mean_token_negentropy
Mean Token Negentropy (MTN) computes the entropy at each token position from the top-K logprobs, transforms these entropies into normalized negentropy scores, and averages the scores to obtain a single confidence score for each response.
Definition
This scorer requires access to the top-K logprobs for each token. Let the top-K token probabilities for token \(t_j\) be denoted as \(\{p_{t_{jk}}\}_{k=1}^K\).
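We first define Top-K Token Entropy for token \(j\) as:
\[\text{TE}_K(t_j) = -\sum_{k=1}^{K} p_{t_{jk}} \log p_{t_{jk}}\]
The Token Negentropy (TN) transformation normalizes entropy to a confidence score in \([0,1]\):
\[\text{TN}_K(t_j) = 1 - \frac{\text{TE}_K(t_j)}{\log K}\]
Finally, Mean Token Negentropy is the simple average across all tokens:
\[\text{MTN} = \frac{1}{L} \sum_{j=1}^{L} \text{TN}_K(t_j)\]
where \(L\) is the number of tokens in the response.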
Key Properties:
Higher values indicate lower entropy (higher confidence)
Uses top-K logprobs to estimate uncertainty at each token position
Score range: \([0, 1]\)
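As a quick sanity check of the score range, consider three illustrative cases with \(K = 2\). The numbers are made up for illustration, and the calculation assumes that entropy is normalized by \(\log K\), that natural logarithms are used, and that \(0 \log 0 = 0\) by convention.

\[
\begin{aligned}
p = (0.5,\ 0.5): &\quad \text{TE} = \log 2 \approx 0.693, &\quad \text{TN} &= 1 - \frac{0.693}{0.693} = 0 \\
p = (0.9,\ 0.1): &\quad \text{TE} \approx 0.325, &\quad \text{TN} &\approx 1 - \frac{0.325}{0.693} \approx 0.53 \\
p = (1,\ 0): &\quad \text{TE} = 0, &\quad \text{TN} &= 1 - 0 = 1
\end{aligned}
\]

A uniform distribution over the candidates gives the lowest score, and a distribution concentrated on one candidate gives the highest.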
How It Works
Generate a response with top-K logprobs enabled
For each token position:
Compute the entropy across the top-K candidate tokens
Normalize to get a negentropy (confidence) score
Average the negentropy scores across all token positions
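The steps above can be sketched in plain Python. This is a minimal illustration rather than uqlm's actual implementation: the function names `token_negentropy` and `mean_token_negentropy` are hypothetical, the entropy is assumed to be normalized by log K, and every token position is assumed to supply the same number K >= 2 of logprobs.

```python
import math
from typing import Sequence


def token_negentropy(logprobs: Sequence[float]) -> float:
    """Normalized negentropy for one token position from its top-K logprobs."""
    k = len(logprobs)
    if k < 2:
        raise ValueError("need at least two candidates so that log(K) > 0")
    # Entropy over the top-K candidates: -sum(p * log p) with p = exp(logprob).
    # Candidates with zero probability (logprob = -inf) contribute nothing.
    entropy = -sum(math.exp(lp) * lp for lp in logprobs if lp > -math.inf)
    # Divide by the maximum entropy log(K), then flip so higher = more confident.
    return 1.0 - entropy / math.log(k)


def mean_token_negentropy(top_logprobs: Sequence[Sequence[float]]) -> float:
    """Average the per-token negentropy over all token positions."""
    if not top_logprobs:
        raise ValueError("response has no tokens")
    return sum(token_negentropy(lps) for lps in top_logprobs) / len(top_logprobs)


# Toy response with two tokens and K = 3 (hypothetical values).
confident = [math.log(0.98), math.log(0.01), math.log(0.01)]
uncertain = [math.log(1 / 3)] * 3
score = mean_token_negentropy([confident, uncertain])
```

Here the first token scores close to 1 and the second scores 0, so the response-level score lands between the two. Because the score is a plain average, a single very uncertain token can be masked by many confident ones.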
Parameters
When using WhiteBoxUQ, specify "mean_token_negentropy" in the scorers list.
Note
This scorer requires the LLM to support returning top-K logprobs (e.g., OpenAI models with the top_logprobs parameter).
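To make the required input concrete, the sketch below pulls per-token top-K logprobs out of a payload shaped like the `logprobs.content` list of an OpenAI chat completion. The payload is hand-written with made-up values, and the helper `extract_top_k_logprobs` is a hypothetical name, not part of uqlm, which handles this step internally.

```python
# Hand-written stand-in for the `logprobs.content` list of an OpenAI chat
# completion requested with logprobs enabled and top_logprobs=2 (values made up).
content = [
    {
        "token": "Paris",
        "logprob": -0.01,
        "top_logprobs": [
            {"token": "Paris", "logprob": -0.01},
            {"token": "Lyon", "logprob": -4.61},
        ],
    },
    {
        "token": ".",
        "logprob": -0.36,
        "top_logprobs": [
            {"token": ".", "logprob": -0.36},
            {"token": ",", "logprob": -1.20},
        ],
    },
]


def extract_top_k_logprobs(content):
    """Return one list of top-K logprobs per generated token (hypothetical helper)."""
    return [[cand["logprob"] for cand in tok["top_logprobs"]] for tok in content]


per_token = extract_top_k_logprobs(content)
```

Each inner list holds the K candidate logprobs for one token position, which is the input the entropy calculation operates on.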
Example
from uqlm import WhiteBoxUQ

# `llm` (a chat model that can return logprobs) and `prompts` (a list of
# prompt strings) are assumed to be defined beforehand.

# Initialize with mean_token_negentropy scorer
wbuq = WhiteBoxUQ(
    llm=llm,
    scorers=["mean_token_negentropy"],
    top_k_logprobs=15,  # Number of top logprobs to use
)

# Generate responses and compute scores (run inside an async context, e.g. a notebook)
results = await wbuq.generate_and_score(prompts=prompts)

# Access the mean_token_negentropy scores
print(results.to_df()["mean_token_negentropy"])
References
Scalena, D., et al. (2025). TrustScore: Reference-Free Evaluation of LLM Response Trustworthiness. arXiv.
Manakul, P., et al. (2023). SelfCheckGPT: Zero-Resource Black-Box Hallucination Detection for Generative Large Language Models. arXiv.
Bouchard, D. & Chauhan, M. S. (2025). Generalized Ensembles for Robust Uncertainty Quantification of LLMs. arXiv.
See Also
WhiteBoxUQ - Main class for white-box uncertainty quantification
Minimum Token Negentropy - Minimum negentropy across all tokens
Probability Margin - Difference between top-2 token probabilities