White-Box Scorers
White-box Uncertainty Quantification (UQ) methods use token probabilities to estimate uncertainty. Single-generation scorers need only one response per prompt, which makes them significantly faster and cheaper than black-box methods, but every white-box scorer requires access to the LLM’s token probabilities.
Key Characteristics:
Minimal Latency: Token probabilities are already returned by the LLM
No Added Cost: Doesn’t require additional LLM calls (for single-generation scorers)
High Performance: Token-level probabilities expose the model’s own confidence, which provides a rich uncertainty signal
Trade-offs:
Limited Compatibility: Requires access to token probabilities, which not all LLMs/APIs expose
Notation:
Let the tokenization of LLM response \(y_i\) be denoted as \(\{t_1, \ldots, t_{L_i}\}\), where \(L_i\) is the number of tokens in the response. Let \(p_t\) denote the probability the LLM assigned to token \(t\).
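To make the notation concrete: many APIs return per-token log-probabilities rather than probabilities. The sketch below converts them to \(p_t\) and reads off \(L_i\). The helper name `token_probs_from_logprobs` and the numeric values are illustrative assumptions, not part of any particular library or model output.

```python
import math


def token_probs_from_logprobs(logprobs):
    """Convert per-token log-probabilities into probabilities p_t.

    Hypothetical helper for illustration; assumes natural-log values.
    """
    return [math.exp(lp) for lp in logprobs]


# Made-up log-probabilities for a three-token response (not real model output).
example_logprobs = [-0.05, -0.30, -1.20]

probs = token_probs_from_logprobs(example_logprobs)  # p_t for each token
num_tokens = len(probs)                              # L_i in the notation above
```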
Single-Generation Scorers
These scorers require only one LLM generation and use the token probabilities from that single response.
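As a minimal sketch of what a single-generation scorer might compute, the example below aggregates the token probabilities of one response in two common ways: the minimum token probability and the length-normalized (geometric mean) probability. Both function names are hypothetical, and these aggregations are illustrative choices rather than the definitive formulas of any specific scorer.

```python
import math


def min_token_probability(probs):
    """Confidence as the lowest token probability in the response."""
    if not probs:
        raise ValueError("probs must contain at least one token probability")
    return min(probs)


def length_normalized_probability(probs):
    """Confidence as the geometric mean of token probabilities.

    Computed in log space for numerical stability; assumes every p_t > 0.
    """
    if not probs:
        raise ValueError("probs must contain at least one token probability")
    mean_log_prob = sum(math.log(p) for p in probs) / len(probs)
    return math.exp(mean_log_prob)


# Made-up token probabilities for one response (not real model output).
response_probs = [0.9, 0.4, 0.8]
weakest_token = min_token_probability(response_probs)
normalized = length_normalized_probability(response_probs)
```

Higher values from either function indicate higher confidence. The minimum is sensitive to a single uncertain token, while the geometric mean spreads the signal across the whole response and avoids penalizing longer responses simply for their length.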
Multi-Generation Scorers
These scorers generate multiple responses from the same prompt, combining the sampling approach of black-box UQ with token-probability-based signals.
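One way to combine sampling with token probabilities is a Monte Carlo estimate of sequence-level entropy: sample several responses to the same prompt, compute each response's log-probability as the sum of its token log-probabilities, and average the negatives. The sketch below assumes this estimator for illustration; the function names are hypothetical, and the estimate is only meaningful if the responses were sampled from the model's own distribution.

```python
import math


def sequence_log_prob(probs):
    """Log-probability of one response: sum of log p_t over its tokens."""
    return sum(math.log(p) for p in probs)


def monte_carlo_entropy(sampled_responses_probs):
    """Estimate entropy as the mean negative sequence log-probability.

    `sampled_responses_probs` holds one list of token probabilities
    per sampled response. Higher values indicate more uncertainty.
    """
    if not sampled_responses_probs:
        raise ValueError("at least one sampled response is required")
    log_probs = [sequence_log_prob(p) for p in sampled_responses_probs]
    return -sum(log_probs) / len(log_probs)


# Made-up token probabilities for three sampled responses (not real model output).
samples = [
    [0.9, 0.8, 0.95],
    [0.6, 0.5],
    [0.7, 0.9, 0.4, 0.8],
]
entropy_estimate = monte_carlo_entropy(samples)
```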