Sequence Probability#

sequence_probability

Sequence Probability (SP) computes the joint probability of all tokens in the generated response.

Definition#

Sequence probability is the joint probability of all tokens:

\[SP(y_i) = \prod_{t \in y_i} p_t\]

where \(y_i\) is the generated response and \(p_t\) denotes the probability the model assigned to token \(t\).
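
As a small worked illustration (the token probabilities here are made up for the example, not taken from any model), a three-token response with probabilities 0.9, 0.8, and 0.5 gives:

\[SP(y_i) = 0.9 \times 0.8 \times 0.5 = 0.36\]

Because models usually report log-probabilities rather than probabilities, the same quantity can equivalently be written as a sum in log space:

\[\log SP(y_i) = \sum_{t \in y_i} \log p_t\]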

Key Properties:

  • Direct measure of how likely the model considers its own output

  • Not length-normalized, so the score tends to decrease as responses get longer

  • Score range: \([0, 1]\), but values are typically very small for longer sequences

How It Works#

  1. Generate a response with logprobs enabled

  2. Extract the probability for each token in the response

  3. Multiply all token probabilities together
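
The steps above can be sketched in a few lines of plain Python. This is only an illustration of the arithmetic, not the library's internal implementation: the helper name `sequence_probability` and the input format (a flat list of per-token log-probabilities) are assumptions made for the example.

```python
import math
from typing import Sequence


def sequence_probability(token_logprobs: Sequence[float]) -> float:
    """Joint probability of a response from its per-token log-probabilities.

    Hypothetical helper for illustration; the input format is an assumption.
    """
    # Step 2: convert each log-probability back to a probability.
    token_probs = [math.exp(lp) for lp in token_logprobs]
    # Step 3: multiply all token probabilities together.
    return math.prod(token_probs)


# Made-up logprobs for a three-token response (p = 0.9, 0.8, 0.5).
example_logprobs = [math.log(0.9), math.log(0.8), math.log(0.5)]
print(sequence_probability(example_logprobs))
```

For long responses it may be numerically safer to sum the log-probabilities and exponentiate once at the end, which gives the same value up to floating-point rounding.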

Because every token probability is at most 1, each additional token can only shrink the product, so sequence probability decreases rapidly with response length. For length-invariant scoring, consider Length-Normalized Sequence Probability.
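
To get a feel for how quickly the score shrinks, the sketch below holds the per-token probability fixed at a hypothetical 0.9 and varies only the response length. The constant and the lengths are assumptions chosen for illustration; real token probabilities vary from token to token.

```python
import math

# Hypothetical constant per-token probability (an assumption for illustration).
PER_TOKEN_PROB = 0.9

sp_by_length = {}
for length in (10, 50, 200):
    # The product of `length` identical token probabilities.
    sp_by_length[length] = math.prod([PER_TOKEN_PROB] * length)
    print(f"{length:>4} tokens -> SP = {sp_by_length[length]:.3e}")
```

Even with every token at 0.9, the score is roughly 0.35 at 10 tokens and about 0.005 at 50 tokens, so raw scores for responses of different lengths are not directly comparable.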

Parameters#

When using WhiteBoxUQ, specify "sequence_probability" in the scorers list.

Example#

from uqlm import WhiteBoxUQ

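# `llm` and `prompts` are assumed to be defined beforehand:
# `llm` is the model object passed to WhiteBoxUQ, and `prompts` is a list of prompt strings.

# Initialize with sequence_probability scorer
wbuq = WhiteBoxUQ(
    llm=llm,
    scorers=["sequence_probability"]
)

# Generate responses and compute scores
# (top-level `await` needs an async context, such as a Jupyter notebook)
results = await wbuq.generate_and_score(prompts=prompts)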

# Access the sequence_probability scores
print(results.to_df()["sequence_probability"])

References#

See Also#