uqlm.nli.entailment.EntailmentClassifier#

class uqlm.nli.entailment.EntailmentClassifier(nli_llm=None)#

Bases: object

__init__(nli_llm=None)#

A class to compute NLI predictions.

Parameters:

nli_llm (BaseChatModel, default=None) – A LangChain chat model used for LLM-based NLI inference.
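
The following is a minimal construction sketch; the ChatOpenAI model choice and its settings are illustrative assumptions, and any LangChain BaseChatModel can be passed as nli_llm.

```python
# Minimal sketch, assuming langchain-openai is installed and an OpenAI API key
# is configured; any LangChain BaseChatModel can be passed as nli_llm.
from langchain_openai import ChatOpenAI

from uqlm.nli.entailment import EntailmentClassifier

nli_llm = ChatOpenAI(model="gpt-4o-mini", temperature=0)  # assumed model choice
classifier = EntailmentClassifier(nli_llm=nli_llm)
```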

Methods

__init__([nli_llm])

A class to compute NLI predictions.

evaluate_claim_entailment(response_sets, ...)

Applies self.judge_entailment to claim-response pairs and reformats the result as a List[np.ndarray].

judge_entailment(premises, hypotheses[, ...])

Asynchronous version of predict() for NLI prediction on premise-hypothesis pairs.

async evaluate_claim_entailment(response_sets, claim_sets, retries=5, progress_bar=None)#

Applies self.judge_entailment to claim-response pairs and reformats the result as a List[np.ndarray].

Parameters:
  • response_sets (List[List[str]]) – Sets of response texts, used as the premises for NLI classification.

  • claim_sets (List[List[str]]) – Sets of claim texts, used as the hypotheses for NLI classification.

  • retries (int, default=5) – Number of times to retry failed score extraction.

  • progress_bar (rich.progress.Progress, default=None) – If provided, displays a progress bar while scoring responses.

Returns:

Entailment and contradiction scores for each claim set.

Return type:

List[np.ndarray]
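
A usage sketch follows; the response and claim texts are illustrative only, and the chat model used to construct the classifier is an assumption rather than part of this reference.

```python
# Hypothetical usage sketch; response/claim texts are illustrative only.
import asyncio

from langchain_openai import ChatOpenAI  # assumed chat model provider
from uqlm.nli.entailment import EntailmentClassifier

classifier = EntailmentClassifier(nli_llm=ChatOpenAI(model="gpt-4o-mini"))

response_sets = [
    ["Paris is the capital of France.", "France's capital city is Paris."],
]
claim_sets = [
    ["Paris is the capital of France.", "France's capital is Lyon."],
]

async def main():
    # One np.ndarray of entailment/contradiction scores per claim set.
    scores = await classifier.evaluate_claim_entailment(
        response_sets=response_sets, claim_sets=claim_sets
    )
    print(scores)

asyncio.run(main())
```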

async judge_entailment(premises, hypotheses, retries=5, progress_bar=None)#

Asynchronous version of predict() for NLI prediction on premise-hypothesis pairs.

This method computes NLI predictions on the provided inputs asynchronously. For LangChain models, this enables concurrent LLM calls, which significantly improves performance. For HuggingFace models, this wraps the synchronous call for API consistency.

Parameters:
  • premises (List[str]) – The premise texts for NLI classification.

  • hypotheses (List[str]) – The hypothesis texts for NLI classification.

  • retries (int, default=5) – Number of times to retry failed score extraction.

  • progress_bar (rich.progress.Progress, default=None) – If provided, displays a progress bar while scoring responses.

Returns:

The entailment prompts, raw LLM outputs, and extracted entailment/contradiction scores.

Return type:

Dict[str, Any]
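
A usage sketch follows; the premise and hypothesis texts are illustrative, the chat model is an assumption, and the exact keys of the returned dictionary are not specified in this reference.

```python
# Hypothetical usage sketch; premise/hypothesis texts are illustrative only.
import asyncio

from langchain_openai import ChatOpenAI  # assumed chat model provider
from uqlm.nli.entailment import EntailmentClassifier

classifier = EntailmentClassifier(nli_llm=ChatOpenAI(model="gpt-4o-mini"))

premises = ["The cat sat on the mat."]
hypotheses = ["An animal is on the mat."]

async def main():
    # Returns a dict with the entailment prompts, raw LLM outputs, and
    # extracted entailment/contradiction scores.
    result = await classifier.judge_entailment(
        premises=premises, hypotheses=hypotheses, retries=3
    )
    print(result)

asyncio.run(main())
```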
