langfair.generator.generator.ResponseGenerator#

class langfair.generator.generator.ResponseGenerator(langchain_llm=None, suppressed_exceptions=None, max_calls_per_min=None)#

Bases: object

__init__(langchain_llm=None, suppressed_exceptions=None, max_calls_per_min=None)#

Class for generating LLM responses from a provided set of prompts.

Parameters:
  • langchain_llm (langchain llm object, default=None) – A LangChain LLM object that is passed to the chain constructor. The user is responsible for specifying temperature and other relevant parameters in the constructor of their langchain_llm object.

  • suppressed_exceptions (tuple, default=None) – Specifies which exceptions to handle as ‘Unable to get response’ rather than raising the exception.

  • max_calls_per_min (int, default=None) – [Deprecated] Use LangChain’s InMemoryRateLimiter instead.
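
A minimal usage sketch (illustrative, not from the original documentation): the model name, rate-limiter settings, and suppressed exception type below are assumptions; any LangChain LLM object can be passed. Per the deprecation note above, rate limiting is configured on the LLM via LangChain’s InMemoryRateLimiter rather than through max_calls_per_min.

    from langchain_core.rate_limiters import InMemoryRateLimiter
    from langchain_openai import ChatOpenAI  # illustrative; any LangChain LLM works

    from langfair.generator import ResponseGenerator

    # Rate limiting now lives on the LLM itself (max_calls_per_min is deprecated).
    rate_limiter = InMemoryRateLimiter(
        requests_per_second=5,
        check_every_n_seconds=0.5,
        max_bucket_size=100,
    )

    llm = ChatOpenAI(
        model="gpt-4o-mini",        # assumed model name
        temperature=1.0,            # user-specified, per the note above
        rate_limiter=rate_limiter,
    )

    generator = ResponseGenerator(
        langchain_llm=llm,
        suppressed_exceptions=(ValueError,),  # illustrative; handled as 'Unable to get response'
    )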

Methods

__init__([langchain_llm, ...])

Class for generating LLM responses from a provided set of prompts.

estimate_token_cost(tiktoken_model_name, prompts)

Estimates the token cost for a given list of prompts and (optionally) example responses.

generate_responses(prompts[, system_prompt, ...])

Generates an evaluation dataset from a provided set of prompts.

async estimate_token_cost(tiktoken_model_name, prompts, example_responses=None, response_sample_size=30, system_prompt='You are a helpful assistant.', count=25)#

Estimates the token cost for a given list of prompts and (optionally) example responses. Note: This method is only compatible with GPT models. Cost-per-token values are as of 10/21/2024.

Parameters:
  • tiktoken_model_name (str) – The name of the OpenAI model to use for token counting.

  • prompts (list of strings) – A list of prompts.

  • example_responses (list of strings, default=None) – A list of example responses. If provided, the function will estimate the response tokens based on these examples.

  • response_sample_size (int, default=30) – The number of responses to generate for cost estimation if example_responses is not provided.

  • system_prompt (str, default="You are a helpful assistant.") – Specifies the system prompt used when generating LLM responses.

  • count (int, default=25) – The number of generations per prompt used when estimating cost.

Returns:

A dictionary containing the estimated token costs, including prompt token cost, completion token cost, and total token cost.

Return type:

dict
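
A hedged usage sketch, reusing the generator instantiated above: estimate_token_cost is a coroutine, so it must be awaited (or driven with asyncio.run). The model name and prompts are illustrative; the returned dictionary’s exact key names follow the description above but are not guaranteed here.

    import asyncio

    cost = asyncio.run(
        generator.estimate_token_cost(
            tiktoken_model_name="gpt-4o-mini",  # assumed GPT model name
            prompts=["Summarize the plot of Hamlet.", "Explain photosynthesis briefly."],
            count=25,  # generations per prompt assumed for the estimate
        )
    )
    # `cost` contains the estimated prompt, completion, and total token costs.
    print(cost)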

async generate_responses(prompts, system_prompt='You are a helpful assistant.', count=25)#

Generates an evaluation dataset from a provided set of prompts. For each prompt, count responses are generated.

Parameters:
  • prompts (list of strings) – List of prompts from which LLM responses will be generated.

  • system_prompt (str or None, default="You are a helpful assistant.") – Optional argument for the user to provide a custom system prompt.

  • count (int, default=25) – Specifies the number of responses to generate for each prompt. The convention is to use 25 generations per prompt in evaluating toxicity. See, for example, DecodingTrust (https://arxiv.org/abs/2306.11698) or Gehman et al., 2020 (https://aclanthology.org/2020.findings-emnlp.301/).

Returns:

A dictionary with two keys: ‘data’ and ‘metadata’.

‘data’ : dict
    A dictionary containing the prompts and responses.

    ‘prompt’ : list
        A list of prompts.

    ‘response’ : list
        A list of responses corresponding to the prompts.

‘metadata’ : dict
    A dictionary containing metadata about the generation process.

    ‘non_completion_rate’ : float
        The rate at which the generation process did not complete.

    ‘temperature’ : float
        The temperature parameter used in the generation process.

    ‘count’ : int
        The number of responses generated per prompt.

    ‘system_prompt’ : str
        The system prompt used for generating responses.

Return type:

dict
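
A hedged usage sketch, again reusing the generator instantiated above: generate_responses is likewise a coroutine, and the returned dictionary follows the structure documented here. The prompt text is illustrative.

    import asyncio

    result = asyncio.run(
        generator.generate_responses(
            prompts=["Tell me about your day."],  # illustrative prompt
            system_prompt="You are a helpful assistant.",
            count=25,
        )
    )

    # ‘data’ holds parallel lists of prompts and their corresponding responses.
    prompts = result["data"]["prompt"]
    responses = result["data"]["response"]
    print(result["metadata"]["non_completion_rate"])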