{ "cells": [ { "cell_type": "markdown", "id": "a0428976-96c2-4898-9020-1eb2be430cee", "metadata": {}, "source": [ "# Demo of `ResponseGenerator` class" ] }, { "cell_type": "markdown", "id": "7080a2b7-9882-4b09-9bcd-9f7d3d7d80e9", "metadata": {}, "source": [ "Import necessary libraries for the notebook." ] }, { "cell_type": "code", "execution_count": 1, "id": "ad065d4f-b6c6-4e4b-951a-a3e9858d776a", "metadata": { "tags": [] }, "outputs": [], "source": [ "# Run if python-dotenv not installed\n", "# import sys\n", "# !{sys.executable} -m pip install python-dotenv\n", "\n", "import os\n", "import time\n", "\n", "import openai\n", "import pandas as pd\n", "from dotenv import load_dotenv\n", "\n", "from langfair.generator import ResponseGenerator" ] }, { "cell_type": "code", "execution_count": 2, "id": "203f3ef4-c8c1-483f-beff-e83c378d4178", "metadata": { "tags": [] }, "outputs": [], "source": [ "# User to populate .env file with API credentials\n", "repo_path = '/'.join(os.getcwd().split('/')[:-2])\n", "load_dotenv(os.path.join(repo_path, '.env'))\n", "\n", "API_KEY = os.getenv('API_KEY')\n", "API_BASE = os.getenv('API_BASE')\n", "API_TYPE = os.getenv('API_TYPE')\n", "API_VERSION = os.getenv('API_VERSION')\n", "MODEL_VERSION = os.getenv('MODEL_VERSION')\n", "DEPLOYMENT_NAME = os.getenv('DEPLOYMENT_NAME')" ] }, { "cell_type": "markdown", "id": "e5e2877b-688f-4799-9c31-4a9014f27fad", "metadata": {}, "source": [ "Read in prompts from which responses will be generated." ] }, { "cell_type": "code", "execution_count": null, "id": "6411ef21-eef1-49b9-a58a-e183736c0111", "metadata": { "tags": [] }, "outputs": [], "source": [ "# THIS IS AN EXAMPLE SET OF PROMPTS. USER TO REPLACE WITH THEIR OWN PROMPTS\n", "from langfair.utils.dataloader import load_realtoxicity\n", "\n", "prompts = load_realtoxicity(n=10)\n", "print(f\"\\nExample prompt\\n{'-'*14}\\n'{prompts[0]}'\")" ] }, { "cell_type": "markdown", "id": "3fb35e72-d9bb-423d-b7a4-942de0b1d2dd", "metadata": { "tags": [] }, "source": [ "`ResponseGenerator()` - **Class for generating data for evaluation from provided set of prompts (class)**\n", "\n", "**Class parameters:**\n", "\n", "- `langchain_llm` (**langchain llm (Runnable), default=None**) A langchain llm object to get passed to LLMChain `llm` argument.\n", "- `suppressed_exceptions` (**tuple, default=None**) Specifies which exceptions to handle as 'Unable to get response' rather than raising the exception\n", "- `max_calls_per_min` (**Deprecated as of 0.2.0**) Use LangChain's InMemoryRateLimiter instead." ] }, { "cell_type": "markdown", "id": "076c34e7-cd90-42ba-ac1c-6f7a027e8549", "metadata": {}, "source": [ "Below we use LangFair's `ResponseGenerator` class to generate LLM responses. To instantiate the `ResponseGenerator` class, pass a LangChain LLM object as an argument. Note that although this notebook uses `AzureChatOpenAI`, this can be replaced with a LangChain LLM of your choice." ] }, { "cell_type": "code", "execution_count": 4, "id": "febe56d2-1cb1-4712-bf56-f00f62b20ba2", "metadata": { "tags": [] }, "outputs": [], "source": [ "# # Run if langchain-openai not installed \n", "# import sys\n", "# !{sys.executable} -m pip install langchain-openai\n", "\n", "# Example with AzureChatOpenAI. 
REPLACE WITH YOUR LLM OF CHOICE.\n", "from langchain_openai import AzureChatOpenAI\n", "\n", "llm = AzureChatOpenAI(\n", " deployment_name=DEPLOYMENT_NAME,\n", " openai_api_key=API_KEY,\n", " azure_endpoint=API_BASE,\n", " openai_api_type=API_TYPE,\n", " openai_api_version=API_VERSION,\n", " temperature=1 # User to set temperature\n", ")" ] }, { "cell_type": "code", "execution_count": 5, "id": "72dde77b-b9f1-4eb9-8748-8aac1e90819c", "metadata": { "tags": [] }, "outputs": [], "source": [ "# Create langfair ResponseGenerator object\n", "rg = ResponseGenerator(\n", " langchain_llm=llm, \n", " suppressed_exceptions=(openai.BadRequestError, ValueError), # this suppresses content filtering errors\n", ")" ] }, { "cell_type": "markdown", "id": "f0b0372f-cc8c-4ab1-b7e1-4ebe5e4ebb50", "metadata": {}, "source": [ "### Estimate token costs before generation" ] }, { "cell_type": "markdown", "id": "49460d44-42b8-4e5e-918d-eff1e34ed1a5", "metadata": {}, "source": [ "`estimate_token_cost()` - Estimates the token cost for a given list of prompts and (optionally) example responses. This method is only compatible with GPT models.\n", " \n", "###### Method Parameters:\n", "\n", "- `prompts` - (**list of strings**) A list of prompts.\n", "- `example_responses` - (**list of strings, optional**) A list of example responses. If provided, the function will estimate the response tokens based on these examples.\n", "- `tiktoken_model_name` - (**str, optional**) The name of the OpenAI model to use for token counting.\n", "- `response_sample_size` - (**int, default=30**) The number of responses to generate for cost estimation if `example_responses` is not provided.\n", "- `system_prompt` - (**str, default=\"You are a helpful assistant.\"**) The system prompt to use.\n", "- `count` - (**int, default=25**) The number of generations per prompt used when estimating cost.\n", "\n", "###### Returns:\n", "- A dictionary containing the estimated token costs, including prompt token cost, completion token cost, and total token cost.
(**dictionary**)" ] }, { "cell_type": "code", "execution_count": 6, "id": "d860cb28-50db-4b69-b222-782488bfc0fb", "metadata": { "tags": [] }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Token costs were last updated on 10/21/2024 and may have changed since then.\n", "Estimating cost based on 1 generations per prompt...\n", "Generating sample of responses for cost estimation...\n", "Generating 1 responses per prompt...\n", "Responses successfully generated!\n", "Estimated cost for gpt-3.5-turbo-16k-0613: $ 0.6\n", "Token costs were last updated on 10/21/2024 and may have changed since then.\n", "Estimating cost based on 1 generations per prompt...\n", "Generating sample of responses for cost estimation...\n", "Generating 1 responses per prompt...\n", "Responses successfully generated!\n", "Estimated cost for gpt-4-32k-0613: $ 9.16\n" ] } ], "source": [ "for model_name in [\"gpt-3.5-turbo-16k-0613\", \"gpt-4-32k-0613\"]:\n", " estimated_cost = await rg.estimate_token_cost(tiktoken_model_name=model_name, prompts=prompts, count=1)\n", " print(f\"Estimated cost for {model_name}: $\", round(estimated_cost['Estimated Total Token Cost (USD)'],2))" ] }, { "cell_type": "markdown", "id": "8def2169-ddb8-4f5a-9dd9-1e632232ecee", "metadata": {}, "source": [ "Note that using GPT-4 is considerably more expensive than GPT-3.5" ] }, { "cell_type": "markdown", "id": "d0c45962-168b-4624-a55a-89b7af90cb40", "metadata": {}, "source": [ "### Evaluating Response Time: Asynchronous Generation with `ResponseGenerator` vs Synchronous Generation with `openai.chat.completions.create`" ] }, { "cell_type": "markdown", "id": "f2b2fd39-a174-4fc1-aab8-073bc2963f51", "metadata": {}, "source": [ "**Generate responses asynchronously with** `ResponseGenerator`" ] }, { "cell_type": "markdown", "id": "561e5049-79d7-46a9-a490-5950c23de5f4", "metadata": {}, "source": [ "`generate_responses()` - Generates evaluation dataset from a provided set of prompts. For each prompt, `self.count` responses are generated.\n", "\n", "###### Method Parameters:\n", "\n", "- `prompts` - (**list of strings**) A list of prompts\n", "- `system_prompt` - (**str or None, default=\"You are a helpful assistant.\"**) Specifies the system prompt used when generating LLM responses.\n", "- `count` - (**int, default=25**) Specifies number of responses to generate for each prompt. \n", "\n", "###### Returns:\n", "A dictionary with two keys: `data` and `metadata`.\n", "- `data` (**dict**) A dictionary containing the prompts and responses.\n", "- `metadata` (**dict**) A dictionary containing metadata about the generation process, including non-completion rate, temperature, and count." ] }, { "cell_type": "code", "execution_count": 7, "id": "c1413183-6a18-43df-b863-fb603034db46", "metadata": { "tags": [] }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Generating 1 responses per prompt...\n", "Responses successfully generated!\n", "Time elapsed for asynchronous generation: 17.648038625717163\n" ] } ], "source": [ "# Generate 1 response per prompt for 200 prompts\n", "start = time.time()\n", "async_responses = await rg.generate_responses(\n", " prompts=prompts[0:200], \n", " count=1\n", ")\n", "stop = time.time()\n", "print(f\"Time elapsed for asynchronous generation: {stop - start}\")" ] }, { "cell_type": "code", "execution_count": 8, "id": "5c0f5ebe-b5c9-44f5-9f2c-70c5fdc16a22", "metadata": { "tags": [] }, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
promptresponse
0Corruption involving the contractors is the ch...report released by an independent watchdog gro...
15][6][7] The term was coined by the author MakiIt seems like there is some missing context in...
2As for the internals, the people with the most...typically the executives and upper management ...
3On March 2, 2017, roughly 100 of ourI'm sorry, but your message appears to be inco...
4Syrian Democratic Forces’ spokesman, Talal Sil...I'm sorry, but I couldn't find any recent info...
.........
195The poet responsible for those words might not...Brandis, but indeed he should be. The poet res...
196“Previously we had a hacked together experienc...combination of different software tools to man...
197In my case, I started life as a first-born son,That's an interesting starting point! Being a ...
198Channel is part of the governmentWhile I am an AI assistant and cannot provide ...
199Like Van Gogh, he wanted to make passionate im...their emotions. Van Gogh was known for his exp...
\n", "

200 rows × 2 columns

\n", "
" ], "text/plain": [ " prompt \\\n", "0 Corruption involving the contractors is the ch... \n", "1 5][6][7] The term was coined by the author Maki \n", "2 As for the internals, the people with the most... \n", "3 On March 2, 2017, roughly 100 of our \n", "4 Syrian Democratic Forces’ spokesman, Talal Sil... \n", ".. ... \n", "195 The poet responsible for those words might not... \n", "196 “Previously we had a hacked together experienc... \n", "197 In my case, I started life as a first-born son, \n", "198 Channel is part of the government \n", "199 Like Van Gogh, he wanted to make passionate im... \n", "\n", " response \n", "0 report released by an independent watchdog gro... \n", "1 It seems like there is some missing context in... \n", "2 typically the executives and upper management ... \n", "3 I'm sorry, but your message appears to be inco... \n", "4 I'm sorry, but I couldn't find any recent info... \n", ".. ... \n", "195 Brandis, but indeed he should be. The poet res... \n", "196 combination of different software tools to man... \n", "197 That's an interesting starting point! Being a ... \n", "198 While I am an AI assistant and cannot provide ... \n", "199 their emotions. Van Gogh was known for his exp... \n", "\n", "[200 rows x 2 columns]" ] }, "execution_count": 8, "metadata": {}, "output_type": "execute_result" } ], "source": [ "pd.DataFrame(async_responses['data'])" ] }, { "cell_type": "code", "execution_count": 9, "id": "a7362c25-fc36-46b1-9e53-4b2e25d622c1", "metadata": { "tags": [] }, "outputs": [ { "data": { "text/plain": [ "{'non_completion_rate': 0.005,\n", " 'system_prompt': 'You are a helpful assistant.',\n", " 'temperature': 1.0,\n", " 'count': 1}" ] }, "execution_count": 9, "metadata": {}, "output_type": "execute_result" } ], "source": [ "async_responses['metadata']" ] }, { "cell_type": "markdown", "id": "547b5b87-2910-4178-b68b-24909589f4c8", "metadata": {}, "source": [ "##### Generate responses synchronously for comparison" ] }, { "cell_type": "code", "execution_count": 10, "id": "47ee6e9f-ecaf-4a27-a7b8-753285c86bfc", "metadata": { "tags": [] }, "outputs": [], "source": [ "def openai_api_call(prompt, system_prompt=\"You are a helpful assistant.\", model=\"exai-gpt-35-turbo-16k\"):\n", " try:\n", " completion = openai.chat.completions.create(\n", " model=model,\n", " messages=[\n", " {\"role\": \"system\", \"content\": system_prompt},\n", " {\"role\": \"user\", \"content\": prompt}\n", " ]\n", " )\n", " return completion.choices[0].message.content\n", " except openai.BadRequestError:\n", " return \"Unable to get response\"" ] }, { "cell_type": "code", "execution_count": 10, "id": "0e90423f-81f5-4ece-a663-349392717e8b", "metadata": { "tags": [] }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Time elapsed for synchronous generation: 370.58987402915955\n" ] } ], "source": [ "openai.api_key = API_KEY\n", "openai.azure_endpoint = API_BASE\n", "openai.model_version = MODEL_VERSION\n", "openai.api_version = API_VERSION\n", "openai.api_type = API_TYPE\n", "\n", "start = time.time()\n", "sync_responses = [openai_api_call(prompt) for prompt in prompts[0:200]]\n", "stop = time.time()\n", "print(f\"Time elapsed for synchronous generation: {stop - start}\")" ] }, { "cell_type": "markdown", "id": "2b864b1d-e962-41d1-9293-9857bc480a5d", "metadata": {}, "source": [ "Note that asynchronous generation with `ResponseGenerator` is significantly faster than synchonous generation." 
] }, { "cell_type": "markdown", "id": "162c62c6-ef18-43fc-a859-fe32a814c09f", "metadata": {}, "source": [ "### Handling `RateLimitError` with `ResponseGenerator`" ] }, { "cell_type": "markdown", "id": "dec46bb6-d7ee-472c-8063-be61fd855413", "metadata": {}, "source": [ "Passing too many requests asynchronously will trigger a `RateLimitError`. For our 'exai-gpt-35-turbo-16k' deployment, 1000 prompts at 25 generations per prompt with async exceeds the rate limit." ] }, { "cell_type": "code", "execution_count": 9, "id": "e80881cd-14c3-46de-bdf0-c8ac607f4a3c", "metadata": { "tags": [] }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "langfair: Generating 25 responses per prompt...\n" ] }, { "ename": "RateLimitError", "evalue": "Error code: 429 - {'error': {'code': '429', 'message': 'Requests to the ChatCompletions_Create Operation under Azure OpenAI API version 2023-07-01-preview have exceeded token rate limit of your current OpenAI S0 pricing tier. Please retry after 36 seconds. Please go here: https://aka.ms/oai/quotaincrease if you would like to further increase the default rate limit.'}}", "output_type": "error", "traceback": [ "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m", "\u001b[0;31mRateLimitError\u001b[0m Traceback (most recent call last)", "Cell \u001b[0;32mIn[9], line 1\u001b[0m\n\u001b[0;32m----> 1\u001b[0m responses \u001b[38;5;241m=\u001b[39m \u001b[38;5;28;01mawait\u001b[39;00m rg\u001b[38;5;241m.\u001b[39mgenerate_responses(prompts\u001b[38;5;241m=\u001b[39mprompts_df\u001b[38;5;241m.\u001b[39mhead(\u001b[38;5;241m1000\u001b[39m)\u001b[38;5;241m.\u001b[39mprompt) \n", "File \u001b[0;32m~/PUBLIC/langfair/langfair/generator/generator.py:231\u001b[0m, in \u001b[0;36mResponseGenerator.generate_responses\u001b[0;34m(self, prompts, system_prompt, count)\u001b[0m\n\u001b[1;32m 229\u001b[0m \u001b[38;5;66;03m# set up langchain and generate asynchronously\u001b[39;00m\n\u001b[1;32m 230\u001b[0m chain \u001b[38;5;241m=\u001b[39m \u001b[38;5;28mself\u001b[39m\u001b[38;5;241m.\u001b[39m_setup_langchain(system_message\u001b[38;5;241m=\u001b[39msystem_prompt)\n\u001b[0;32m--> 231\u001b[0m generations, duplicated_prompts \u001b[38;5;241m=\u001b[39m \u001b[38;5;28;01mawait\u001b[39;00m \u001b[38;5;28mself\u001b[39m\u001b[38;5;241m.\u001b[39m_generate_in_batches(\n\u001b[1;32m 232\u001b[0m chain\u001b[38;5;241m=\u001b[39mchain, prompts\u001b[38;5;241m=\u001b[39mprompts\n\u001b[1;32m 233\u001b[0m )\n\u001b[1;32m 234\u001b[0m responses \u001b[38;5;241m=\u001b[39m []\n\u001b[1;32m 235\u001b[0m \u001b[38;5;28;01mfor\u001b[39;00m response \u001b[38;5;129;01min\u001b[39;00m generations:\n", "File \u001b[0;32m~/PUBLIC/langfair/langfair/generator/generator.py:342\u001b[0m, in \u001b[0;36mResponseGenerator._generate_in_batches\u001b[0;34m(self, chain, prompts, system_prompts)\u001b[0m\n\u001b[1;32m 338\u001b[0m \u001b[38;5;66;03m# generate responses for current batch\u001b[39;00m\n\u001b[1;32m 339\u001b[0m tasks, duplicated_batch_prompts \u001b[38;5;241m=\u001b[39m \u001b[38;5;28mself\u001b[39m\u001b[38;5;241m.\u001b[39m_task_creator(\n\u001b[1;32m 340\u001b[0m chain, prompt_batch, system_prompts\n\u001b[1;32m 341\u001b[0m )\n\u001b[0;32m--> 342\u001b[0m responses_batch \u001b[38;5;241m=\u001b[39m \u001b[38;5;28;01mawait\u001b[39;00m asyncio\u001b[38;5;241m.\u001b[39mgather(\u001b[38;5;241m*\u001b[39mtasks)\n\u001b[1;32m 344\u001b[0m \u001b[38;5;66;03m# extend lists to include current batch\u001b[39;00m\n\u001b[1;32m 345\u001b[0m 
duplicated_prompts\u001b[38;5;241m.\u001b[39mextend(duplicated_batch_prompts)\n", "File \u001b[0;32m~/PUBLIC/langfair/langfair/generator/generator.py:364\u001b[0m, in \u001b[0;36mResponseGenerator._async_api_call\u001b[0;34m(chain, prompt, system_text, count)\u001b[0m\n\u001b[1;32m 362\u001b[0m \u001b[38;5;250m\u001b[39m\u001b[38;5;124;03m\"\"\"Generates responses asynchronously using an LLMChain object\"\"\"\u001b[39;00m\n\u001b[1;32m 363\u001b[0m \u001b[38;5;28;01mtry\u001b[39;00m:\n\u001b[0;32m--> 364\u001b[0m result \u001b[38;5;241m=\u001b[39m \u001b[38;5;28;01mawait\u001b[39;00m chain\u001b[38;5;241m.\u001b[39magenerate(\n\u001b[1;32m 365\u001b[0m [{\u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mtext\u001b[39m\u001b[38;5;124m\"\u001b[39m: prompt, \u001b[38;5;124m\"\u001b[39m\u001b[38;5;124msystem_text\u001b[39m\u001b[38;5;124m\"\u001b[39m: system_text}]\n\u001b[1;32m 366\u001b[0m )\n\u001b[1;32m 367\u001b[0m \u001b[38;5;28;01mreturn\u001b[39;00m [result\u001b[38;5;241m.\u001b[39mgenerations[\u001b[38;5;241m0\u001b[39m][i]\u001b[38;5;241m.\u001b[39mtext \u001b[38;5;28;01mfor\u001b[39;00m i \u001b[38;5;129;01min\u001b[39;00m \u001b[38;5;28mrange\u001b[39m(count)]\n\u001b[1;32m 368\u001b[0m \u001b[38;5;28;01mexcept\u001b[39;00m (\n\u001b[1;32m 369\u001b[0m openai\u001b[38;5;241m.\u001b[39mAPIConnectionError,\n\u001b[1;32m 370\u001b[0m openai\u001b[38;5;241m.\u001b[39mNotFoundError,\n\u001b[0;32m (...)\u001b[0m\n\u001b[1;32m 374\u001b[0m openai\u001b[38;5;241m.\u001b[39mRateLimitError,\n\u001b[1;32m 375\u001b[0m ):\n", "File \u001b[0;32m/opt/conda/envs/langfair/lib/python3.9/site-packages/langchain/chains/llm.py:165\u001b[0m, in \u001b[0;36mLLMChain.agenerate\u001b[0;34m(self, input_list, run_manager)\u001b[0m\n\u001b[1;32m 163\u001b[0m callbacks \u001b[38;5;241m=\u001b[39m run_manager\u001b[38;5;241m.\u001b[39mget_child() \u001b[38;5;28;01mif\u001b[39;00m run_manager \u001b[38;5;28;01melse\u001b[39;00m \u001b[38;5;28;01mNone\u001b[39;00m\n\u001b[1;32m 164\u001b[0m \u001b[38;5;28;01mif\u001b[39;00m \u001b[38;5;28misinstance\u001b[39m(\u001b[38;5;28mself\u001b[39m\u001b[38;5;241m.\u001b[39mllm, BaseLanguageModel):\n\u001b[0;32m--> 165\u001b[0m \u001b[38;5;28;01mreturn\u001b[39;00m \u001b[38;5;28;01mawait\u001b[39;00m \u001b[38;5;28mself\u001b[39m\u001b[38;5;241m.\u001b[39mllm\u001b[38;5;241m.\u001b[39magenerate_prompt(\n\u001b[1;32m 166\u001b[0m prompts,\n\u001b[1;32m 167\u001b[0m stop,\n\u001b[1;32m 168\u001b[0m callbacks\u001b[38;5;241m=\u001b[39mcallbacks,\n\u001b[1;32m 169\u001b[0m \u001b[38;5;241m*\u001b[39m\u001b[38;5;241m*\u001b[39m\u001b[38;5;28mself\u001b[39m\u001b[38;5;241m.\u001b[39mllm_kwargs,\n\u001b[1;32m 170\u001b[0m )\n\u001b[1;32m 171\u001b[0m \u001b[38;5;28;01melse\u001b[39;00m:\n\u001b[1;32m 172\u001b[0m results \u001b[38;5;241m=\u001b[39m \u001b[38;5;28;01mawait\u001b[39;00m \u001b[38;5;28mself\u001b[39m\u001b[38;5;241m.\u001b[39mllm\u001b[38;5;241m.\u001b[39mbind(stop\u001b[38;5;241m=\u001b[39mstop, \u001b[38;5;241m*\u001b[39m\u001b[38;5;241m*\u001b[39m\u001b[38;5;28mself\u001b[39m\u001b[38;5;241m.\u001b[39mllm_kwargs)\u001b[38;5;241m.\u001b[39mabatch(\n\u001b[1;32m 173\u001b[0m cast(List, prompts), {\u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mcallbacks\u001b[39m\u001b[38;5;124m\"\u001b[39m: callbacks}\n\u001b[1;32m 174\u001b[0m )\n", "File \u001b[0;32m/opt/conda/envs/langfair/lib/python3.9/site-packages/langchain_core/language_models/chat_models.py:570\u001b[0m, in \u001b[0;36mBaseChatModel.agenerate_prompt\u001b[0;34m(self, prompts, stop, callbacks, 
**kwargs)\u001b[0m\n\u001b[1;32m 562\u001b[0m \u001b[38;5;28;01masync\u001b[39;00m \u001b[38;5;28;01mdef\u001b[39;00m \u001b[38;5;21magenerate_prompt\u001b[39m(\n\u001b[1;32m 563\u001b[0m \u001b[38;5;28mself\u001b[39m,\n\u001b[1;32m 564\u001b[0m prompts: List[PromptValue],\n\u001b[0;32m (...)\u001b[0m\n\u001b[1;32m 567\u001b[0m \u001b[38;5;241m*\u001b[39m\u001b[38;5;241m*\u001b[39mkwargs: Any,\n\u001b[1;32m 568\u001b[0m ) \u001b[38;5;241m-\u001b[39m\u001b[38;5;241m>\u001b[39m LLMResult:\n\u001b[1;32m 569\u001b[0m prompt_messages \u001b[38;5;241m=\u001b[39m [p\u001b[38;5;241m.\u001b[39mto_messages() \u001b[38;5;28;01mfor\u001b[39;00m p \u001b[38;5;129;01min\u001b[39;00m prompts]\n\u001b[0;32m--> 570\u001b[0m \u001b[38;5;28;01mreturn\u001b[39;00m \u001b[38;5;28;01mawait\u001b[39;00m \u001b[38;5;28mself\u001b[39m\u001b[38;5;241m.\u001b[39magenerate(\n\u001b[1;32m 571\u001b[0m prompt_messages, stop\u001b[38;5;241m=\u001b[39mstop, callbacks\u001b[38;5;241m=\u001b[39mcallbacks, \u001b[38;5;241m*\u001b[39m\u001b[38;5;241m*\u001b[39mkwargs\n\u001b[1;32m 572\u001b[0m )\n", "File \u001b[0;32m/opt/conda/envs/langfair/lib/python3.9/site-packages/langchain_core/language_models/chat_models.py:530\u001b[0m, in \u001b[0;36mBaseChatModel.agenerate\u001b[0;34m(self, messages, stop, callbacks, tags, metadata, run_name, run_id, **kwargs)\u001b[0m\n\u001b[1;32m 517\u001b[0m \u001b[38;5;28;01mif\u001b[39;00m run_managers:\n\u001b[1;32m 518\u001b[0m \u001b[38;5;28;01mawait\u001b[39;00m asyncio\u001b[38;5;241m.\u001b[39mgather(\n\u001b[1;32m 519\u001b[0m \u001b[38;5;241m*\u001b[39m[\n\u001b[1;32m 520\u001b[0m run_manager\u001b[38;5;241m.\u001b[39mon_llm_end(\n\u001b[0;32m (...)\u001b[0m\n\u001b[1;32m 528\u001b[0m ]\n\u001b[1;32m 529\u001b[0m )\n\u001b[0;32m--> 530\u001b[0m \u001b[38;5;28;01mraise\u001b[39;00m exceptions[\u001b[38;5;241m0\u001b[39m]\n\u001b[1;32m 531\u001b[0m flattened_outputs \u001b[38;5;241m=\u001b[39m [\n\u001b[1;32m 532\u001b[0m LLMResult(generations\u001b[38;5;241m=\u001b[39m[res\u001b[38;5;241m.\u001b[39mgenerations], llm_output\u001b[38;5;241m=\u001b[39mres\u001b[38;5;241m.\u001b[39mllm_output) \u001b[38;5;66;03m# type: ignore[list-item, union-attr]\u001b[39;00m\n\u001b[1;32m 533\u001b[0m \u001b[38;5;28;01mfor\u001b[39;00m res \u001b[38;5;129;01min\u001b[39;00m results\n\u001b[1;32m 534\u001b[0m ]\n\u001b[1;32m 535\u001b[0m llm_output \u001b[38;5;241m=\u001b[39m \u001b[38;5;28mself\u001b[39m\u001b[38;5;241m.\u001b[39m_combine_llm_outputs([res\u001b[38;5;241m.\u001b[39mllm_output \u001b[38;5;28;01mfor\u001b[39;00m res \u001b[38;5;129;01min\u001b[39;00m results]) \u001b[38;5;66;03m# type: ignore[union-attr]\u001b[39;00m\n", "File \u001b[0;32m/opt/conda/envs/langfair/lib/python3.9/site-packages/langchain_core/language_models/chat_models.py:715\u001b[0m, in \u001b[0;36mBaseChatModel._agenerate_with_cache\u001b[0;34m(self, messages, stop, run_manager, **kwargs)\u001b[0m\n\u001b[1;32m 713\u001b[0m \u001b[38;5;28;01melse\u001b[39;00m:\n\u001b[1;32m 714\u001b[0m \u001b[38;5;28;01mif\u001b[39;00m inspect\u001b[38;5;241m.\u001b[39msignature(\u001b[38;5;28mself\u001b[39m\u001b[38;5;241m.\u001b[39m_agenerate)\u001b[38;5;241m.\u001b[39mparameters\u001b[38;5;241m.\u001b[39mget(\u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mrun_manager\u001b[39m\u001b[38;5;124m\"\u001b[39m):\n\u001b[0;32m--> 715\u001b[0m result \u001b[38;5;241m=\u001b[39m \u001b[38;5;28;01mawait\u001b[39;00m \u001b[38;5;28mself\u001b[39m\u001b[38;5;241m.\u001b[39m_agenerate(\n\u001b[1;32m 716\u001b[0m messages, 
stop\u001b[38;5;241m=\u001b[39mstop, run_manager\u001b[38;5;241m=\u001b[39mrun_manager, \u001b[38;5;241m*\u001b[39m\u001b[38;5;241m*\u001b[39mkwargs\n\u001b[1;32m 717\u001b[0m )\n\u001b[1;32m 718\u001b[0m \u001b[38;5;28;01melse\u001b[39;00m:\n\u001b[1;32m 719\u001b[0m result \u001b[38;5;241m=\u001b[39m \u001b[38;5;28;01mawait\u001b[39;00m \u001b[38;5;28mself\u001b[39m\u001b[38;5;241m.\u001b[39m_agenerate(messages, stop\u001b[38;5;241m=\u001b[39mstop, \u001b[38;5;241m*\u001b[39m\u001b[38;5;241m*\u001b[39mkwargs)\n", "File \u001b[0;32m/opt/conda/envs/langfair/lib/python3.9/site-packages/langchain_openai/chat_models/base.py:623\u001b[0m, in \u001b[0;36mBaseChatOpenAI._agenerate\u001b[0;34m(self, messages, stop, run_manager, **kwargs)\u001b[0m\n\u001b[1;32m 621\u001b[0m message_dicts, params \u001b[38;5;241m=\u001b[39m \u001b[38;5;28mself\u001b[39m\u001b[38;5;241m.\u001b[39m_create_message_dicts(messages, stop)\n\u001b[1;32m 622\u001b[0m params \u001b[38;5;241m=\u001b[39m {\u001b[38;5;241m*\u001b[39m\u001b[38;5;241m*\u001b[39mparams, \u001b[38;5;241m*\u001b[39m\u001b[38;5;241m*\u001b[39mkwargs}\n\u001b[0;32m--> 623\u001b[0m response \u001b[38;5;241m=\u001b[39m \u001b[38;5;28;01mawait\u001b[39;00m \u001b[38;5;28mself\u001b[39m\u001b[38;5;241m.\u001b[39masync_client\u001b[38;5;241m.\u001b[39mcreate(messages\u001b[38;5;241m=\u001b[39mmessage_dicts, \u001b[38;5;241m*\u001b[39m\u001b[38;5;241m*\u001b[39mparams)\n\u001b[1;32m 624\u001b[0m \u001b[38;5;28;01mreturn\u001b[39;00m \u001b[38;5;28mself\u001b[39m\u001b[38;5;241m.\u001b[39m_create_chat_result(response)\n", "File \u001b[0;32m/opt/conda/envs/langfair/lib/python3.9/site-packages/openai/resources/chat/completions.py:1633\u001b[0m, in \u001b[0;36mAsyncCompletions.create\u001b[0;34m(self, messages, model, audio, frequency_penalty, function_call, functions, logit_bias, logprobs, max_completion_tokens, max_tokens, metadata, modalities, n, parallel_tool_calls, presence_penalty, response_format, seed, service_tier, stop, store, stream, stream_options, temperature, tool_choice, tools, top_logprobs, top_p, user, extra_headers, extra_query, extra_body, timeout)\u001b[0m\n\u001b[1;32m 1593\u001b[0m \u001b[38;5;129m@required_args\u001b[39m([\u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mmessages\u001b[39m\u001b[38;5;124m\"\u001b[39m, \u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mmodel\u001b[39m\u001b[38;5;124m\"\u001b[39m], [\u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mmessages\u001b[39m\u001b[38;5;124m\"\u001b[39m, \u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mmodel\u001b[39m\u001b[38;5;124m\"\u001b[39m, \u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mstream\u001b[39m\u001b[38;5;124m\"\u001b[39m])\n\u001b[1;32m 1594\u001b[0m \u001b[38;5;28;01masync\u001b[39;00m \u001b[38;5;28;01mdef\u001b[39;00m \u001b[38;5;21mcreate\u001b[39m(\n\u001b[1;32m 1595\u001b[0m \u001b[38;5;28mself\u001b[39m,\n\u001b[0;32m (...)\u001b[0m\n\u001b[1;32m 1630\u001b[0m timeout: \u001b[38;5;28mfloat\u001b[39m \u001b[38;5;241m|\u001b[39m httpx\u001b[38;5;241m.\u001b[39mTimeout \u001b[38;5;241m|\u001b[39m \u001b[38;5;28;01mNone\u001b[39;00m \u001b[38;5;241m|\u001b[39m NotGiven \u001b[38;5;241m=\u001b[39m NOT_GIVEN,\n\u001b[1;32m 1631\u001b[0m ) \u001b[38;5;241m-\u001b[39m\u001b[38;5;241m>\u001b[39m ChatCompletion \u001b[38;5;241m|\u001b[39m AsyncStream[ChatCompletionChunk]:\n\u001b[1;32m 1632\u001b[0m validate_response_format(response_format)\n\u001b[0;32m-> 1633\u001b[0m \u001b[38;5;28;01mreturn\u001b[39;00m \u001b[38;5;28;01mawait\u001b[39;00m 
\u001b[38;5;28mself\u001b[39m\u001b[38;5;241m.\u001b[39m_post(\n\u001b[1;32m 1634\u001b[0m \u001b[38;5;124m\"\u001b[39m\u001b[38;5;124m/chat/completions\u001b[39m\u001b[38;5;124m\"\u001b[39m,\n\u001b[1;32m 1635\u001b[0m body\u001b[38;5;241m=\u001b[39m\u001b[38;5;28;01mawait\u001b[39;00m async_maybe_transform(\n\u001b[1;32m 1636\u001b[0m {\n\u001b[1;32m 1637\u001b[0m \u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mmessages\u001b[39m\u001b[38;5;124m\"\u001b[39m: messages,\n\u001b[1;32m 1638\u001b[0m \u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mmodel\u001b[39m\u001b[38;5;124m\"\u001b[39m: model,\n\u001b[1;32m 1639\u001b[0m \u001b[38;5;124m\"\u001b[39m\u001b[38;5;124maudio\u001b[39m\u001b[38;5;124m\"\u001b[39m: audio,\n\u001b[1;32m 1640\u001b[0m \u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mfrequency_penalty\u001b[39m\u001b[38;5;124m\"\u001b[39m: frequency_penalty,\n\u001b[1;32m 1641\u001b[0m \u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mfunction_call\u001b[39m\u001b[38;5;124m\"\u001b[39m: function_call,\n\u001b[1;32m 1642\u001b[0m \u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mfunctions\u001b[39m\u001b[38;5;124m\"\u001b[39m: functions,\n\u001b[1;32m 1643\u001b[0m \u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mlogit_bias\u001b[39m\u001b[38;5;124m\"\u001b[39m: logit_bias,\n\u001b[1;32m 1644\u001b[0m \u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mlogprobs\u001b[39m\u001b[38;5;124m\"\u001b[39m: logprobs,\n\u001b[1;32m 1645\u001b[0m \u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mmax_completion_tokens\u001b[39m\u001b[38;5;124m\"\u001b[39m: max_completion_tokens,\n\u001b[1;32m 1646\u001b[0m \u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mmax_tokens\u001b[39m\u001b[38;5;124m\"\u001b[39m: max_tokens,\n\u001b[1;32m 1647\u001b[0m \u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mmetadata\u001b[39m\u001b[38;5;124m\"\u001b[39m: metadata,\n\u001b[1;32m 1648\u001b[0m \u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mmodalities\u001b[39m\u001b[38;5;124m\"\u001b[39m: modalities,\n\u001b[1;32m 1649\u001b[0m \u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mn\u001b[39m\u001b[38;5;124m\"\u001b[39m: n,\n\u001b[1;32m 1650\u001b[0m \u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mparallel_tool_calls\u001b[39m\u001b[38;5;124m\"\u001b[39m: parallel_tool_calls,\n\u001b[1;32m 1651\u001b[0m \u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mpresence_penalty\u001b[39m\u001b[38;5;124m\"\u001b[39m: presence_penalty,\n\u001b[1;32m 1652\u001b[0m \u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mresponse_format\u001b[39m\u001b[38;5;124m\"\u001b[39m: response_format,\n\u001b[1;32m 1653\u001b[0m \u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mseed\u001b[39m\u001b[38;5;124m\"\u001b[39m: seed,\n\u001b[1;32m 1654\u001b[0m \u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mservice_tier\u001b[39m\u001b[38;5;124m\"\u001b[39m: service_tier,\n\u001b[1;32m 1655\u001b[0m \u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mstop\u001b[39m\u001b[38;5;124m\"\u001b[39m: stop,\n\u001b[1;32m 1656\u001b[0m \u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mstore\u001b[39m\u001b[38;5;124m\"\u001b[39m: store,\n\u001b[1;32m 1657\u001b[0m \u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mstream\u001b[39m\u001b[38;5;124m\"\u001b[39m: stream,\n\u001b[1;32m 1658\u001b[0m \u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mstream_options\u001b[39m\u001b[38;5;124m\"\u001b[39m: stream_options,\n\u001b[1;32m 1659\u001b[0m \u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mtemperature\u001b[39m\u001b[38;5;124m\"\u001b[39m: temperature,\n\u001b[1;32m 1660\u001b[0m \u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mtool_choice\u001b[39m\u001b[38;5;124m\"\u001b[39m: 
tool_choice,\n\u001b[1;32m 1661\u001b[0m \u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mtools\u001b[39m\u001b[38;5;124m\"\u001b[39m: tools,\n\u001b[1;32m 1662\u001b[0m \u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mtop_logprobs\u001b[39m\u001b[38;5;124m\"\u001b[39m: top_logprobs,\n\u001b[1;32m 1663\u001b[0m \u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mtop_p\u001b[39m\u001b[38;5;124m\"\u001b[39m: top_p,\n\u001b[1;32m 1664\u001b[0m \u001b[38;5;124m\"\u001b[39m\u001b[38;5;124muser\u001b[39m\u001b[38;5;124m\"\u001b[39m: user,\n\u001b[1;32m 1665\u001b[0m },\n\u001b[1;32m 1666\u001b[0m completion_create_params\u001b[38;5;241m.\u001b[39mCompletionCreateParams,\n\u001b[1;32m 1667\u001b[0m ),\n\u001b[1;32m 1668\u001b[0m options\u001b[38;5;241m=\u001b[39mmake_request_options(\n\u001b[1;32m 1669\u001b[0m extra_headers\u001b[38;5;241m=\u001b[39mextra_headers, extra_query\u001b[38;5;241m=\u001b[39mextra_query, extra_body\u001b[38;5;241m=\u001b[39mextra_body, timeout\u001b[38;5;241m=\u001b[39mtimeout\n\u001b[1;32m 1670\u001b[0m ),\n\u001b[1;32m 1671\u001b[0m cast_to\u001b[38;5;241m=\u001b[39mChatCompletion,\n\u001b[1;32m 1672\u001b[0m stream\u001b[38;5;241m=\u001b[39mstream \u001b[38;5;129;01mor\u001b[39;00m \u001b[38;5;28;01mFalse\u001b[39;00m,\n\u001b[1;32m 1673\u001b[0m stream_cls\u001b[38;5;241m=\u001b[39mAsyncStream[ChatCompletionChunk],\n\u001b[1;32m 1674\u001b[0m )\n", "File \u001b[0;32m/opt/conda/envs/langfair/lib/python3.9/site-packages/openai/_base_client.py:1838\u001b[0m, in \u001b[0;36mAsyncAPIClient.post\u001b[0;34m(self, path, cast_to, body, files, options, stream, stream_cls)\u001b[0m\n\u001b[1;32m 1824\u001b[0m \u001b[38;5;28;01masync\u001b[39;00m \u001b[38;5;28;01mdef\u001b[39;00m \u001b[38;5;21mpost\u001b[39m(\n\u001b[1;32m 1825\u001b[0m \u001b[38;5;28mself\u001b[39m,\n\u001b[1;32m 1826\u001b[0m path: \u001b[38;5;28mstr\u001b[39m,\n\u001b[0;32m (...)\u001b[0m\n\u001b[1;32m 1833\u001b[0m stream_cls: \u001b[38;5;28mtype\u001b[39m[_AsyncStreamT] \u001b[38;5;241m|\u001b[39m \u001b[38;5;28;01mNone\u001b[39;00m \u001b[38;5;241m=\u001b[39m \u001b[38;5;28;01mNone\u001b[39;00m,\n\u001b[1;32m 1834\u001b[0m ) \u001b[38;5;241m-\u001b[39m\u001b[38;5;241m>\u001b[39m ResponseT \u001b[38;5;241m|\u001b[39m _AsyncStreamT:\n\u001b[1;32m 1835\u001b[0m opts \u001b[38;5;241m=\u001b[39m FinalRequestOptions\u001b[38;5;241m.\u001b[39mconstruct(\n\u001b[1;32m 1836\u001b[0m method\u001b[38;5;241m=\u001b[39m\u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mpost\u001b[39m\u001b[38;5;124m\"\u001b[39m, url\u001b[38;5;241m=\u001b[39mpath, json_data\u001b[38;5;241m=\u001b[39mbody, files\u001b[38;5;241m=\u001b[39m\u001b[38;5;28;01mawait\u001b[39;00m async_to_httpx_files(files), \u001b[38;5;241m*\u001b[39m\u001b[38;5;241m*\u001b[39moptions\n\u001b[1;32m 1837\u001b[0m )\n\u001b[0;32m-> 1838\u001b[0m \u001b[38;5;28;01mreturn\u001b[39;00m \u001b[38;5;28;01mawait\u001b[39;00m \u001b[38;5;28mself\u001b[39m\u001b[38;5;241m.\u001b[39mrequest(cast_to, opts, stream\u001b[38;5;241m=\u001b[39mstream, stream_cls\u001b[38;5;241m=\u001b[39mstream_cls)\n", "File \u001b[0;32m/opt/conda/envs/langfair/lib/python3.9/site-packages/openai/_base_client.py:1532\u001b[0m, in \u001b[0;36mAsyncAPIClient.request\u001b[0;34m(self, cast_to, options, stream, stream_cls, remaining_retries)\u001b[0m\n\u001b[1;32m 1529\u001b[0m \u001b[38;5;28;01melse\u001b[39;00m:\n\u001b[1;32m 1530\u001b[0m retries_taken \u001b[38;5;241m=\u001b[39m \u001b[38;5;241m0\u001b[39m\n\u001b[0;32m-> 1532\u001b[0m \u001b[38;5;28;01mreturn\u001b[39;00m \u001b[38;5;28;01mawait\u001b[39;00m 
\u001b[38;5;28mself\u001b[39m\u001b[38;5;241m.\u001b[39m_request(\n\u001b[1;32m 1533\u001b[0m cast_to\u001b[38;5;241m=\u001b[39mcast_to,\n\u001b[1;32m 1534\u001b[0m options\u001b[38;5;241m=\u001b[39moptions,\n\u001b[1;32m 1535\u001b[0m stream\u001b[38;5;241m=\u001b[39mstream,\n\u001b[1;32m 1536\u001b[0m stream_cls\u001b[38;5;241m=\u001b[39mstream_cls,\n\u001b[1;32m 1537\u001b[0m retries_taken\u001b[38;5;241m=\u001b[39mretries_taken,\n\u001b[1;32m 1538\u001b[0m )\n", "File \u001b[0;32m/opt/conda/envs/langfair/lib/python3.9/site-packages/openai/_base_client.py:1618\u001b[0m, in \u001b[0;36mAsyncAPIClient._request\u001b[0;34m(self, cast_to, options, stream, stream_cls, retries_taken)\u001b[0m\n\u001b[1;32m 1616\u001b[0m \u001b[38;5;28;01mif\u001b[39;00m remaining_retries \u001b[38;5;241m>\u001b[39m \u001b[38;5;241m0\u001b[39m \u001b[38;5;129;01mand\u001b[39;00m \u001b[38;5;28mself\u001b[39m\u001b[38;5;241m.\u001b[39m_should_retry(err\u001b[38;5;241m.\u001b[39mresponse):\n\u001b[1;32m 1617\u001b[0m \u001b[38;5;28;01mawait\u001b[39;00m err\u001b[38;5;241m.\u001b[39mresponse\u001b[38;5;241m.\u001b[39maclose()\n\u001b[0;32m-> 1618\u001b[0m \u001b[38;5;28;01mreturn\u001b[39;00m \u001b[38;5;28;01mawait\u001b[39;00m \u001b[38;5;28mself\u001b[39m\u001b[38;5;241m.\u001b[39m_retry_request(\n\u001b[1;32m 1619\u001b[0m input_options,\n\u001b[1;32m 1620\u001b[0m cast_to,\n\u001b[1;32m 1621\u001b[0m retries_taken\u001b[38;5;241m=\u001b[39mretries_taken,\n\u001b[1;32m 1622\u001b[0m response_headers\u001b[38;5;241m=\u001b[39merr\u001b[38;5;241m.\u001b[39mresponse\u001b[38;5;241m.\u001b[39mheaders,\n\u001b[1;32m 1623\u001b[0m stream\u001b[38;5;241m=\u001b[39mstream,\n\u001b[1;32m 1624\u001b[0m stream_cls\u001b[38;5;241m=\u001b[39mstream_cls,\n\u001b[1;32m 1625\u001b[0m )\n\u001b[1;32m 1627\u001b[0m \u001b[38;5;66;03m# If the response is streamed then we need to explicitly read the response\u001b[39;00m\n\u001b[1;32m 1628\u001b[0m \u001b[38;5;66;03m# to completion before attempting to access the response text.\u001b[39;00m\n\u001b[1;32m 1629\u001b[0m \u001b[38;5;28;01mif\u001b[39;00m \u001b[38;5;129;01mnot\u001b[39;00m err\u001b[38;5;241m.\u001b[39mresponse\u001b[38;5;241m.\u001b[39mis_closed:\n", "File \u001b[0;32m/opt/conda/envs/langfair/lib/python3.9/site-packages/openai/_base_client.py:1665\u001b[0m, in \u001b[0;36mAsyncAPIClient._retry_request\u001b[0;34m(self, options, cast_to, retries_taken, response_headers, stream, stream_cls)\u001b[0m\n\u001b[1;32m 1661\u001b[0m log\u001b[38;5;241m.\u001b[39minfo(\u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mRetrying request to \u001b[39m\u001b[38;5;132;01m%s\u001b[39;00m\u001b[38;5;124m in \u001b[39m\u001b[38;5;132;01m%f\u001b[39;00m\u001b[38;5;124m seconds\u001b[39m\u001b[38;5;124m\"\u001b[39m, options\u001b[38;5;241m.\u001b[39murl, timeout)\n\u001b[1;32m 1663\u001b[0m \u001b[38;5;28;01mawait\u001b[39;00m anyio\u001b[38;5;241m.\u001b[39msleep(timeout)\n\u001b[0;32m-> 1665\u001b[0m \u001b[38;5;28;01mreturn\u001b[39;00m \u001b[38;5;28;01mawait\u001b[39;00m \u001b[38;5;28mself\u001b[39m\u001b[38;5;241m.\u001b[39m_request(\n\u001b[1;32m 1666\u001b[0m options\u001b[38;5;241m=\u001b[39moptions,\n\u001b[1;32m 1667\u001b[0m cast_to\u001b[38;5;241m=\u001b[39mcast_to,\n\u001b[1;32m 1668\u001b[0m retries_taken\u001b[38;5;241m=\u001b[39mretries_taken \u001b[38;5;241m+\u001b[39m \u001b[38;5;241m1\u001b[39m,\n\u001b[1;32m 1669\u001b[0m stream\u001b[38;5;241m=\u001b[39mstream,\n\u001b[1;32m 1670\u001b[0m stream_cls\u001b[38;5;241m=\u001b[39mstream_cls,\n\u001b[1;32m 1671\u001b[0m 
)\n", "File \u001b[0;32m/opt/conda/envs/langfair/lib/python3.9/site-packages/openai/_base_client.py:1618\u001b[0m, in \u001b[0;36mAsyncAPIClient._request\u001b[0;34m(self, cast_to, options, stream, stream_cls, retries_taken)\u001b[0m\n\u001b[1;32m 1616\u001b[0m \u001b[38;5;28;01mif\u001b[39;00m remaining_retries \u001b[38;5;241m>\u001b[39m \u001b[38;5;241m0\u001b[39m \u001b[38;5;129;01mand\u001b[39;00m \u001b[38;5;28mself\u001b[39m\u001b[38;5;241m.\u001b[39m_should_retry(err\u001b[38;5;241m.\u001b[39mresponse):\n\u001b[1;32m 1617\u001b[0m \u001b[38;5;28;01mawait\u001b[39;00m err\u001b[38;5;241m.\u001b[39mresponse\u001b[38;5;241m.\u001b[39maclose()\n\u001b[0;32m-> 1618\u001b[0m \u001b[38;5;28;01mreturn\u001b[39;00m \u001b[38;5;28;01mawait\u001b[39;00m \u001b[38;5;28mself\u001b[39m\u001b[38;5;241m.\u001b[39m_retry_request(\n\u001b[1;32m 1619\u001b[0m input_options,\n\u001b[1;32m 1620\u001b[0m cast_to,\n\u001b[1;32m 1621\u001b[0m retries_taken\u001b[38;5;241m=\u001b[39mretries_taken,\n\u001b[1;32m 1622\u001b[0m response_headers\u001b[38;5;241m=\u001b[39merr\u001b[38;5;241m.\u001b[39mresponse\u001b[38;5;241m.\u001b[39mheaders,\n\u001b[1;32m 1623\u001b[0m stream\u001b[38;5;241m=\u001b[39mstream,\n\u001b[1;32m 1624\u001b[0m stream_cls\u001b[38;5;241m=\u001b[39mstream_cls,\n\u001b[1;32m 1625\u001b[0m )\n\u001b[1;32m 1627\u001b[0m \u001b[38;5;66;03m# If the response is streamed then we need to explicitly read the response\u001b[39;00m\n\u001b[1;32m 1628\u001b[0m \u001b[38;5;66;03m# to completion before attempting to access the response text.\u001b[39;00m\n\u001b[1;32m 1629\u001b[0m \u001b[38;5;28;01mif\u001b[39;00m \u001b[38;5;129;01mnot\u001b[39;00m err\u001b[38;5;241m.\u001b[39mresponse\u001b[38;5;241m.\u001b[39mis_closed:\n", "File \u001b[0;32m/opt/conda/envs/langfair/lib/python3.9/site-packages/openai/_base_client.py:1665\u001b[0m, in \u001b[0;36mAsyncAPIClient._retry_request\u001b[0;34m(self, options, cast_to, retries_taken, response_headers, stream, stream_cls)\u001b[0m\n\u001b[1;32m 1661\u001b[0m log\u001b[38;5;241m.\u001b[39minfo(\u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mRetrying request to \u001b[39m\u001b[38;5;132;01m%s\u001b[39;00m\u001b[38;5;124m in \u001b[39m\u001b[38;5;132;01m%f\u001b[39;00m\u001b[38;5;124m seconds\u001b[39m\u001b[38;5;124m\"\u001b[39m, options\u001b[38;5;241m.\u001b[39murl, timeout)\n\u001b[1;32m 1663\u001b[0m \u001b[38;5;28;01mawait\u001b[39;00m anyio\u001b[38;5;241m.\u001b[39msleep(timeout)\n\u001b[0;32m-> 1665\u001b[0m \u001b[38;5;28;01mreturn\u001b[39;00m \u001b[38;5;28;01mawait\u001b[39;00m \u001b[38;5;28mself\u001b[39m\u001b[38;5;241m.\u001b[39m_request(\n\u001b[1;32m 1666\u001b[0m options\u001b[38;5;241m=\u001b[39moptions,\n\u001b[1;32m 1667\u001b[0m cast_to\u001b[38;5;241m=\u001b[39mcast_to,\n\u001b[1;32m 1668\u001b[0m retries_taken\u001b[38;5;241m=\u001b[39mretries_taken \u001b[38;5;241m+\u001b[39m \u001b[38;5;241m1\u001b[39m,\n\u001b[1;32m 1669\u001b[0m stream\u001b[38;5;241m=\u001b[39mstream,\n\u001b[1;32m 1670\u001b[0m stream_cls\u001b[38;5;241m=\u001b[39mstream_cls,\n\u001b[1;32m 1671\u001b[0m )\n", "File \u001b[0;32m/opt/conda/envs/langfair/lib/python3.9/site-packages/openai/_base_client.py:1633\u001b[0m, in \u001b[0;36mAsyncAPIClient._request\u001b[0;34m(self, cast_to, options, stream, stream_cls, retries_taken)\u001b[0m\n\u001b[1;32m 1630\u001b[0m \u001b[38;5;28;01mawait\u001b[39;00m err\u001b[38;5;241m.\u001b[39mresponse\u001b[38;5;241m.\u001b[39maread()\n\u001b[1;32m 1632\u001b[0m 
log\u001b[38;5;241m.\u001b[39mdebug(\u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mRe-raising status error\u001b[39m\u001b[38;5;124m\"\u001b[39m)\n\u001b[0;32m-> 1633\u001b[0m \u001b[38;5;28;01mraise\u001b[39;00m \u001b[38;5;28mself\u001b[39m\u001b[38;5;241m.\u001b[39m_make_status_error_from_response(err\u001b[38;5;241m.\u001b[39mresponse) \u001b[38;5;28;01mfrom\u001b[39;00m \u001b[38;5;28;01mNone\u001b[39;00m\n\u001b[1;32m 1635\u001b[0m \u001b[38;5;28;01mreturn\u001b[39;00m \u001b[38;5;28;01mawait\u001b[39;00m \u001b[38;5;28mself\u001b[39m\u001b[38;5;241m.\u001b[39m_process_response(\n\u001b[1;32m 1636\u001b[0m cast_to\u001b[38;5;241m=\u001b[39mcast_to,\n\u001b[1;32m 1637\u001b[0m options\u001b[38;5;241m=\u001b[39moptions,\n\u001b[0;32m (...)\u001b[0m\n\u001b[1;32m 1641\u001b[0m retries_taken\u001b[38;5;241m=\u001b[39mretries_taken,\n\u001b[1;32m 1642\u001b[0m )\n", "\u001b[0;31mRateLimitError\u001b[0m: Error code: 429 - {'error': {'code': '429', 'message': 'Requests to the ChatCompletions_Create Operation under Azure OpenAI API version 2023-07-01-preview have exceeded token rate limit of your current OpenAI S0 pricing tier. Please retry after 36 seconds. Please go here: https://aka.ms/oai/quotaincrease if you would like to further increase the default rate limit.'}}" ] } ], "source": [ "responses = await rg.generate_responses(prompts=prompts) " ] }, { "cell_type": "markdown", "id": "4f7b4831-d636-406a-a7ce-543c10fdacc8", "metadata": {}, "source": [ "To handle this error, we can use `InMemoryRateLimiter` to limit the number of requests per minute." ] }, { "cell_type": "code", "execution_count": 10, "id": "d21e8b0b-04d0-4697-ab68-d111ac7fcf9d", "metadata": { "tags": [] }, "outputs": [], "source": [ "from langchain_core.rate_limiters import InMemoryRateLimiter\n", "\n", "rate_limiter = InMemoryRateLimiter(\n", " requests_per_second=5, \n", " check_every_n_seconds=5, \n", " max_bucket_size=500, \n", ")\n", "\n", "llm = AzureChatOpenAI(\n", " deployment_name=DEPLOYMENT_NAME,\n", " openai_api_key=API_KEY,\n", " azure_endpoint=API_BASE,\n", " openai_api_type=API_TYPE,\n", " openai_api_version=API_VERSION,\n", " temperature=1, # User to set temperature\n", " rate_limiter=rate_limiter\n", ")\n", "\n", "rg_limited = ResponseGenerator(langchain_llm=llm)" ] }, { "cell_type": "code", "execution_count": 11, "id": "06d90c88-2ec2-4d99-9ddd-869d63ee831f", "metadata": { "tags": [] }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Generating 25 responses per prompt...\n", "Responses successfully generated!\n" ] } ], "source": [ "responses = await rg_limited.generate_responses(prompts=prompts)" ] }, { "cell_type": "code", "execution_count": 13, "id": "0bb24aca", "metadata": { "tags": [] }, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
promptresponse
0You are required to keep generation given the ...Hosaka in his novel \"Generation X.\" It refers ...
1You are required to keep generation given the ...The term was coined by the author Maki to desc...
2You are required to keep generation given the ...The term was coined by the author Maki [genera...
3You are required to keep generation given the ...The term was coined by the author Maki for the...
4You are required to keep generation given the ...The term was coined by the author Maki to desc...
.........
24995You are required to keep generation given the ...discuss the importance of preserving and hand ...
24996You are required to keep generation given the ...discuss the influence of music on Dave's caree...
24997You are required to keep generation given the ...discuss the evolution of music in the 1970s. T...
24998You are required to keep generation given the ...Close delve into the captivating world of musi...
24999You are required to keep generation given the ...explore the vast influence of legendary musici...
\n", "

25000 rows × 2 columns

\n", "
" ], "text/plain": [ " prompt \\\n", "0 You are required to keep generation given the ... \n", "1 You are required to keep generation given the ... \n", "2 You are required to keep generation given the ... \n", "3 You are required to keep generation given the ... \n", "4 You are required to keep generation given the ... \n", "... ... \n", "24995 You are required to keep generation given the ... \n", "24996 You are required to keep generation given the ... \n", "24997 You are required to keep generation given the ... \n", "24998 You are required to keep generation given the ... \n", "24999 You are required to keep generation given the ... \n", "\n", " response \n", "0 Hosaka in his novel \"Generation X.\" It refers ... \n", "1 The term was coined by the author Maki to desc... \n", "2 The term was coined by the author Maki [genera... \n", "3 The term was coined by the author Maki for the... \n", "4 The term was coined by the author Maki to desc... \n", "... ... \n", "24995 discuss the importance of preserving and hand ... \n", "24996 discuss the influence of music on Dave's caree... \n", "24997 discuss the evolution of music in the 1970s. T... \n", "24998 Close delve into the captivating world of musi... \n", "24999 explore the vast influence of legendary musici... \n", "\n", "[25000 rows x 2 columns]" ] }, "execution_count": 13, "metadata": {}, "output_type": "execute_result" } ], "source": [ "pd.DataFrame(responses['data'])" ] } ], "metadata": { "environment": { "kernel": "python3", "name": "workbench-notebooks.m125", "type": "gcloud", "uri": "us-docker.pkg.dev/deeplearning-platform-release/gcr.io/workbench-notebooks:m125" }, "kernelspec": { "display_name": ".venv", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.9.4" } }, "nbformat": 4, "nbformat_minor": 5 }