{ "cells": [ { "cell_type": "markdown", "id": "a0428976-96c2-4898-9020-1eb2be430cee", "metadata": {}, "source": [ "# Demo of `ResponseGenerator` class" ] }, { "cell_type": "markdown", "id": "7080a2b7-9882-4b09-9bcd-9f7d3d7d80e9", "metadata": {}, "source": [ "Import necessary libraries for the notebook." ] }, { "cell_type": "code", "execution_count": 1, "id": "ad065d4f-b6c6-4e4b-951a-a3e9858d776a", "metadata": { "tags": [] }, "outputs": [], "source": [ "# Run if python-dotenv not installed\n", "# import sys\n", "# !{sys.executable} -m pip install python-dotenv\n", "\n", "import os\n", "import time\n", "\n", "import openai\n", "import pandas as pd\n", "from dotenv import load_dotenv\n", "\n", "from langfair.generator import ResponseGenerator" ] }, { "cell_type": "code", "execution_count": 2, "id": "203f3ef4-c8c1-483f-beff-e83c378d4178", "metadata": { "tags": [] }, "outputs": [], "source": [ "# User to populate .env file with API credentials\n", "repo_path = '/'.join(os.getcwd().split('/')[:-2])\n", "load_dotenv(os.path.join(repo_path, '.env'))\n", "\n", "API_KEY = os.getenv('API_KEY')\n", "API_BASE = os.getenv('API_BASE')\n", "API_TYPE = os.getenv('API_TYPE')\n", "API_VERSION = os.getenv('API_VERSION')\n", "MODEL_VERSION = os.getenv('MODEL_VERSION')\n", "DEPLOYMENT_NAME = os.getenv('DEPLOYMENT_NAME')" ] }, { "cell_type": "markdown", "id": "e5e2877b-688f-4799-9c31-4a9014f27fad", "metadata": {}, "source": [ "Read in prompts from which responses will be generated." ] }, { "cell_type": "code", "execution_count": null, "id": "6411ef21-eef1-49b9-a58a-e183736c0111", "metadata": { "tags": [] }, "outputs": [], "source": [ "# THIS IS AN EXAMPLE SET OF PROMPTS. USER TO REPLACE WITH THEIR OWN PROMPTS\n", "from langfair.utils.dataloader import load_realtoxicity\n", "\n", "prompts = load_realtoxicity(n=10)\n", "print(f\"\\nExample prompt\\n{'-'*14}\\n'{prompts[0]}'\")" ] }, { "cell_type": "markdown", "id": "3fb35e72-d9bb-423d-b7a4-942de0b1d2dd", "metadata": { "tags": [] }, "source": [ "`ResponseGenerator()` - **Class for generating data for evaluation from provided set of prompts (class)**\n", "\n", "**Class parameters:**\n", "\n", "- `langchain_llm` (**langchain llm (Runnable), default=None**) A langchain llm object to get passed to LLMChain `llm` argument.\n", "- `suppressed_exceptions` (**tuple, default=None**) Specifies which exceptions to handle as 'Unable to get response' rather than raising the exception\n", "- `max_calls_per_min` (**Deprecated as of 0.2.0**) Use LangChain's InMemoryRateLimiter instead." ] }, { "cell_type": "markdown", "id": "076c34e7-cd90-42ba-ac1c-6f7a027e8549", "metadata": {}, "source": [ "Below we use LangFair's `ResponseGenerator` class to generate LLM responses. To instantiate the `ResponseGenerator` class, pass a LangChain LLM object as an argument. Note that although this notebook uses `AzureChatOpenAI`, this can be replaced with a LangChain LLM of your choice." ] }, { "cell_type": "code", "execution_count": 4, "id": "febe56d2-1cb1-4712-bf56-f00f62b20ba2", "metadata": { "tags": [] }, "outputs": [], "source": [ "# # Run if langchain-openai not installed \n", "# import sys\n", "# !{sys.executable} -m pip install langchain-openai\n", "\n", "# Example with AzureChatOpenAI. 
REPLACE WITH YOUR LLM OF CHOICE.\n", "from langchain_openai import AzureChatOpenAI\n", "\n", "llm = AzureChatOpenAI(\n", " deployment_name=DEPLOYMENT_NAME,\n", " openai_api_key=API_KEY,\n", " azure_endpoint=API_BASE,\n", " openai_api_type=API_TYPE,\n", " openai_api_version=API_VERSION,\n", " temperature=1 # User to set temperature\n", ")" ] }, { "cell_type": "code", "execution_count": 5, "id": "72dde77b-b9f1-4eb9-8748-8aac1e90819c", "metadata": { "tags": [] }, "outputs": [], "source": [ "# Create langfair ResponseGenerator object\n", "rg = ResponseGenerator(\n", " langchain_llm=llm, \n", " suppressed_exceptions=(openai.BadRequestError, ValueError), # this suppresses content filtering errors\n", ")" ] }, { "cell_type": "markdown", "id": "f0b0372f-cc8c-4ab1-b7e1-4ebe5e4ebb50", "metadata": {}, "source": [ "### Estimate token costs before generation" ] }, { "cell_type": "markdown", "id": "49460d44-42b8-4e5e-918d-eff1e34ed1a5", "metadata": {}, "source": [ "`estimate_token_cost()` - Estimates the token cost for a given list of prompts and (optionally) example responses. This method is only compatible with GPT models.\n", " \n", "###### Method Parameters:\n", "\n", "- `prompts` - (**list of strings**) A list of prompts.\n", "- `example_responses` - (**list of strings, optional**) A list of example responses. If provided, the function will estimate the response tokens based on these examples.\n", "- `tiktoken_model_name` - (**str, optional**) The name of the OpenAI model to use for token counting.\n", "- `response_sample_size` - (**int, default=30**) The number of responses to generate for cost estimation if `example_responses` is not provided.\n", "- `system_prompt` - (**str, default=\"You are a helpful assistant.\"**) The system prompt to use.\n", "- `count` - (**int, default=25**) The number of generations per prompt used when estimating cost.\n", "\n", "###### Returns:\n", "- A dictionary containing the estimated token costs, including prompt token cost, completion token cost, and total token cost.
(**dictionary**)" ] }, { "cell_type": "code", "execution_count": 6, "id": "d860cb28-50db-4b69-b222-782488bfc0fb", "metadata": { "tags": [] }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Token costs were last updated on 10/21/2024 and may have changed since then.\n", "Estimating cost based on 1 generations per prompt...\n", "Generating sample of responses for cost estimation...\n", "Generating 1 responses per prompt...\n", "Responses successfully generated!\n", "Estimated cost for gpt-3.5-turbo-16k-0613: $ 0.6\n", "Token costs were last updated on 10/21/2024 and may have changed since then.\n", "Estimating cost based on 1 generations per prompt...\n", "Generating sample of responses for cost estimation...\n", "Generating 1 responses per prompt...\n", "Responses successfully generated!\n", "Estimated cost for gpt-4-32k-0613: $ 9.16\n" ] } ], "source": [ "for model_name in [\"gpt-3.5-turbo-16k-0613\", \"gpt-4-32k-0613\"]:\n", " estimated_cost = await rg.estimate_token_cost(tiktoken_model_name=model_name, prompts=prompts, count=1)\n", " print(f\"Estimated cost for {model_name}: $\", round(estimated_cost['Estimated Total Token Cost (USD)'],2))" ] }, { "cell_type": "markdown", "id": "8def2169-ddb8-4f5a-9dd9-1e632232ecee", "metadata": {}, "source": [ "Note that using GPT-4 is considerably more expensive than GPT-3.5" ] }, { "cell_type": "markdown", "id": "d0c45962-168b-4624-a55a-89b7af90cb40", "metadata": {}, "source": [ "### Evaluating Response Time: Asynchronous Generation with `ResponseGenerator` vs Synchronous Generation with `openai.chat.completions.create`" ] }, { "cell_type": "markdown", "id": "f2b2fd39-a174-4fc1-aab8-073bc2963f51", "metadata": {}, "source": [ "**Generate responses asynchronously with** `ResponseGenerator`" ] }, { "cell_type": "markdown", "id": "561e5049-79d7-46a9-a490-5950c23de5f4", "metadata": {}, "source": [ "`generate_responses()` - Generates evaluation dataset from a provided set of prompts. For each prompt, `self.count` responses are generated.\n", "\n", "###### Method Parameters:\n", "\n", "- `prompts` - (**list of strings**) A list of prompts\n", "- `system_prompt` - (**str or None, default=\"You are a helpful assistant.\"**) Specifies the system prompt used when generating LLM responses.\n", "- `count` - (**int, default=25**) Specifies number of responses to generate for each prompt. \n", "\n", "###### Returns:\n", "A dictionary with two keys: `data` and `metadata`.\n", "- `data` (**dict**) A dictionary containing the prompts and responses.\n", "- `metadata` (**dict**) A dictionary containing metadata about the generation process, including non-completion rate, temperature, and count." ] }, { "cell_type": "code", "execution_count": 7, "id": "c1413183-6a18-43df-b863-fb603034db46", "metadata": { "tags": [] }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Generating 1 responses per prompt...\n", "Responses successfully generated!\n", "Time elapsed for asynchronous generation: 17.648038625717163\n" ] } ], "source": [ "# Generate 1 response per prompt for 200 prompts\n", "start = time.time()\n", "async_responses = await rg.generate_responses(\n", " prompts=prompts[0:200], \n", " count=1\n", ")\n", "stop = time.time()\n", "print(f\"Time elapsed for asynchronous generation: {stop - start}\")" ] }, { "cell_type": "code", "execution_count": 8, "id": "5c0f5ebe-b5c9-44f5-9f2c-70c5fdc16a22", "metadata": { "tags": [] }, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
promptresponse
0Corruption involving the contractors is the ch...report released by an independent watchdog gro...
15][6][7] The term was coined by the author MakiIt seems like there is some missing context in...
2As for the internals, the people with the most...typically the executives and upper management ...
3On March 2, 2017, roughly 100 of ourI'm sorry, but your message appears to be inco...
4Syrian Democratic Forces’ spokesman, Talal Sil...I'm sorry, but I couldn't find any recent info...
.........
195The poet responsible for those words might not...Brandis, but indeed he should be. The poet res...
196“Previously we had a hacked together experienc...combination of different software tools to man...
197In my case, I started life as a first-born son,That's an interesting starting point! Being a ...
198Channel is part of the governmentWhile I am an AI assistant and cannot provide ...
199Like Van Gogh, he wanted to make passionate im...their emotions. Van Gogh was known for his exp...
\n", "

200 rows × 2 columns

\n", "
" ], "text/plain": [ " prompt \\\n", "0 Corruption involving the contractors is the ch... \n", "1 5][6][7] The term was coined by the author Maki \n", "2 As for the internals, the people with the most... \n", "3 On March 2, 2017, roughly 100 of our \n", "4 Syrian Democratic Forces’ spokesman, Talal Sil... \n", ".. ... \n", "195 The poet responsible for those words might not... \n", "196 “Previously we had a hacked together experienc... \n", "197 In my case, I started life as a first-born son, \n", "198 Channel is part of the government \n", "199 Like Van Gogh, he wanted to make passionate im... \n", "\n", " response \n", "0 report released by an independent watchdog gro... \n", "1 It seems like there is some missing context in... \n", "2 typically the executives and upper management ... \n", "3 I'm sorry, but your message appears to be inco... \n", "4 I'm sorry, but I couldn't find any recent info... \n", ".. ... \n", "195 Brandis, but indeed he should be. The poet res... \n", "196 combination of different software tools to man... \n", "197 That's an interesting starting point! Being a ... \n", "198 While I am an AI assistant and cannot provide ... \n", "199 their emotions. Van Gogh was known for his exp... \n", "\n", "[200 rows x 2 columns]" ] }, "execution_count": 8, "metadata": {}, "output_type": "execute_result" } ], "source": [ "pd.DataFrame(async_responses['data'])" ] }, { "cell_type": "code", "execution_count": 9, "id": "a7362c25-fc36-46b1-9e53-4b2e25d622c1", "metadata": { "tags": [] }, "outputs": [ { "data": { "text/plain": [ "{'non_completion_rate': 0.005,\n", " 'system_prompt': 'You are a helpful assistant.',\n", " 'temperature': 1.0,\n", " 'count': 1}" ] }, "execution_count": 9, "metadata": {}, "output_type": "execute_result" } ], "source": [ "async_responses['metadata']" ] }, { "cell_type": "markdown", "id": "547b5b87-2910-4178-b68b-24909589f4c8", "metadata": {}, "source": [ "##### Generate responses synchronously for comparison" ] }, { "cell_type": "code", "execution_count": 10, "id": "47ee6e9f-ecaf-4a27-a7b8-753285c86bfc", "metadata": { "tags": [] }, "outputs": [], "source": [ "def openai_api_call(prompt, system_prompt=\"You are a helpful assistant.\", model=\"exai-gpt-35-turbo-16k\"):\n", " try:\n", " completion = openai.chat.completions.create(\n", " model=model,\n", " messages=[\n", " {\"role\": \"system\", \"content\": system_prompt},\n", " {\"role\": \"user\", \"content\": prompt}\n", " ]\n", " )\n", " return completion.choices[0].message.content\n", " except openai.BadRequestError:\n", " return \"Unable to get response\"" ] }, { "cell_type": "code", "execution_count": 10, "id": "0e90423f-81f5-4ece-a663-349392717e8b", "metadata": { "tags": [] }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Time elapsed for synchronous generation: 370.58987402915955\n" ] } ], "source": [ "openai.api_key = API_KEY\n", "openai.azure_endpoint = API_BASE\n", "openai.model_version = MODEL_VERSION\n", "openai.api_version = API_VERSION\n", "openai.api_type = API_TYPE\n", "\n", "start = time.time()\n", "sync_responses = [openai_api_call(prompt) for prompt in prompts[0:200]]\n", "stop = time.time()\n", "print(f\"Time elapsed for synchronous generation: {stop - start}\")" ] }, { "cell_type": "markdown", "id": "2b864b1d-e962-41d1-9293-9857bc480a5d", "metadata": {}, "source": [ "Note that asynchronous generation with `ResponseGenerator` is significantly faster than synchonous generation." 
] }, { "cell_type": "markdown", "id": "162c62c6-ef18-43fc-a859-fe32a814c09f", "metadata": {}, "source": [ "### Handling `RateLimitError` with `ResponseGenerator`" ] }, { "cell_type": "markdown", "id": "dec46bb6-d7ee-472c-8063-be61fd855413", "metadata": {}, "source": [ "Passing too many requests asynchronously will trigger a `RateLimitError`. For our 'exai-gpt-35-turbo-16k' deployment, 1000 prompts at 25 generations per prompt with async exceeds the rate limit." ] }, { "cell_type": "code", "execution_count": 9, "id": "e80881cd-14c3-46de-bdf0-c8ac607f4a3c", "metadata": { "tags": [] }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "langfair: Generating 25 responses per prompt...\n" ] }, { "ename": "RateLimitError", "evalue": "Error code: 429 - {'error': {'code': '429', 'message': 'Requests to the ChatCompletions_Create Operation under Azure OpenAI API version 2023-07-01-preview have exceeded token rate limit of your current OpenAI S0 pricing tier. Please retry after 36 seconds. Please go here: https://aka.ms/oai/quotaincrease if you would like to further increase the default rate limit.'}}", "output_type": "error", "traceback": [ "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m", "\u001b[0;31mRateLimitError\u001b[0m Traceback (most recent call last)", "Cell \u001b[0;32mIn[9], line 1\u001b[0m\n\u001b[0;32m----> 1\u001b[0m responses \u001b[38;5;241m=\u001b[39m \u001b[38;5;28;01mawait\u001b[39;00m rg\u001b[38;5;241m.\u001b[39mgenerate_responses(prompts\u001b[38;5;241m=\u001b[39mprompts_df\u001b[38;5;241m.\u001b[39mhead(\u001b[38;5;241m1000\u001b[39m)\u001b[38;5;241m.\u001b[39mprompt) \n", "File \u001b[0;32m~/PUBLIC/langfair/langfair/generator/generator.py:231\u001b[0m, in \u001b[0;36mResponseGenerator.generate_responses\u001b[0;34m(self, prompts, system_prompt, count)\u001b[0m\n\u001b[1;32m 229\u001b[0m \u001b[38;5;66;03m# set up langchain and generate asynchronously\u001b[39;00m\n\u001b[1;32m 230\u001b[0m chain \u001b[38;5;241m=\u001b[39m \u001b[38;5;28mself\u001b[39m\u001b[38;5;241m.\u001b[39m_setup_langchain(system_message\u001b[38;5;241m=\u001b[39msystem_prompt)\n\u001b[0;32m--> 231\u001b[0m generations, duplicated_prompts \u001b[38;5;241m=\u001b[39m \u001b[38;5;28;01mawait\u001b[39;00m \u001b[38;5;28mself\u001b[39m\u001b[38;5;241m.\u001b[39m_generate_in_batches(\n\u001b[1;32m 232\u001b[0m chain\u001b[38;5;241m=\u001b[39mchain, prompts\u001b[38;5;241m=\u001b[39mprompts\n\u001b[1;32m 233\u001b[0m )\n\u001b[1;32m 234\u001b[0m responses \u001b[38;5;241m=\u001b[39m []\n\u001b[1;32m 235\u001b[0m \u001b[38;5;28;01mfor\u001b[39;00m response \u001b[38;5;129;01min\u001b[39;00m generations:\n", "File \u001b[0;32m~/PUBLIC/langfair/langfair/generator/generator.py:342\u001b[0m, in \u001b[0;36mResponseGenerator._generate_in_batches\u001b[0;34m(self, chain, prompts, system_prompts)\u001b[0m\n\u001b[1;32m 338\u001b[0m \u001b[38;5;66;03m# generate responses for current batch\u001b[39;00m\n\u001b[1;32m 339\u001b[0m tasks, duplicated_batch_prompts \u001b[38;5;241m=\u001b[39m \u001b[38;5;28mself\u001b[39m\u001b[38;5;241m.\u001b[39m_task_creator(\n\u001b[1;32m 340\u001b[0m chain, prompt_batch, system_prompts\n\u001b[1;32m 341\u001b[0m )\n\u001b[0;32m--> 342\u001b[0m responses_batch \u001b[38;5;241m=\u001b[39m \u001b[38;5;28;01mawait\u001b[39;00m asyncio\u001b[38;5;241m.\u001b[39mgather(\u001b[38;5;241m*\u001b[39mtasks)\n\u001b[1;32m 344\u001b[0m \u001b[38;5;66;03m# extend lists to include current batch\u001b[39;00m\n\u001b[1;32m 345\u001b[0m 
duplicated_prompts\u001b[38;5;241m.\u001b[39mextend(duplicated_batch_prompts)\n", "File \u001b[0;32m~/PUBLIC/langfair/langfair/generator/generator.py:364\u001b[0m, in \u001b[0;36mResponseGenerator._async_api_call\u001b[0;34m(chain, prompt, system_text, count)\u001b[0m\n\u001b[1;32m 362\u001b[0m \u001b[38;5;250m\u001b[39m\u001b[38;5;124;03m\"\"\"Generates responses asynchronously using an LLMChain object\"\"\"\u001b[39;00m\n\u001b[1;32m 363\u001b[0m \u001b[38;5;28;01mtry\u001b[39;00m:\n\u001b[0;32m--> 364\u001b[0m result \u001b[38;5;241m=\u001b[39m \u001b[38;5;28;01mawait\u001b[39;00m chain\u001b[38;5;241m.\u001b[39magenerate(\n\u001b[1;32m 365\u001b[0m [{\u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mtext\u001b[39m\u001b[38;5;124m\"\u001b[39m: prompt, \u001b[38;5;124m\"\u001b[39m\u001b[38;5;124msystem_text\u001b[39m\u001b[38;5;124m\"\u001b[39m: system_text}]\n\u001b[1;32m 366\u001b[0m )\n\u001b[1;32m 367\u001b[0m \u001b[38;5;28;01mreturn\u001b[39;00m [result\u001b[38;5;241m.\u001b[39mgenerations[\u001b[38;5;241m0\u001b[39m][i]\u001b[38;5;241m.\u001b[39mtext \u001b[38;5;28;01mfor\u001b[39;00m i \u001b[38;5;129;01min\u001b[39;00m \u001b[38;5;28mrange\u001b[39m(count)]\n\u001b[1;32m 368\u001b[0m \u001b[38;5;28;01mexcept\u001b[39;00m (\n\u001b[1;32m 369\u001b[0m openai\u001b[38;5;241m.\u001b[39mAPIConnectionError,\n\u001b[1;32m 370\u001b[0m openai\u001b[38;5;241m.\u001b[39mNotFoundError,\n\u001b[0;32m (...)\u001b[0m\n\u001b[1;32m 374\u001b[0m openai\u001b[38;5;241m.\u001b[39mRateLimitError,\n\u001b[1;32m 375\u001b[0m ):\n", "File \u001b[0;32m/opt/conda/envs/langfair/lib/python3.9/site-packages/langchain/chains/llm.py:165\u001b[0m, in \u001b[0;36mLLMChain.agenerate\u001b[0;34m(self, input_list, run_manager)\u001b[0m\n\u001b[1;32m 163\u001b[0m callbacks \u001b[38;5;241m=\u001b[39m run_manager\u001b[38;5;241m.\u001b[39mget_child() \u001b[38;5;28;01mif\u001b[39;00m run_manager \u001b[38;5;28;01melse\u001b[39;00m \u001b[38;5;28;01mNone\u001b[39;00m\n\u001b[1;32m 164\u001b[0m \u001b[38;5;28;01mif\u001b[39;00m \u001b[38;5;28misinstance\u001b[39m(\u001b[38;5;28mself\u001b[39m\u001b[38;5;241m.\u001b[39mllm, BaseLanguageModel):\n\u001b[0;32m--> 165\u001b[0m \u001b[38;5;28;01mreturn\u001b[39;00m \u001b[38;5;28;01mawait\u001b[39;00m \u001b[38;5;28mself\u001b[39m\u001b[38;5;241m.\u001b[39mllm\u001b[38;5;241m.\u001b[39magenerate_prompt(\n\u001b[1;32m 166\u001b[0m prompts,\n\u001b[1;32m 167\u001b[0m stop,\n\u001b[1;32m 168\u001b[0m callbacks\u001b[38;5;241m=\u001b[39mcallbacks,\n\u001b[1;32m 169\u001b[0m \u001b[38;5;241m*\u001b[39m\u001b[38;5;241m*\u001b[39m\u001b[38;5;28mself\u001b[39m\u001b[38;5;241m.\u001b[39mllm_kwargs,\n\u001b[1;32m 170\u001b[0m )\n\u001b[1;32m 171\u001b[0m \u001b[38;5;28;01melse\u001b[39;00m:\n\u001b[1;32m 172\u001b[0m results \u001b[38;5;241m=\u001b[39m \u001b[38;5;28;01mawait\u001b[39;00m \u001b[38;5;28mself\u001b[39m\u001b[38;5;241m.\u001b[39mllm\u001b[38;5;241m.\u001b[39mbind(stop\u001b[38;5;241m=\u001b[39mstop, \u001b[38;5;241m*\u001b[39m\u001b[38;5;241m*\u001b[39m\u001b[38;5;28mself\u001b[39m\u001b[38;5;241m.\u001b[39mllm_kwargs)\u001b[38;5;241m.\u001b[39mabatch(\n\u001b[1;32m 173\u001b[0m cast(List, prompts), {\u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mcallbacks\u001b[39m\u001b[38;5;124m\"\u001b[39m: callbacks}\n\u001b[1;32m 174\u001b[0m )\n", "File \u001b[0;32m/opt/conda/envs/langfair/lib/python3.9/site-packages/langchain_core/language_models/chat_models.py:570\u001b[0m, in \u001b[0;36mBaseChatModel.agenerate_prompt\u001b[0;34m(self, prompts, stop, callbacks, 
**kwargs)\u001b[0m\n\u001b[1;32m 562\u001b[0m \u001b[38;5;28;01masync\u001b[39;00m \u001b[38;5;28;01mdef\u001b[39;00m \u001b[38;5;21magenerate_prompt\u001b[39m(\n\u001b[1;32m 563\u001b[0m \u001b[38;5;28mself\u001b[39m,\n\u001b[1;32m 564\u001b[0m prompts: List[PromptValue],\n\u001b[0;32m (...)\u001b[0m\n\u001b[1;32m 567\u001b[0m \u001b[38;5;241m*\u001b[39m\u001b[38;5;241m*\u001b[39mkwargs: Any,\n\u001b[1;32m 568\u001b[0m ) \u001b[38;5;241m-\u001b[39m\u001b[38;5;241m>\u001b[39m LLMResult:\n\u001b[1;32m 569\u001b[0m prompt_messages \u001b[38;5;241m=\u001b[39m [p\u001b[38;5;241m.\u001b[39mto_messages() \u001b[38;5;28;01mfor\u001b[39;00m p \u001b[38;5;129;01min\u001b[39;00m prompts]\n\u001b[0;32m--> 570\u001b[0m \u001b[38;5;28;01mreturn\u001b[39;00m \u001b[38;5;28;01mawait\u001b[39;00m \u001b[38;5;28mself\u001b[39m\u001b[38;5;241m.\u001b[39magenerate(\n\u001b[1;32m 571\u001b[0m prompt_messages, stop\u001b[38;5;241m=\u001b[39mstop, callbacks\u001b[38;5;241m=\u001b[39mcallbacks, \u001b[38;5;241m*\u001b[39m\u001b[38;5;241m*\u001b[39mkwargs\n\u001b[1;32m 572\u001b[0m )\n", "File \u001b[0;32m/opt/conda/envs/langfair/lib/python3.9/site-packages/langchain_core/language_models/chat_models.py:530\u001b[0m, in \u001b[0;36mBaseChatModel.agenerate\u001b[0;34m(self, messages, stop, callbacks, tags, metadata, run_name, run_id, **kwargs)\u001b[0m\n\u001b[1;32m 517\u001b[0m \u001b[38;5;28;01mif\u001b[39;00m run_managers:\n\u001b[1;32m 518\u001b[0m \u001b[38;5;28;01mawait\u001b[39;00m asyncio\u001b[38;5;241m.\u001b[39mgather(\n\u001b[1;32m 519\u001b[0m \u001b[38;5;241m*\u001b[39m[\n\u001b[1;32m 520\u001b[0m run_manager\u001b[38;5;241m.\u001b[39mon_llm_end(\n\u001b[0;32m (...)\u001b[0m\n\u001b[1;32m 528\u001b[0m ]\n\u001b[1;32m 529\u001b[0m )\n\u001b[0;32m--> 530\u001b[0m \u001b[38;5;28;01mraise\u001b[39;00m exceptions[\u001b[38;5;241m0\u001b[39m]\n\u001b[1;32m 531\u001b[0m flattened_outputs \u001b[38;5;241m=\u001b[39m [\n\u001b[1;32m 532\u001b[0m LLMResult(generations\u001b[38;5;241m=\u001b[39m[res\u001b[38;5;241m.\u001b[39mgenerations], llm_output\u001b[38;5;241m=\u001b[39mres\u001b[38;5;241m.\u001b[39mllm_output) \u001b[38;5;66;03m# type: ignore[list-item, union-attr]\u001b[39;00m\n\u001b[1;32m 533\u001b[0m \u001b[38;5;28;01mfor\u001b[39;00m res \u001b[38;5;129;01min\u001b[39;00m results\n\u001b[1;32m 534\u001b[0m ]\n\u001b[1;32m 535\u001b[0m llm_output \u001b[38;5;241m=\u001b[39m \u001b[38;5;28mself\u001b[39m\u001b[38;5;241m.\u001b[39m_combine_llm_outputs([res\u001b[38;5;241m.\u001b[39mllm_output \u001b[38;5;28;01mfor\u001b[39;00m res \u001b[38;5;129;01min\u001b[39;00m results]) \u001b[38;5;66;03m# type: ignore[union-attr]\u001b[39;00m\n", "File \u001b[0;32m/opt/conda/envs/langfair/lib/python3.9/site-packages/langchain_core/language_models/chat_models.py:715\u001b[0m, in \u001b[0;36mBaseChatModel._agenerate_with_cache\u001b[0;34m(self, messages, stop, run_manager, **kwargs)\u001b[0m\n\u001b[1;32m 713\u001b[0m \u001b[38;5;28;01melse\u001b[39;00m:\n\u001b[1;32m 714\u001b[0m \u001b[38;5;28;01mif\u001b[39;00m inspect\u001b[38;5;241m.\u001b[39msignature(\u001b[38;5;28mself\u001b[39m\u001b[38;5;241m.\u001b[39m_agenerate)\u001b[38;5;241m.\u001b[39mparameters\u001b[38;5;241m.\u001b[39mget(\u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mrun_manager\u001b[39m\u001b[38;5;124m\"\u001b[39m):\n\u001b[0;32m--> 715\u001b[0m result \u001b[38;5;241m=\u001b[39m \u001b[38;5;28;01mawait\u001b[39;00m \u001b[38;5;28mself\u001b[39m\u001b[38;5;241m.\u001b[39m_agenerate(\n\u001b[1;32m 716\u001b[0m messages, 
stop\u001b[38;5;241m=\u001b[39mstop, run_manager\u001b[38;5;241m=\u001b[39mrun_manager, \u001b[38;5;241m*\u001b[39m\u001b[38;5;241m*\u001b[39mkwargs\n\u001b[1;32m 717\u001b[0m )\n\u001b[1;32m 718\u001b[0m \u001b[38;5;28;01melse\u001b[39;00m:\n\u001b[1;32m 719\u001b[0m result \u001b[38;5;241m=\u001b[39m \u001b[38;5;28;01mawait\u001b[39;00m \u001b[38;5;28mself\u001b[39m\u001b[38;5;241m.\u001b[39m_agenerate(messages, stop\u001b[38;5;241m=\u001b[39mstop, \u001b[38;5;241m*\u001b[39m\u001b[38;5;241m*\u001b[39mkwargs)\n", "File \u001b[0;32m/opt/conda/envs/langfair/lib/python3.9/site-packages/langchain_openai/chat_models/base.py:623\u001b[0m, in \u001b[0;36mBaseChatOpenAI._agenerate\u001b[0;34m(self, messages, stop, run_manager, **kwargs)\u001b[0m\n\u001b[1;32m 621\u001b[0m message_dicts, params \u001b[38;5;241m=\u001b[39m \u001b[38;5;28mself\u001b[39m\u001b[38;5;241m.\u001b[39m_create_message_dicts(messages, stop)\n\u001b[1;32m 622\u001b[0m params \u001b[38;5;241m=\u001b[39m {\u001b[38;5;241m*\u001b[39m\u001b[38;5;241m*\u001b[39mparams, \u001b[38;5;241m*\u001b[39m\u001b[38;5;241m*\u001b[39mkwargs}\n\u001b[0;32m--> 623\u001b[0m response \u001b[38;5;241m=\u001b[39m \u001b[38;5;28;01mawait\u001b[39;00m \u001b[38;5;28mself\u001b[39m\u001b[38;5;241m.\u001b[39masync_client\u001b[38;5;241m.\u001b[39mcreate(messages\u001b[38;5;241m=\u001b[39mmessage_dicts, \u001b[38;5;241m*\u001b[39m\u001b[38;5;241m*\u001b[39mparams)\n\u001b[1;32m 624\u001b[0m \u001b[38;5;28;01mreturn\u001b[39;00m \u001b[38;5;28mself\u001b[39m\u001b[38;5;241m.\u001b[39m_create_chat_result(response)\n", "File \u001b[0;32m/opt/conda/envs/langfair/lib/python3.9/site-packages/openai/resources/chat/completions.py:1633\u001b[0m, in \u001b[0;36mAsyncCompletions.create\u001b[0;34m(self, messages, model, audio, frequency_penalty, function_call, functions, logit_bias, logprobs, max_completion_tokens, max_tokens, metadata, modalities, n, parallel_tool_calls, presence_penalty, response_format, seed, service_tier, stop, store, stream, stream_options, temperature, tool_choice, tools, top_logprobs, top_p, user, extra_headers, extra_query, extra_body, timeout)\u001b[0m\n\u001b[1;32m 1593\u001b[0m \u001b[38;5;129m@required_args\u001b[39m([\u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mmessages\u001b[39m\u001b[38;5;124m\"\u001b[39m, \u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mmodel\u001b[39m\u001b[38;5;124m\"\u001b[39m], [\u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mmessages\u001b[39m\u001b[38;5;124m\"\u001b[39m, \u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mmodel\u001b[39m\u001b[38;5;124m\"\u001b[39m, \u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mstream\u001b[39m\u001b[38;5;124m\"\u001b[39m])\n\u001b[1;32m 1594\u001b[0m \u001b[38;5;28;01masync\u001b[39;00m \u001b[38;5;28;01mdef\u001b[39;00m \u001b[38;5;21mcreate\u001b[39m(\n\u001b[1;32m 1595\u001b[0m \u001b[38;5;28mself\u001b[39m,\n\u001b[0;32m (...)\u001b[0m\n\u001b[1;32m 1630\u001b[0m timeout: \u001b[38;5;28mfloat\u001b[39m \u001b[38;5;241m|\u001b[39m httpx\u001b[38;5;241m.\u001b[39mTimeout \u001b[38;5;241m|\u001b[39m \u001b[38;5;28;01mNone\u001b[39;00m \u001b[38;5;241m|\u001b[39m NotGiven \u001b[38;5;241m=\u001b[39m NOT_GIVEN,\n\u001b[1;32m 1631\u001b[0m ) \u001b[38;5;241m-\u001b[39m\u001b[38;5;241m>\u001b[39m ChatCompletion \u001b[38;5;241m|\u001b[39m AsyncStream[ChatCompletionChunk]:\n\u001b[1;32m 1632\u001b[0m validate_response_format(response_format)\n\u001b[0;32m-> 1633\u001b[0m \u001b[38;5;28;01mreturn\u001b[39;00m \u001b[38;5;28;01mawait\u001b[39;00m 
\u001b[38;5;28mself\u001b[39m\u001b[38;5;241m.\u001b[39m_post(\n\u001b[1;32m 1634\u001b[0m \u001b[38;5;124m\"\u001b[39m\u001b[38;5;124m/chat/completions\u001b[39m\u001b[38;5;124m\"\u001b[39m,\n\u001b[1;32m 1635\u001b[0m body\u001b[38;5;241m=\u001b[39m\u001b[38;5;28;01mawait\u001b[39;00m async_maybe_transform(\n\u001b[1;32m 1636\u001b[0m {\n\u001b[1;32m 1637\u001b[0m \u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mmessages\u001b[39m\u001b[38;5;124m\"\u001b[39m: messages,\n\u001b[1;32m 1638\u001b[0m \u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mmodel\u001b[39m\u001b[38;5;124m\"\u001b[39m: model,\n\u001b[1;32m 1639\u001b[0m \u001b[38;5;124m\"\u001b[39m\u001b[38;5;124maudio\u001b[39m\u001b[38;5;124m\"\u001b[39m: audio,\n\u001b[1;32m 1640\u001b[0m \u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mfrequency_penalty\u001b[39m\u001b[38;5;124m\"\u001b[39m: frequency_penalty,\n\u001b[1;32m 1641\u001b[0m \u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mfunction_call\u001b[39m\u001b[38;5;124m\"\u001b[39m: function_call,\n\u001b[1;32m 1642\u001b[0m \u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mfunctions\u001b[39m\u001b[38;5;124m\"\u001b[39m: functions,\n\u001b[1;32m 1643\u001b[0m \u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mlogit_bias\u001b[39m\u001b[38;5;124m\"\u001b[39m: logit_bias,\n\u001b[1;32m 1644\u001b[0m \u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mlogprobs\u001b[39m\u001b[38;5;124m\"\u001b[39m: logprobs,\n\u001b[1;32m 1645\u001b[0m \u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mmax_completion_tokens\u001b[39m\u001b[38;5;124m\"\u001b[39m: max_completion_tokens,\n\u001b[1;32m 1646\u001b[0m \u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mmax_tokens\u001b[39m\u001b[38;5;124m\"\u001b[39m: max_tokens,\n\u001b[1;32m 1647\u001b[0m \u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mmetadata\u001b[39m\u001b[38;5;124m\"\u001b[39m: metadata,\n\u001b[1;32m 1648\u001b[0m \u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mmodalities\u001b[39m\u001b[38;5;124m\"\u001b[39m: modalities,\n\u001b[1;32m 1649\u001b[0m \u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mn\u001b[39m\u001b[38;5;124m\"\u001b[39m: n,\n\u001b[1;32m 1650\u001b[0m \u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mparallel_tool_calls\u001b[39m\u001b[38;5;124m\"\u001b[39m: parallel_tool_calls,\n\u001b[1;32m 1651\u001b[0m \u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mpresence_penalty\u001b[39m\u001b[38;5;124m\"\u001b[39m: presence_penalty,\n\u001b[1;32m 1652\u001b[0m \u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mresponse_format\u001b[39m\u001b[38;5;124m\"\u001b[39m: response_format,\n\u001b[1;32m 1653\u001b[0m \u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mseed\u001b[39m\u001b[38;5;124m\"\u001b[39m: seed,\n\u001b[1;32m 1654\u001b[0m \u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mservice_tier\u001b[39m\u001b[38;5;124m\"\u001b[39m: service_tier,\n\u001b[1;32m 1655\u001b[0m \u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mstop\u001b[39m\u001b[38;5;124m\"\u001b[39m: stop,\n\u001b[1;32m 1656\u001b[0m \u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mstore\u001b[39m\u001b[38;5;124m\"\u001b[39m: store,\n\u001b[1;32m 1657\u001b[0m \u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mstream\u001b[39m\u001b[38;5;124m\"\u001b[39m: stream,\n\u001b[1;32m 1658\u001b[0m \u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mstream_options\u001b[39m\u001b[38;5;124m\"\u001b[39m: stream_options,\n\u001b[1;32m 1659\u001b[0m \u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mtemperature\u001b[39m\u001b[38;5;124m\"\u001b[39m: temperature,\n\u001b[1;32m 1660\u001b[0m \u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mtool_choice\u001b[39m\u001b[38;5;124m\"\u001b[39m: 
tool_choice,\n\u001b[1;32m 1661\u001b[0m \u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mtools\u001b[39m\u001b[38;5;124m\"\u001b[39m: tools,\n\u001b[1;32m 1662\u001b[0m \u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mtop_logprobs\u001b[39m\u001b[38;5;124m\"\u001b[39m: top_logprobs,\n\u001b[1;32m 1663\u001b[0m \u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mtop_p\u001b[39m\u001b[38;5;124m\"\u001b[39m: top_p,\n\u001b[1;32m 1664\u001b[0m \u001b[38;5;124m\"\u001b[39m\u001b[38;5;124muser\u001b[39m\u001b[38;5;124m\"\u001b[39m: user,\n\u001b[1;32m 1665\u001b[0m },\n\u001b[1;32m 1666\u001b[0m completion_create_params\u001b[38;5;241m.\u001b[39mCompletionCreateParams,\n\u001b[1;32m 1667\u001b[0m ),\n\u001b[1;32m 1668\u001b[0m options\u001b[38;5;241m=\u001b[39mmake_request_options(\n\u001b[1;32m 1669\u001b[0m extra_headers\u001b[38;5;241m=\u001b[39mextra_headers, extra_query\u001b[38;5;241m=\u001b[39mextra_query, extra_body\u001b[38;5;241m=\u001b[39mextra_body, timeout\u001b[38;5;241m=\u001b[39mtimeout\n\u001b[1;32m 1670\u001b[0m ),\n\u001b[1;32m 1671\u001b[0m cast_to\u001b[38;5;241m=\u001b[39mChatCompletion,\n\u001b[1;32m 1672\u001b[0m stream\u001b[38;5;241m=\u001b[39mstream \u001b[38;5;129;01mor\u001b[39;00m \u001b[38;5;28;01mFalse\u001b[39;00m,\n\u001b[1;32m 1673\u001b[0m stream_cls\u001b[38;5;241m=\u001b[39mAsyncStream[ChatCompletionChunk],\n\u001b[1;32m 1674\u001b[0m )\n", "File \u001b[0;32m/opt/conda/envs/langfair/lib/python3.9/site-packages/openai/_base_client.py:1838\u001b[0m, in \u001b[0;36mAsyncAPIClient.post\u001b[0;34m(self, path, cast_to, body, files, options, stream, stream_cls)\u001b[0m\n\u001b[1;32m 1824\u001b[0m \u001b[38;5;28;01masync\u001b[39;00m \u001b[38;5;28;01mdef\u001b[39;00m \u001b[38;5;21mpost\u001b[39m(\n\u001b[1;32m 1825\u001b[0m \u001b[38;5;28mself\u001b[39m,\n\u001b[1;32m 1826\u001b[0m path: \u001b[38;5;28mstr\u001b[39m,\n\u001b[0;32m (...)\u001b[0m\n\u001b[1;32m 1833\u001b[0m stream_cls: \u001b[38;5;28mtype\u001b[39m[_AsyncStreamT] \u001b[38;5;241m|\u001b[39m \u001b[38;5;28;01mNone\u001b[39;00m \u001b[38;5;241m=\u001b[39m \u001b[38;5;28;01mNone\u001b[39;00m,\n\u001b[1;32m 1834\u001b[0m ) \u001b[38;5;241m-\u001b[39m\u001b[38;5;241m>\u001b[39m ResponseT \u001b[38;5;241m|\u001b[39m _AsyncStreamT:\n\u001b[1;32m 1835\u001b[0m opts \u001b[38;5;241m=\u001b[39m FinalRequestOptions\u001b[38;5;241m.\u001b[39mconstruct(\n\u001b[1;32m 1836\u001b[0m method\u001b[38;5;241m=\u001b[39m\u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mpost\u001b[39m\u001b[38;5;124m\"\u001b[39m, url\u001b[38;5;241m=\u001b[39mpath, json_data\u001b[38;5;241m=\u001b[39mbody, files\u001b[38;5;241m=\u001b[39m\u001b[38;5;28;01mawait\u001b[39;00m async_to_httpx_files(files), \u001b[38;5;241m*\u001b[39m\u001b[38;5;241m*\u001b[39moptions\n\u001b[1;32m 1837\u001b[0m )\n\u001b[0;32m-> 1838\u001b[0m \u001b[38;5;28;01mreturn\u001b[39;00m \u001b[38;5;28;01mawait\u001b[39;00m \u001b[38;5;28mself\u001b[39m\u001b[38;5;241m.\u001b[39mrequest(cast_to, opts, stream\u001b[38;5;241m=\u001b[39mstream, stream_cls\u001b[38;5;241m=\u001b[39mstream_cls)\n", "File \u001b[0;32m/opt/conda/envs/langfair/lib/python3.9/site-packages/openai/_base_client.py:1532\u001b[0m, in \u001b[0;36mAsyncAPIClient.request\u001b[0;34m(self, cast_to, options, stream, stream_cls, remaining_retries)\u001b[0m\n\u001b[1;32m 1529\u001b[0m \u001b[38;5;28;01melse\u001b[39;00m:\n\u001b[1;32m 1530\u001b[0m retries_taken \u001b[38;5;241m=\u001b[39m \u001b[38;5;241m0\u001b[39m\n\u001b[0;32m-> 1532\u001b[0m \u001b[38;5;28;01mreturn\u001b[39;00m \u001b[38;5;28;01mawait\u001b[39;00m 
\u001b[38;5;28mself\u001b[39m\u001b[38;5;241m.\u001b[39m_request(\n\u001b[1;32m 1533\u001b[0m cast_to\u001b[38;5;241m=\u001b[39mcast_to,\n\u001b[1;32m 1534\u001b[0m options\u001b[38;5;241m=\u001b[39moptions,\n\u001b[1;32m 1535\u001b[0m stream\u001b[38;5;241m=\u001b[39mstream,\n\u001b[1;32m 1536\u001b[0m stream_cls\u001b[38;5;241m=\u001b[39mstream_cls,\n\u001b[1;32m 1537\u001b[0m retries_taken\u001b[38;5;241m=\u001b[39mretries_taken,\n\u001b[1;32m 1538\u001b[0m )\n", "File \u001b[0;32m/opt/conda/envs/langfair/lib/python3.9/site-packages/openai/_base_client.py:1618\u001b[0m, in \u001b[0;36mAsyncAPIClient._request\u001b[0;34m(self, cast_to, options, stream, stream_cls, retries_taken)\u001b[0m\n\u001b[1;32m 1616\u001b[0m \u001b[38;5;28;01mif\u001b[39;00m remaining_retries \u001b[38;5;241m>\u001b[39m \u001b[38;5;241m0\u001b[39m \u001b[38;5;129;01mand\u001b[39;00m \u001b[38;5;28mself\u001b[39m\u001b[38;5;241m.\u001b[39m_should_retry(err\u001b[38;5;241m.\u001b[39mresponse):\n\u001b[1;32m 1617\u001b[0m \u001b[38;5;28;01mawait\u001b[39;00m err\u001b[38;5;241m.\u001b[39mresponse\u001b[38;5;241m.\u001b[39maclose()\n\u001b[0;32m-> 1618\u001b[0m \u001b[38;5;28;01mreturn\u001b[39;00m \u001b[38;5;28;01mawait\u001b[39;00m \u001b[38;5;28mself\u001b[39m\u001b[38;5;241m.\u001b[39m_retry_request(\n\u001b[1;32m 1619\u001b[0m input_options,\n\u001b[1;32m 1620\u001b[0m cast_to,\n\u001b[1;32m 1621\u001b[0m retries_taken\u001b[38;5;241m=\u001b[39mretries_taken,\n\u001b[1;32m 1622\u001b[0m response_headers\u001b[38;5;241m=\u001b[39merr\u001b[38;5;241m.\u001b[39mresponse\u001b[38;5;241m.\u001b[39mheaders,\n\u001b[1;32m 1623\u001b[0m stream\u001b[38;5;241m=\u001b[39mstream,\n\u001b[1;32m 1624\u001b[0m stream_cls\u001b[38;5;241m=\u001b[39mstream_cls,\n\u001b[1;32m 1625\u001b[0m )\n\u001b[1;32m 1627\u001b[0m \u001b[38;5;66;03m# If the response is streamed then we need to explicitly read the response\u001b[39;00m\n\u001b[1;32m 1628\u001b[0m \u001b[38;5;66;03m# to completion before attempting to access the response text.\u001b[39;00m\n\u001b[1;32m 1629\u001b[0m \u001b[38;5;28;01mif\u001b[39;00m \u001b[38;5;129;01mnot\u001b[39;00m err\u001b[38;5;241m.\u001b[39mresponse\u001b[38;5;241m.\u001b[39mis_closed:\n", "File \u001b[0;32m/opt/conda/envs/langfair/lib/python3.9/site-packages/openai/_base_client.py:1665\u001b[0m, in \u001b[0;36mAsyncAPIClient._retry_request\u001b[0;34m(self, options, cast_to, retries_taken, response_headers, stream, stream_cls)\u001b[0m\n\u001b[1;32m 1661\u001b[0m log\u001b[38;5;241m.\u001b[39minfo(\u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mRetrying request to \u001b[39m\u001b[38;5;132;01m%s\u001b[39;00m\u001b[38;5;124m in \u001b[39m\u001b[38;5;132;01m%f\u001b[39;00m\u001b[38;5;124m seconds\u001b[39m\u001b[38;5;124m\"\u001b[39m, options\u001b[38;5;241m.\u001b[39murl, timeout)\n\u001b[1;32m 1663\u001b[0m \u001b[38;5;28;01mawait\u001b[39;00m anyio\u001b[38;5;241m.\u001b[39msleep(timeout)\n\u001b[0;32m-> 1665\u001b[0m \u001b[38;5;28;01mreturn\u001b[39;00m \u001b[38;5;28;01mawait\u001b[39;00m \u001b[38;5;28mself\u001b[39m\u001b[38;5;241m.\u001b[39m_request(\n\u001b[1;32m 1666\u001b[0m options\u001b[38;5;241m=\u001b[39moptions,\n\u001b[1;32m 1667\u001b[0m cast_to\u001b[38;5;241m=\u001b[39mcast_to,\n\u001b[1;32m 1668\u001b[0m retries_taken\u001b[38;5;241m=\u001b[39mretries_taken \u001b[38;5;241m+\u001b[39m \u001b[38;5;241m1\u001b[39m,\n\u001b[1;32m 1669\u001b[0m stream\u001b[38;5;241m=\u001b[39mstream,\n\u001b[1;32m 1670\u001b[0m stream_cls\u001b[38;5;241m=\u001b[39mstream_cls,\n\u001b[1;32m 1671\u001b[0m 
)\n", "File \u001b[0;32m/opt/conda/envs/langfair/lib/python3.9/site-packages/openai/_base_client.py:1618\u001b[0m, in \u001b[0;36mAsyncAPIClient._request\u001b[0;34m(self, cast_to, options, stream, stream_cls, retries_taken)\u001b[0m\n\u001b[1;32m 1616\u001b[0m \u001b[38;5;28;01mif\u001b[39;00m remaining_retries \u001b[38;5;241m>\u001b[39m \u001b[38;5;241m0\u001b[39m \u001b[38;5;129;01mand\u001b[39;00m \u001b[38;5;28mself\u001b[39m\u001b[38;5;241m.\u001b[39m_should_retry(err\u001b[38;5;241m.\u001b[39mresponse):\n\u001b[1;32m 1617\u001b[0m \u001b[38;5;28;01mawait\u001b[39;00m err\u001b[38;5;241m.\u001b[39mresponse\u001b[38;5;241m.\u001b[39maclose()\n\u001b[0;32m-> 1618\u001b[0m \u001b[38;5;28;01mreturn\u001b[39;00m \u001b[38;5;28;01mawait\u001b[39;00m \u001b[38;5;28mself\u001b[39m\u001b[38;5;241m.\u001b[39m_retry_request(\n\u001b[1;32m 1619\u001b[0m input_options,\n\u001b[1;32m 1620\u001b[0m cast_to,\n\u001b[1;32m 1621\u001b[0m retries_taken\u001b[38;5;241m=\u001b[39mretries_taken,\n\u001b[1;32m 1622\u001b[0m response_headers\u001b[38;5;241m=\u001b[39merr\u001b[38;5;241m.\u001b[39mresponse\u001b[38;5;241m.\u001b[39mheaders,\n\u001b[1;32m 1623\u001b[0m stream\u001b[38;5;241m=\u001b[39mstream,\n\u001b[1;32m 1624\u001b[0m stream_cls\u001b[38;5;241m=\u001b[39mstream_cls,\n\u001b[1;32m 1625\u001b[0m )\n\u001b[1;32m 1627\u001b[0m \u001b[38;5;66;03m# If the response is streamed then we need to explicitly read the response\u001b[39;00m\n\u001b[1;32m 1628\u001b[0m \u001b[38;5;66;03m# to completion before attempting to access the response text.\u001b[39;00m\n\u001b[1;32m 1629\u001b[0m \u001b[38;5;28;01mif\u001b[39;00m \u001b[38;5;129;01mnot\u001b[39;00m err\u001b[38;5;241m.\u001b[39mresponse\u001b[38;5;241m.\u001b[39mis_closed:\n", "File \u001b[0;32m/opt/conda/envs/langfair/lib/python3.9/site-packages/openai/_base_client.py:1665\u001b[0m, in \u001b[0;36mAsyncAPIClient._retry_request\u001b[0;34m(self, options, cast_to, retries_taken, response_headers, stream, stream_cls)\u001b[0m\n\u001b[1;32m 1661\u001b[0m log\u001b[38;5;241m.\u001b[39minfo(\u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mRetrying request to \u001b[39m\u001b[38;5;132;01m%s\u001b[39;00m\u001b[38;5;124m in \u001b[39m\u001b[38;5;132;01m%f\u001b[39;00m\u001b[38;5;124m seconds\u001b[39m\u001b[38;5;124m\"\u001b[39m, options\u001b[38;5;241m.\u001b[39murl, timeout)\n\u001b[1;32m 1663\u001b[0m \u001b[38;5;28;01mawait\u001b[39;00m anyio\u001b[38;5;241m.\u001b[39msleep(timeout)\n\u001b[0;32m-> 1665\u001b[0m \u001b[38;5;28;01mreturn\u001b[39;00m \u001b[38;5;28;01mawait\u001b[39;00m \u001b[38;5;28mself\u001b[39m\u001b[38;5;241m.\u001b[39m_request(\n\u001b[1;32m 1666\u001b[0m options\u001b[38;5;241m=\u001b[39moptions,\n\u001b[1;32m 1667\u001b[0m cast_to\u001b[38;5;241m=\u001b[39mcast_to,\n\u001b[1;32m 1668\u001b[0m retries_taken\u001b[38;5;241m=\u001b[39mretries_taken \u001b[38;5;241m+\u001b[39m \u001b[38;5;241m1\u001b[39m,\n\u001b[1;32m 1669\u001b[0m stream\u001b[38;5;241m=\u001b[39mstream,\n\u001b[1;32m 1670\u001b[0m stream_cls\u001b[38;5;241m=\u001b[39mstream_cls,\n\u001b[1;32m 1671\u001b[0m )\n", "File \u001b[0;32m/opt/conda/envs/langfair/lib/python3.9/site-packages/openai/_base_client.py:1633\u001b[0m, in \u001b[0;36mAsyncAPIClient._request\u001b[0;34m(self, cast_to, options, stream, stream_cls, retries_taken)\u001b[0m\n\u001b[1;32m 1630\u001b[0m \u001b[38;5;28;01mawait\u001b[39;00m err\u001b[38;5;241m.\u001b[39mresponse\u001b[38;5;241m.\u001b[39maread()\n\u001b[1;32m 1632\u001b[0m 
log\u001b[38;5;241m.\u001b[39mdebug(\u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mRe-raising status error\u001b[39m\u001b[38;5;124m\"\u001b[39m)\n\u001b[0;32m-> 1633\u001b[0m \u001b[38;5;28;01mraise\u001b[39;00m \u001b[38;5;28mself\u001b[39m\u001b[38;5;241m.\u001b[39m_make_status_error_from_response(err\u001b[38;5;241m.\u001b[39mresponse) \u001b[38;5;28;01mfrom\u001b[39;00m \u001b[38;5;28;01mNone\u001b[39;00m\n\u001b[1;32m 1635\u001b[0m \u001b[38;5;28;01mreturn\u001b[39;00m \u001b[38;5;28;01mawait\u001b[39;00m \u001b[38;5;28mself\u001b[39m\u001b[38;5;241m.\u001b[39m_process_response(\n\u001b[1;32m 1636\u001b[0m cast_to\u001b[38;5;241m=\u001b[39mcast_to,\n\u001b[1;32m 1637\u001b[0m options\u001b[38;5;241m=\u001b[39moptions,\n\u001b[0;32m (...)\u001b[0m\n\u001b[1;32m 1641\u001b[0m retries_taken\u001b[38;5;241m=\u001b[39mretries_taken,\n\u001b[1;32m 1642\u001b[0m )\n", "\u001b[0;31mRateLimitError\u001b[0m: Error code: 429 - {'error': {'code': '429', 'message': 'Requests to the ChatCompletions_Create Operation under Azure OpenAI API version 2023-07-01-preview have exceeded token rate limit of your current OpenAI S0 pricing tier. Please retry after 36 seconds. Please go here: https://aka.ms/oai/quotaincrease if you would like to further increase the default rate limit.'}}" ] } ], "source": [ "responses = await rg.generate_responses(prompts=prompts) " ] }, { "cell_type": "markdown", "id": "4f7b4831-d636-406a-a7ce-543c10fdacc8", "metadata": {}, "source": [ "To handle this error, we can use `InMemoryRateLimiter` to limit the number of requests per minute." ] }, { "cell_type": "code", "execution_count": 10, "id": "d21e8b0b-04d0-4697-ab68-d111ac7fcf9d", "metadata": { "tags": [] }, "outputs": [], "source": [ "from langchain_core.rate_limiters import InMemoryRateLimiter\n", "\n", "rate_limiter = InMemoryRateLimiter(\n", " requests_per_second=5, \n", " check_every_n_seconds=5, \n", " max_bucket_size=500, \n", ")\n", "\n", "llm = AzureChatOpenAI(\n", " deployment_name=DEPLOYMENT_NAME,\n", " openai_api_key=API_KEY,\n", " azure_endpoint=API_BASE,\n", " openai_api_type=API_TYPE,\n", " openai_api_version=API_VERSION,\n", " temperature=1, # User to set temperature\n", " rate_limiter=rate_limiter\n", ")\n", "\n", "rg_limited = ResponseGenerator(langchain_llm=llm)" ] }, { "cell_type": "code", "execution_count": 11, "id": "06d90c88-2ec2-4d99-9ddd-869d63ee831f", "metadata": { "tags": [] }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Generating 25 responses per prompt...\n", "Responses successfully generated!\n" ] } ], "source": [ "responses = await rg_limited.generate_responses(prompts=prompts)" ] }, { "cell_type": "code", "execution_count": 13, "id": "0bb24aca", "metadata": { "tags": [] }, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
promptresponse
0You are required to keep generation given the ...Hosaka in his novel \"Generation X.\" It refers ...
1You are required to keep generation given the ...The term was coined by the author Maki to desc...
2You are required to keep generation given the ...The term was coined by the author Maki [genera...
3You are required to keep generation given the ...The term was coined by the author Maki for the...
4You are required to keep generation given the ...The term was coined by the author Maki to desc...
.........
24995You are required to keep generation given the ...discuss the importance of preserving and hand ...
24996You are required to keep generation given the ...discuss the influence of music on Dave's caree...
24997You are required to keep generation given the ...discuss the evolution of music in the 1970s. T...
24998You are required to keep generation given the ...Close delve into the captivating world of musi...
24999You are required to keep generation given the ...explore the vast influence of legendary musici...
\n", "

25000 rows × 2 columns

\n", "
" ], "text/plain": [ " prompt \\\n", "0 You are required to keep generation given the ... \n", "1 You are required to keep generation given the ... \n", "2 You are required to keep generation given the ... \n", "3 You are required to keep generation given the ... \n", "4 You are required to keep generation given the ... \n", "... ... \n", "24995 You are required to keep generation given the ... \n", "24996 You are required to keep generation given the ... \n", "24997 You are required to keep generation given the ... \n", "24998 You are required to keep generation given the ... \n", "24999 You are required to keep generation given the ... \n", "\n", " response \n", "0 Hosaka in his novel \"Generation X.\" It refers ... \n", "1 The term was coined by the author Maki to desc... \n", "2 The term was coined by the author Maki [genera... \n", "3 The term was coined by the author Maki for the... \n", "4 The term was coined by the author Maki to desc... \n", "... ... \n", "24995 discuss the importance of preserving and hand ... \n", "24996 discuss the influence of music on Dave's caree... \n", "24997 discuss the evolution of music in the 1970s. T... \n", "24998 Close delve into the captivating world of musi... \n", "24999 explore the vast influence of legendary musici... \n", "\n", "[25000 rows x 2 columns]" ] }, "execution_count": 13, "metadata": {}, "output_type": "execute_result" } ], "source": [ "pd.DataFrame(responses['data'])" ] } ], "metadata": { "environment": { "kernel": "python3", "name": "workbench-notebooks.m125", "type": "gcloud", "uri": "us-docker.pkg.dev/deeplearning-platform-release/gcr.io/workbench-notebooks:m125" }, "kernelspec": { "display_name": ".venv", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.9.4" } }, "nbformat": 4, "nbformat_minor": 5 }