Merge pull request groq#44 from S4mpl3r/main

Add notebook for structured output generation with Instructor
mmrech · Sep 9, 2024 · 0309226
2 parents 8792319 + b450411
commit 0309226
Showing 2 changed files with 312 additions and 0 deletions.
diff --git a/tutorials/structured-output-instructor/.env.example b/tutorials/structured-output-instructor/.env.example
@@ -0,0 +1 @@
+GROQ_API_KEY=
diff --git a/tutorials/structured-output-instructor/structured_output_instructor.ipynb b/tutorials/structured-output-instructor/structured_output_instructor.ipynb
@@ -0,0 +1,311 @@
+{
+ "cells": [
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "# Structured Output with Groq and Instructor\n",
+    "Large Language Models (LLMs) are often used to build chatbots and conversational agents, but many real-world applications need something else: structured, machine-readable output rather than free-form dialogue.\n",
+    "\n",
+    "Consider a typical scenario: we want an LLM to produce structured JSON data. Python's `json` module can parse that output, but it does not validate data types or enforce a consistent structure across responses, and checking these things by hand is tedious and error-prone. LLMs also occasionally omit a comma or a closing brace (`}`), which invalidates the entire JSON output.\n",
+    "\n",
+    "This is where the Instructor library comes in. If you have used OpenAI's [structured outputs](https://openai.com/index/introducing-structured-outputs-in-the-api/) feature, Instructor will feel familiar. Integrating Instructor with models served by Groq makes generating structured output both easier and more reliable. This guide walks through setting up that integration and then uses structured outputs to generate synthetic data for evaluating LLM-powered applications, a practical use case for LLMs."
+   ]
+  },
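+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "To make these two failure modes concrete, here is a minimal sketch that uses only Python's standard `json` module. The sample strings are made up for illustration and are not real model output:\n",
+    "```python\n",
+    "import json\n",
+    "\n",
+    "# Hypothetical model output that is missing its closing brace\n",
+    "truncated = '{\"name\": \"John Doe\", \"age\": 35'\n",
+    "try:\n",
+    "    json.loads(truncated)\n",
+    "    parsed_ok = True\n",
+    "except json.JSONDecodeError:\n",
+    "    parsed_ok = False  # malformed JSON is rejected outright\n",
+    "\n",
+    "# Hypothetical output that is valid JSON but has the wrong type for `age`\n",
+    "wrong_type = json.loads('{\"name\": \"John Doe\", \"age\": \"thirty-five\"}')\n",
+    "age_is_int = isinstance(wrong_type['age'], int)  # json.loads does not check types\n",
+    "\n",
+    "print(parsed_ok, age_is_int)\n",
+    "```\n",
+    "The first problem surfaces as an exception, but the second passes silently unless we add our own type checks."
+   ]
+  },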
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## 1. A Very Simple Use Case\n",
+    "\n",
+    "Let's dive right into how you can set up the `instructor` library with models powered by Groq to generate structured JSON outputs. We'll keep it simple and straightforward so you can get up and running quickly.\n",
+    "\n",
+    "### Grabbing Your API Key\n",
+    "\n",
+    "Before doing anything else, grab your Groq API key from the [Groq Console](https://console.groq.com/keys). If you don't already have a GroqCloud account, you can create one for free. Once you have the key, put it in a `.env` file alongside this notebook (you can rename the `.env.example` file in this directory to `.env`):\n",
+    "```bash\n",
+    "GROQ_API_KEY=<YOUR_API_KEY>\n",
+    "```\n",
+    "\n",
+    "### Installing the Necessary Libraries\n",
+    "\n",
+    "Install the required Python libraries. You'll need:\n",
+    "- groq\n",
+    "- instructor\n",
+    "- python-dotenv (for loading environment variables)\n",
+    "\n",
+    "Run the following command to install the libraries:\n",
+    "```bash\n",
+    "pip install -U groq instructor python-dotenv\n",
+    "```\n",
+    "\n",
+    "### Extracting Structured Data\n",
+    "\n",
+    "Let's consider a very simple use case. Imagine you want to extract user details such as name, age, and email from a piece of text. One option is to write a JSON schema for the user data by hand, include it in the prompt, and enable JSON mode by passing `response_format={\"type\": \"json_object\"}` to the `chat.completions.create()` method. However, writing JSON schemas by hand is cumbersome. Instead, `instructor` builds on Pydantic, which lets you describe the model's output structure as an ordinary Python class:"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 1,
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "Name: John Doe\n",
+      "Age: 35\n",
+      "Email: johndoe@example.com\n"
+     ]
+    }
+   ],
+   "source": [
+    "import instructor\n",
+    "from dotenv import load_dotenv\n",
+    "from pydantic import BaseModel\n",
+    "from groq import Groq\n",
+    "\n",
+    "# Load the Groq API key from .env file\n",
+    "load_dotenv()\n",
+    "\n",
+    "# Describe the desired output schema using pydantic models\n",
+    "class UserInfo(BaseModel):\n",
+    "    name: str\n",
+    "    age: int\n",
+    "    email: str\n",
+    "\n",
+    "# The text to extract data from\n",
+    "text = \"\"\"\n",
+    "John Doe, a 35-year-old software engineer from New York, has been working with large language models for several years.\n",
+    "His email address is johndoe@example.com.\n",
+    "\"\"\"\n",
+    "\n",
+    "# Patch Groq() with instructor, this is where the magic happens!\n",
+    "client = instructor.from_groq(Groq(), mode=instructor.Mode.JSON)\n",
+    "\n",
+    "# Call the API\n",
+    "user_info = client.chat.completions.create(\n",
+    "    model=\"llama-3.1-70b-versatile\",\n",
+    "    response_model=UserInfo, # Specify the response model\n",
+    "    messages=[\n",
+    "        {\"role\": \"system\", \"content\": \"Your job is to extract user information from the given text.\"},\n",
+    "        {\"role\": \"user\", \"content\": text}\n",
+    "    ],\n",
+    "    temperature=0.65,\n",
+    ")\n",
+    "\n",
+    "print(f\"Name: {user_info.name}\")\n",
+    "print(f\"Age: {user_info.age}\")\n",
+    "print(f\"Email: {user_info.email}\")"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "In the example above, we defined a simple Pydantic model, `UserInfo`, that specifies a person's name (a string), age (an integer), and email (a string). The `instructor` library validates the Groq model's output against this schema and returns a `UserInfo` instance rather than raw JSON text, which removes the need for manual validation and reduces the likelihood of errors creeping into your data."
+   ]
+  },
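+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "The schema check itself comes from Pydantic, and it can be tried without calling any API. The following is a minimal sketch, assuming Pydantic v2 (for `model_validate_json`); the two payload strings are made up and stand in for model output, and how `instructor` invokes validation internally is not shown here:\n",
+    "```python\n",
+    "from pydantic import BaseModel, ValidationError\n",
+    "\n",
+    "# Same fields as the UserInfo model defined above\n",
+    "class UserInfo(BaseModel):\n",
+    "    name: str\n",
+    "    age: int\n",
+    "    email: str\n",
+    "\n",
+    "# Made-up payloads standing in for model output\n",
+    "good = '{\"name\": \"John Doe\", \"age\": 35, \"email\": \"johndoe@example.com\"}'\n",
+    "bad = '{\"name\": \"John Doe\", \"age\": \"thirty-five\"}'\n",
+    "\n",
+    "user = UserInfo.model_validate_json(good)  # returns a typed UserInfo instance\n",
+    "\n",
+    "try:\n",
+    "    UserInfo.model_validate_json(bad)\n",
+    "    error_fields = []\n",
+    "except ValidationError as exc:\n",
+    "    # Each error records which field failed; `loc` is a tuple path to it\n",
+    "    error_fields = sorted(err['loc'][0] for err in exc.errors())\n",
+    "\n",
+    "print(user.age, error_fields)\n",
+    "```\n",
+    "Here the second payload fails on two fields: `age` cannot be parsed as an integer, and `email` is missing."
+   ]
+  },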
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## 2. A More Serious Use Case: Generating Synthetic Data\n",
+    "\n",
+    "Imagine you are designing a weather agent capable of calling functions (tools). This agent is given a `get_weather_info` tool to retrieve the latest weather information about a location. The JSON schema for this tool is provided here:\n",
+    "\n",
+    "```json\n",
+    "{\n",
+    "    \"name\": \"get_weather_info\",\n",
+    "    \"description\": \"Get the weather information for any location.\",\n",
+    "    \"parameters\": {\n",
+    "        \"type\": \"object\",\n",
+    "        \"properties\": {\n",
+    "            \"location\": {\n",
+    "                \"type\": \"string\",\n",
+    "                \"description\": \"The location for which we want to get the weather information (e.g., New York)\"\n",
+    "            }\n",
+    "        },\n",
+    "        \"required\": [\"location\"]\n",
+    "    }\n",
+    "}\n",
+    "```\n",
+    "\n",
+    "Our goal is to create a structured dataset of realistic examples that simulate how a user might request weather information in various scenarios. We want to use a large language model (LLM) to generate these examples for us and use them as an evaluation set to test our agent's capabilities. Without such an evaluation, we lack a way to understand the effects of our prompt adjustments. These examples will not only help us evaluate the agent's ability to use the `get_weather_info` tool correctly but also make it easy to detect if any prompt changes have negative effects.\n",
+    "\n",
+    "Now, let's use the `instructor` library with Groq to generate synthetic examples for our weather agent.\n",
+    "\n",
+    "### Defining the Task and Schema\n",
+    "\n",
+    "To generate these examples, we need to write a prompt that instructs the model to create scenarios where an agent would use the `get_weather_info` tool. We can use the following system prompt for this task:"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 2,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "from pprint import pprint\n",
+    "\n",
+    "import instructor\n",
+    "from dotenv import load_dotenv\n",
+    "from pydantic import BaseModel, Field\n",
+    "from groq import Groq\n",
+    "\n",
+    "# Load the Groq API key from .env file\n",
+    "load_dotenv()\n",
+    "\n",
+    "prompt = \"\"\"\n",
+    "I am designing a weather agent. This agent can talk to the user and also fetch the latest weather information.\n",
+    "It has access to the `get_weather_info` tool with the following JSON schema:\n",
+    "{json_schema}\n",
+    "\n",
+    "I want you to write some examples for `get_weather_info` so I can check that this functionality works correctly and can handle all the cases.\n",
+    "Now given the information so far and the JSON schema of the provided tool, write {num} examples.\n",
+    "Make sure each example is varied enough to cover common ways of requesting this functionality.\n",
+    "Make sure you fill the function parameters with the correct types when generating the output examples.\n",
+    "Make sure your output is valid JSON.\n",
+    "\"\"\""
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "We now need to specify the structure of the output. For this task, we want the output to include the example text, the tool to call, and the tool's parameters, like the following:\n",
+    "```json\n",
+    "{\n",
+    "    \"examples\": [\n",
+    "        {\n",
+    "            \"input_text\": \"Get the weather information for San Francisco.\",\n",
+    "            \"tool_name\": \"get_weather_info\",\n",
+    "            \"tool_parameters\": \"{\\\"location\\\":\\\"San Francisco\\\"}\"\n",
+    "        },\n",
+    "        ...\n",
+    "    ]\n",
+    "}\n",
+    "```\n",
+    "We can easily translate this structure into a Pydantic model like the following:"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 3,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "class Example(BaseModel):\n",
+    "    input_text: str = Field(description=\"The example text\")\n",
+    "    tool_name: str = Field(description=\"The tool name to call for this example\")\n",
+    "    tool_parameters: str = Field(description=\"An object containing the key-value pairs for the parameters of this tool as a JSON serializable STRING, make sure it is valid JSON and parameter values are of the correct type according to the tool schema\")\n",
+    "\n",
+    "class ResponseModel(BaseModel):\n",
+    "    examples: list[Example]"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "### Generating the Examples\n",
+    "Now let's call the Groq API with our custom prompt and ask it to generate 5 examples for us:"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 4,
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "<class '__main__.ResponseModel'>\n",
+      "[Example(input_text=\"What's the weather like in New York?\", tool_name='get_weather_info', tool_parameters='{\"location\": \"New York\"}'),\n",
+      " Example(input_text='Get the weather forecast for London', tool_name='get_weather_info', tool_parameters='{\"location\": \"London\"}'),\n",
+      " Example(input_text='I want to know the weather in Paris', tool_name='get_weather_info', tool_parameters='{\"location\": \"Paris\"}'),\n",
+      " Example(input_text='What is the weather like in Tokyo, Japan?', tool_name='get_weather_info', tool_parameters='{\"location\": \"Tokyo, Japan\"}'),\n",
+      " Example(input_text='Can you tell me the weather in Sydney, Australia?', tool_name='get_weather_info', tool_parameters='{\"location\": \"Sydney, Australia\"}')]\n"
+     ]
+    }
+   ],
+   "source": [
+    "# The schema for get_weather_info tool\n",
+    "tool_schema = {\n",
+    "    \"name\": \"get_weather_info\",\n",
+    "    \"description\": \"Get the weather information for any location.\",\n",
+    "    \"parameters\": {\n",
+    "        \"type\": \"object\",\n",
+    "        \"properties\": {\n",
+    "            \"location\": {\n",
+    "                \"type\": \"string\",\n",
+    "                \"description\": \"The location for which we want to get the weather information (e.g. New York)\"\n",
+    "            }\n",
+    "        },\n",
+    "        \"required\": [\"location\"]\n",
+    "    }\n",
+    "}\n",
+    "\n",
+    "# Patch Groq() with instructor, this is where the magic happens!\n",
+    "client = instructor.from_groq(Groq(), mode=instructor.Mode.JSON)\n",
+    "\n",
+    "# Call the API with our custom prompt and ResponseModel\n",
+    "response = client.chat.completions.create(\n",
+    "    model=\"llama-3.1-70b-versatile\",\n",
+    "    response_model=ResponseModel, # Specify the response model\n",
+    "    messages=[\n",
+    "        {\n",
+    "            \"role\": \"system\", \n",
+    "            \"content\": prompt.format(json_schema=tool_schema, num=5), # Pass the tool schema and number of examples to the prompt\n",
+    "        },\n",
+    "    ],\n",
+    "    temperature=0.65,\n",
+    "    max_tokens=8000,\n",
+    ")\n",
+    "\n",
+    "print(type(response))\n",
+    "pprint(response.examples)"
+   ]
+  },
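+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "Because `tool_parameters` is a JSON string rather than a nested object, Pydantic only checks that it is a string. One way to also check its contents against the tool schema is sketched below using only the standard library. `check_tool_call` and `JSON_TYPES` are hypothetical names introduced for this illustration, and the check covers only flat parameters with basic types:\n",
+    "```python\n",
+    "import json\n",
+    "\n",
+    "# Same tool schema as above, trimmed to the fields this check needs\n",
+    "tool_schema = {\n",
+    "    'name': 'get_weather_info',\n",
+    "    'parameters': {\n",
+    "        'type': 'object',\n",
+    "        'properties': {'location': {'type': 'string'}},\n",
+    "        'required': ['location'],\n",
+    "    },\n",
+    "}\n",
+    "\n",
+    "# Maps JSON Schema type names to Python types (only basic types are covered)\n",
+    "JSON_TYPES = {'string': str, 'integer': int, 'number': (int, float), 'boolean': bool}\n",
+    "\n",
+    "def check_tool_call(tool_name, tool_parameters, schema):\n",
+    "    # Hypothetical helper: returns a list of problems (an empty list means OK)\n",
+    "    if tool_name != schema['name']:\n",
+    "        return [f'unexpected tool: {tool_name}']\n",
+    "    try:\n",
+    "        params = json.loads(tool_parameters)\n",
+    "    except json.JSONDecodeError:\n",
+    "        return ['tool_parameters is not valid JSON']\n",
+    "    if not isinstance(params, dict):\n",
+    "        return ['tool_parameters is not a JSON object']\n",
+    "    problems = []\n",
+    "    props = schema['parameters']['properties']\n",
+    "    for key in schema['parameters']['required']:\n",
+    "        if key not in params:\n",
+    "            problems.append(f'missing parameter: {key}')\n",
+    "    for key, value in params.items():\n",
+    "        if key not in props:\n",
+    "            problems.append(f'unknown parameter: {key}')\n",
+    "        elif not isinstance(value, JSON_TYPES[props[key]['type']]):\n",
+    "            problems.append(f'wrong type for parameter: {key}')\n",
+    "    return problems\n",
+    "\n",
+    "# Made-up inputs: one well-formed call and one with a wrong parameter name\n",
+    "ok = check_tool_call('get_weather_info', '{\"location\": \"Paris\"}', tool_schema)\n",
+    "bad = check_tool_call('get_weather_info', '{\"city\": 42}', tool_schema)\n",
+    "print(ok, bad)\n",
+    "```\n",
+    "With the objects from this notebook, the same helper could be applied as `check_tool_call(ex.tool_name, ex.tool_parameters, tool_schema)` for each `ex` in `response.examples`."
+   ]
+  },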
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "As you can see, the call returned an object of type `ResponseModel` containing 5 correctly structured examples for our evaluation dataset. Great!\n",
+    "\n",
+    "## Conclusion\n",
+    "\n",
+    "With these generated examples, you can evaluate how well the weather agent uses the `get_weather_info` tool. By simulating various scenarios, you can test whether the agent correctly identifies when and how to call the tool with the appropriate parameters. This approach not only strengthens the evaluation process but also helps identify edge cases where the agent might struggle, allowing you to refine its performance before deploying it in a real-world environment.\n",
+    "\n",
+    "This notebook demonstrates just one of the use cases of structured outputs in LLMs. By combining the `instructor` library with Groq, you can create a set of structured examples with little code and use it to make your agentic workflows more robust and reliable.\n",
+    "\n",
+    "### Useful Links\n",
+    "- [Groq API Cookbook](https://github.com/groq/groq-api-cookbook)\n",
+    "- [Instructor Docs](https://python.useinstructor.com/)\n",
+    "- [OpenAI Structured Output Blog](https://openai.com/index/introducing-structured-outputs-in-the-api/)\n"
+   ]
+  }
+ ],
+ "metadata": {
+  "kernelspec": {
+   "display_name": ".venv",
+   "language": "python",
+   "name": "python3"
+  },
+  "language_info": {
+   "codemirror_mode": {
+    "name": "ipython",
+    "version": 3
+   },
+   "file_extension": ".py",
+   "mimetype": "text/x-python",
+   "name": "python",
+   "nbconvert_exporter": "python",
+   "pygments_lexer": "ipython3",
+   "version": "3.10.12"
+  }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 2
+}