groq · hozen-groq · Sep 15, 2024 · Jul 9, 2024 · Jul 9, 2024 · Jul 9, 2024
diff --git a/mixture-of-agents/mixture_of_agents.ipynb b/mixture-of-agents/mixture_of_agents.ipynb
@@ -0,0 +1,376 @@
+{
+ "cells": [
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "# Mixture of Agents Powered by Groq using Langchain LCEL\n",
+    "\n",
+    "Mixture of Agents (MoA) is an advanced approach in the field of Generative AI and Large Language Models (LLMs) that combines multiple AI models to produce more robust and comprehensive responses. This implementation showcases an agentic workflow, where multiple AI agents collaborate to solve complex tasks, leading to more nuanced and reliable outputs than single-model approaches.\n",
+    "\n",
+    "This notebook demonstrates the implementation of a Mixture of Agents (MoA) architecture using Langchain and Groq. The MoA approach combines multiple open source models to produce responses that are on par or better than SOTA proprietary models like GPT4.\n",
+    "\n",
+    "This tutorial will walk you through how to:\n",
+    "\n",
+    "1. Set up the environment and dependencies.\n",
+    "2. Create helper functions.\n",
+    "3. Configure and build the Mixture of Agents pipeline.\n",
+    "4. Chat with the Agent.\n",
+    "\n",
+    "![Mixture of Agents diagram](moa_diagram.svg)\n",
+    "\n",
+    "You can create a developer account for free at https://console.groq.com/ and generate a free API key to follow this tutorial!\n",
+    "\n",
+    "This implementation is based on the research paper:\n",
+    "\n",
+    "```\n",
+    "@article{wang2024mixture,\n",
+    "  title={Mixture-of-Agents Enhances Large Language Model Capabilities},\n",
+    "  author={Wang, Junlin and Wang, Jue and Athiwaratkun, Ben and Zhang, Ce and Zou, James},\n",
+    "  journal={arXiv preprint arXiv:2406.04692},\n",
+    "  year={2024}\n",
+    "}\n",
+    "```\n",
+    "The main difference between the implementation by the authors of the paper and this notebook is the addition of configurating system prompts of the agents within the layer.\n",
+    "We acknowledge the authors for their contributions to the field and encourage readers to refer to the original paper for a deeper understanding of the Mixture-of-Agents concept."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 1,
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "\n",
+      "\u001b[1m[\u001b[0m\u001b[34;49mnotice\u001b[0m\u001b[1;39;49m]\u001b[0m\u001b[39;49m A new release of pip is available: \u001b[0m\u001b[31;49m24.0\u001b[0m\u001b[39;49m -> \u001b[0m\u001b[32;49m24.1.2\u001b[0m\n",
+      "\u001b[1m[\u001b[0m\u001b[34;49mnotice\u001b[0m\u001b[1;39;49m]\u001b[0m\u001b[39;49m To update, run: \u001b[0m\u001b[32;49mpip install --upgrade pip\u001b[0m\n",
+      "\n",
+      "\u001b[1m[\u001b[0m\u001b[34;49mnotice\u001b[0m\u001b[1;39;49m]\u001b[0m\u001b[39;49m A new release of pip is available: \u001b[0m\u001b[31;49m24.0\u001b[0m\u001b[39;49m -> \u001b[0m\u001b[32;49m24.1.2\u001b[0m\n",
+      "\u001b[1m[\u001b[0m\u001b[34;49mnotice\u001b[0m\u001b[1;39;49m]\u001b[0m\u001b[39;49m To update, run: \u001b[0m\u001b[32;49mpip install --upgrade pip\u001b[0m\n",
+      "\n",
+      "\u001b[1m[\u001b[0m\u001b[34;49mnotice\u001b[0m\u001b[1;39;49m]\u001b[0m\u001b[39;49m A new release of pip is available: \u001b[0m\u001b[31;49m24.0\u001b[0m\u001b[39;49m -> \u001b[0m\u001b[32;49m24.1.2\u001b[0m\n",
+      "\u001b[1m[\u001b[0m\u001b[34;49mnotice\u001b[0m\u001b[1;39;49m]\u001b[0m\u001b[39;49m To update, run: \u001b[0m\u001b[32;49mpip install --upgrade pip\u001b[0m\n",
+      "\n",
+      "\u001b[1m[\u001b[0m\u001b[34;49mnotice\u001b[0m\u001b[1;39;49m]\u001b[0m\u001b[39;49m A new release of pip is available: \u001b[0m\u001b[31;49m24.0\u001b[0m\u001b[39;49m -> \u001b[0m\u001b[32;49m24.1.2\u001b[0m\n",
+      "\u001b[1m[\u001b[0m\u001b[34;49mnotice\u001b[0m\u001b[1;39;49m]\u001b[0m\u001b[39;49m To update, run: \u001b[0m\u001b[32;49mpip install --upgrade pip\u001b[0m\n"
+     ]
+    }
+   ],
+   "source": [
+    "!pip install langchain -q\n",
+    "!pip install langchain_groq -q\n",
+    "!pip install langchain_community -q"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 1,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "import nest_asyncio\n",
+    "nest_asyncio.apply()"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## 1. Set up the Environment and Dependencies\n",
+    "\n",
+    "To use [Groq](https://groq.com), you need to make sure that `GROQ_API_KEY` is specified as an environment variable."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 1,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "\n",
+    "import os\n",
+    "\n",
+    "os.environ[\"GROQ_API_KEY\"] = \"gsk...\""
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## 2. Create helper functions\n",
+    "\n",
+    "To help us configure our agentic workflow pipeline, we will need some helper functions:\n",
+    "- `create_agent` : This function takes in a system prompt and returns a Langchain Runnable that we can chain together using LCEL\n",
+    "- `concat_response` : This function takes in a dictionary of inputs, which within the pipeline will be to concenate and format the responses given by the layer agent and returns a string with the formatted response"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 2,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "from typing import Dict, Optional, Generator\n",
+    "from textwrap import dedent\n",
+    "\n",
+    "from langchain_groq import ChatGroq\n",
+    "from langchain.memory import ConversationBufferMemory\n",
+    "from langchain_core.runnables import RunnablePassthrough, RunnableLambda, RunnableSerializable\n",
+    "from langchain_core.output_parsers import StrOutputParser\n",
+    "from langchain.prompts import ChatPromptTemplate, MessagesPlaceholder\n",
+    "\n",
+    "\n",
+    "# Helper method to create an LCEL chain\n",
+    "def create_agent(\n",
+    "    system_prompt: str = \"You are a helpful assistant.\\n{helper_response}\",\n",
+    "    model_name: str = \"llama3-8b-8192\",\n",
+    "    **llm_kwargs\n",
+    ") -> RunnableSerializable[Dict, str]:\n",
+    "    \"\"\"Create a simple Langchain LCEL chain agent based on a system prompt\"\"\"\n",
+    "\n",
+    "    prompt = ChatPromptTemplate.from_messages([\n",
+    "        (\"system\", system_prompt),\n",
+    "        MessagesPlaceholder(variable_name=\"messages\", optional=True),\n",
+    "        (\"human\", \"{input}\")\n",
+    "    ])\n",
+    "\n",
+    "    assert 'helper_response' in prompt.input_variables, \"{helper_response} prompt variable not found in prompt. Please add it\" # To make sure we can add layer agent outputs into the prompt\n",
+    "    llm = ChatGroq(model=model_name, **llm_kwargs)\n",
+    "    \n",
+    "    chain = prompt | llm | StrOutputParser()\n",
+    "    return chain\n",
+    "\n",
+    "def concat_response(\n",
+    "    inputs: Dict[str, str],\n",
+    "    reference_system_prompt: Optional[str] = None\n",
+    ") -> str:\n",
+    "    \"\"\"Concatenate and format layer agent responses\"\"\"\n",
+    "\n",
+    "    REFERENCE_SYSTEM_PROMPT = dedent(\"\"\"\\\n",
+    "    You have been provided with a set of responses from various open-source models to the latest user query. \n",
+    "    Your task is to synthesize these responses into a single, high-quality response. \n",
+    "    It is crucial to critically evaluate the information provided in these responses, recognizing that some of it may be biased or incorrect. \n",
+    "    Your response should not simply replicate the given answers but should offer a refined, accurate, and comprehensive reply to the instruction. \n",
+    "    Ensure your response is well-structured, coherent, and adheres to the highest standards of accuracy and reliability.\n",
+    "    Responses from models:\n",
+    "    {responses}\n",
+    "    \"\"\")\n",
+    "    reference_system_prompt = reference_system_prompt or REFERENCE_SYSTEM_PROMPT\n",
+    "\n",
+    "    assert \"{responses}\" in reference_system_prompt, \"{responses} prompt variable not found in prompt. Please add it\"\n",
+    "    responses = \"\"\n",
+    "    res_list = []\n",
+    "    for i, out in enumerate(inputs.values()):\n",
+    "        responses += f\"{i}. {out}\\n\"\n",
+    "        res_list.append(out)\n",
+    "\n",
+    "    formatted_prompt = reference_system_prompt.format(responses=responses)\n",
+    "    return formatted_prompt"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## 3. Configure and build the Mixture of Agents pipeline.\n",
+    "\n",
+    "Let's configure and build out the whole workflow!\n",
+    "\n",
+    "Here is a breakdown of the different components:\n",
+    "- `CHAT_MEMORY` : This is used to store and retrieve the chat history of the workflow.\n",
+    "- `CYCLES` : Number of times the input and helper responses are passed through to the `LAYER_AGENT`\n",
+    "- `LAYER_AGENT` : Each agent within this layer agent runs in parallel, and the responses are concatenated using the `concat_response` helper function.\n",
+    "- `MAIN_AGENT` : The final agent that responds to the user's query based on the layer agents"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 3,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# Hyperparameters of agent\n",
+    "# Re run this if you want to delete chats\n",
+    "CHAT_MEMORY = ConversationBufferMemory(\n",
+    "    memory_key=\"messages\",\n",
+    "    return_messages=True\n",
+    ")\n",
+    "CYCLES = 3\n",
+    "LAYER_AGENT = ( # Each layer agent in this dictionary runs in parallel\n",
+    "    {\n",
+    "        'layer_agent_1': RunnablePassthrough() | create_agent(\n",
+    "            system_prompt=\"You are an expert planner agent. Break down and plan out how you can answer the user's question {helper_response}\",\n",
+    "            model_name='llama3-8b-8192'\n",
+    "        ),\n",
+    "        'layer_agent_2': RunnablePassthrough() | create_agent(\n",
+    "            system_prompt=\"Respond with a thought and then your response to the question. {helper_response}\",\n",
+    "            model_name='mixtral-8x7b-32768'\n",
+    "        ),\n",
+    "        'layer_agent_3': RunnablePassthrough() | create_agent(\n",
+    "            system_prompt=\"Think through your response step by step. {helper_response}\",\n",
+    "            model_name='gemma2-9b-it'\n",
+    "        ),\n",
+    "        # Add/Remove agents as needed...\n",
+    "    }\n",
+    "    |\n",
+    "    RunnableLambda(concat_response) # Format layer agent outputs\n",
+    ")\n",
+    "\n",
+    "MAIN_AGENT = create_agent(\n",
+    "    system_prompt=\"You are a helpful assistant named Bob.\\n{helper_response}\",\n",
+    "    model_name=\"llama3-70b-8192\",\n",
+    "    temperature=0.1,\n",
+    ")"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "### Build the mixture of agents pipeline and create the chat function to ask questions to the agents\n",
+    "\n",
+    "The `chat_stream` function takes in a query and passes it through the mixture of agents workflow.\n",
+    "The query is:\n",
+    "1. Passed through the LAYER_AGENT, which in parallel, generates responses from each of the layer agents and conctenates it using the `concat_response` function.\n",
+    "2. If `CYCLES` is more than 1, it passes through again through the LAYER_AGENT, this time with the previous concatenated responses and the user's query. This repeats `CYCLES` times.\n",
+    "3. The final layer concatenated response and the user's query is passed to the `MAIN_AGENT`, which then stream the final response as and when done. "
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 4,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "def chat_stream(query: str) -> Generator[str, None, None]:\n",
+    "    \"\"\"Run Mixture of Agents LCEL pipeline\"\"\"\n",
+    "\n",
+    "    llm_inp = {\n",
+    "    'input': query,\n",
+    "    'messages': CHAT_MEMORY.load_memory_variables({})['messages'],\n",
+    "    'helper_response': \"\"\n",
+    "    }\n",
+    "    for _ in range(CYCLES):\n",
+    "        llm_inp = {\n",
+    "            'input': query,\n",
+    "            'messages': CHAT_MEMORY.load_memory_variables({})['messages'],\n",
+    "            'helper_response': LAYER_AGENT.invoke(llm_inp)\n",
+    "        }\n",
+    "\n",
+    "    response = \"\"\n",
+    "    for chunk in MAIN_AGENT.stream(llm_inp):\n",
+    "        yield chunk\n",
+    "        response += chunk\n",
+    "    \n",
+    "    # Save response to memory\n",
+    "    CHAT_MEMORY.save_context({'input': query}, {'output': response})\n",
+    "    "
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## 4. Chat with the Agent\n",
+    "\n",
+    "Let's chat with our mixture of agents!"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 5,
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "\n",
+      "User: Write me 10 sentences that end with the word \"apple\"\n",
+      "AI: Here are 10 sentences that end with the word \"apple\":\n",
+      "\n",
+      "1. The farmer's market was filled with baskets of fresh, juicy apple.\n",
+      "2. The aroma of cinnamon and baked goods wafted from the oven, enticing everyone with a warm apple.\n",
+      "3. After a long hike, she rewarded herself with a refreshing, crunchy bite of apple.\n",
+      "4. The art teacher used a vibrant red apple as a still life model for the class apple.\n",
+      "5. The nutritionist recommended eating a daily serving of fiber-rich, antioxidant-packed apple.\n",
+      "6. The chef carefully arranged the sliced apples in a decorative pattern on top of the tart apple.\n",
+      "7. The little boy's eyes widened as he took a big bite out of the sweet, sticky caramel apple.\n",
+      "8. The ecologist studied the unique ecosystem of a rare and endangered species of apple.\n",
+      "9. The athlete refueled with a healthy snack of sliced, organic Granny Smith apple.\n",
+      "10. The autumn air was crisp and cool, carrying the scent of ripe, freshly picked apple.\n",
+      "\n",
+      "I hope you find these sentences helpful!\n",
+      "User: What about banana?\n",
+      "AI: Here are 10 sentences that end with the word \"banana\":\n",
+      "\n",
+      "1. The colorful fruit arrangement on the kitchen counter featured a bright yellow, perfectly ripe banana.\n",
+      "2. The smoothie recipe called for a frozen, creamy, and indulgent banana.\n",
+      "3. The athlete relied on a quick and convenient source of energy, such as a potassium-rich banana.\n",
+      "4. The chef used a ripe, sweet, and creamy banana as a natural sweetener in the recipe banana.\n",
+      "5. The nutritionist recommended combining a banana with a source of protein, such as yogurt or almond butter, for a balanced snack banana.\n",
+      "6. The artist used a banana as a whimsical and unexpected prop in their still life painting banana.\n",
+      "7. The runner munched on a banana for a quick energy boost before her morning run banana.\n",
+      "8. The tropical fruit salad featured a colorful mix of exotic fruits, including a sweet and creamy banana.\n",
+      "9. The childcare provider cut the banana into slices, making it easier for the toddler to eat banana.\n",
+      "10. The astronaut's space rations included a packet of dried, freeze-dried banana for a morale-boosting treat banana.\n",
+      "\n",
+      "I hope you find these sentences helpful!\n",
+      "User: quit\n",
+      "\n",
+      "Stopped by User\n",
+      "\n"
+     ]
+    }
+   ],
+   "source": [
+    "# Chat with Agent\n",
+    "while True:\n",
+    "    inp = input(\"\\nAsk a question: \")\n",
+    "    print(f\"\\nUser: {inp}\")\n",
+    "    if inp.lower() == \"quit\":\n",
+    "        print(\"\\nStopped by User\\n\")\n",
+    "        break\n",
+    "    stream = chat_stream(inp)\n",
+    "    print(f\"AI: \", end=\"\")\n",
+    "    for chunk in stream:\n",
+    "        print(chunk, end=\"\", flush=True)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Conclusion\n",
+    "In this notebook we demonstrated how to build a fairly complex agentic workflow using Groq's fast AI inference.\n",
+    "MoA and other agentic workflows offer significant advantages in working with Large Language Models (LLMs). By enabling LLMs to \"think,\" refine their responses, and break down complex tasks, these approaches enhance accuracy and problem-solving capabilities. Moreover, they present a cost-effective solution by allowing the use of smaller, open-source models in combination, even when multiple LLM calls are required. This notebook serves as an introduction to agentic workflows and in production, should be adapted to your use case and evaluated thoroughly.\n",
+    "\n",
+    "For a more Object Oriented approach, streamlit demo app and an easier way to configure the workflow please checkout [this repo](https://github.com/skapadia3214/groq-moa)."
+   ]
+  }
+ ],
+ "metadata": {
+  "kernelspec": {
+   "display_name": "Python 3 (ipykernel)",
+   "language": "python",
+   "name": "python3"
+  },
+  "language_info": {
+   "codemirror_mode": {
+    "name": "ipython",
+    "version": 3
+   },
+   "file_extension": ".py",
+   "mimetype": "text/x-python",
+   "name": "python",
+   "nbconvert_exporter": "python",
+   "pygments_lexer": "ipython3",
+   "version": "3.12.3"
+  }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 4
+}
diff --git a/mixture-of-agents/moa_diagram.svg b/mixture-of-agents/moa_diagram.svg