Jacob/docs (#5807)
* Add explanation to QA with sources page

* Fix anchor tag

* Standardize prereq format
jacoblee93 authored Jun 19, 2024
1 parent be5739e commit 387fe31
Showing 10 changed files with 119 additions and 55 deletions.
2 changes: 1 addition & 1 deletion docs/core_docs/docs/concepts.mdx
@@ -691,7 +691,7 @@ You can check out [this guide](/docs/how_to/streaming/#using-stream) for more de

#### `.streamEvents()`

<span data-heading-keywords="astream_events,stream_events,stream events"></span>
<span data-heading-keywords="streamEvents,stream events"></span>

While the `.stream()` method is intuitive, it can only return the final generated value of your chain. This is fine for single LLM calls,
but as you build more complex chains of several LLM calls together, you may want to use the intermediate values of
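A minimal sketch of what `.streamEvents()` looks like in practice (assuming `@langchain/openai` is installed; the model name and event filter are illustrative):

```typescript
import { ChatOpenAI } from "@langchain/openai";

const model = new ChatOpenAI({ model: "gpt-4o-mini" });

// Unlike .stream(), streamEvents emits intermediate events
// (model start, token chunks, model end), not just the final value
const eventStream = model.streamEvents("Hello!", { version: "v2" });

for await (const event of eventStream) {
  if (event.event === "on_chat_model_stream") {
    console.log(event.data.chunk.content);
  }
}
```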
50 changes: 32 additions & 18 deletions docs/core_docs/docs/how_to/qa_sources.ipynb
@@ -69,12 +69,12 @@
},
{
"cell_type": "code",
"execution_count": 3,
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import \"cheerio\";\n",
"import { CheerioWebBaseLoader } from \"langchain/document_loaders/web/cheerio\";\n",
"import { CheerioWebBaseLoader } from \"@langchain/community/document_loaders/web/cheerio\";\n",
"import { RecursiveCharacterTextSplitter } from \"langchain/text_splitter\";\n",
"import { MemoryVectorStore } from \"langchain/vectorstores/memory\"\n",
"import { OpenAIEmbeddings, ChatOpenAI } from \"@langchain/openai\";\n",
@@ -119,7 +119,7 @@
},
{
"cell_type": "code",
"execution_count": 4,
"execution_count": 2,
"metadata": {},
"outputs": [
{
@@ -139,16 +139,16 @@
},
{
"cell_type": "code",
"execution_count": 11,
"execution_count": 3,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"\u001b[32m\"Task decomposition is a technique used to break down complex tasks into smaller and simpler steps. I\"\u001b[39m... 208 more characters"
"\u001b[32m\"Task decomposition is a technique used to break down complex tasks into smaller and simpler steps. T\"\u001b[39m... 287 more characters"
]
},
"execution_count": 11,
"execution_count": 3,
"metadata": {},
"output_type": "execute_result"
}
@@ -213,7 +213,7 @@
" }\n",
" }\n",
" ],\n",
" answer: \u001b[32m\"Task decomposition is a technique used to break down complex tasks into smaller and simpler steps. I\"\u001b[39m... 256 more characters\n",
" answer: \u001b[32m\"Task decomposition is a technique used to break down complex tasks into smaller and simpler steps fo\"\u001b[39m... 232 more characters\n",
"}"
]
},
@@ -223,20 +223,34 @@
}
],
"source": [
"import { RunnableMap, RunnablePassthrough, RunnableSequence } from \"@langchain/core/runnables\";\n",
"import {\n",
" RunnableMap,\n",
" RunnablePassthrough,\n",
" RunnableSequence\n",
"} from \"@langchain/core/runnables\";\n",
"import { formatDocumentsAsString } from \"langchain/util/document\";\n",
"\n",
"const ragChainFromDocs = RunnableSequence.from([\n",
" RunnablePassthrough.assign({ context: (input) => formatDocumentsAsString(input.context) }),\n",
" prompt,\n",
" llm,\n",
" new StringOutputParser()\n",
"]);\n",
"\n",
"let ragChainWithSource = new RunnableMap({ steps: { context: retriever, question: new RunnablePassthrough() }})\n",
"ragChainWithSource = ragChainWithSource.assign({ answer: ragChainFromDocs });\n",
"const ragChainWithSources = RunnableMap.from({\n",
" // Return raw documents here for now since we want to return them at\n",
" // the end - we'll format in the next step of the chain\n",
" context: retriever,\n",
" question: new RunnablePassthrough(),\n",
"}).assign({\n",
" answer: RunnableSequence.from([\n",
" (input) => {\n",
" return {\n",
" // Now we format the documents as strings for the prompt\n",
" context: formatDocumentsAsString(input.context),\n",
" question: input.question\n",
" };\n",
" },\n",
" prompt,\n",
" llm,\n",
" new StringOutputParser()\n",
" ]),\n",
"})\n",
"\n",
"await ragChainWithSource.invoke(\"What is Task Decomposition\")"
"await ragChainWithSources.invoke(\"What is Task Decomposition\")"
]
},
{
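Since old and new lines are interleaved in the hunk above, here is the updated chain consolidated as plain TypeScript (a sketch assuming `retriever`, `prompt`, and `llm` are defined earlier in the notebook):

```typescript
import {
  RunnableMap,
  RunnablePassthrough,
  RunnableSequence,
} from "@langchain/core/runnables";
import { StringOutputParser } from "@langchain/core/output_parsers";
import { formatDocumentsAsString } from "langchain/util/document";

const ragChainWithSources = RunnableMap.from({
  // Return raw documents here so they can be surfaced in the final output
  context: retriever,
  question: new RunnablePassthrough(),
}).assign({
  answer: RunnableSequence.from([
    (input) => ({
      // Format the documents as strings only for the prompt
      context: formatDocumentsAsString(input.context),
      question: input.question,
    }),
    prompt,
    llm,
    new StringOutputParser(),
  ]),
});

await ragChainWithSources.invoke("What is Task Decomposition");
```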
20 changes: 10 additions & 10 deletions docs/core_docs/docs/tutorials/agents.mdx
@@ -4,23 +4,23 @@ sidebar_position: 4

# Build an Agent

:::info Prerequisites

This guide assumes familiarity with the following concepts:

- [Chat Models](/docs/concepts/#chat-models)
- [Tools](/docs/concepts/#tools)
- [Agents](/docs/concepts/#agents)

:::

By themselves, language models can't take actions - they just output text.
A big use case for LangChain is creating **agents**.
Agents are systems that use an LLM as a reasoning engine to determine which actions to take and what the inputs to those actions should be.
The results of those actions can then be fed back into the agent, which then determines whether more actions are needed or whether it is okay to finish.

In this tutorial we will build an agent that can interact with multiple different tools: one being a local database, the other being a search engine. You will be able to ask this agent questions, watch it call tools, and have conversations with it.

## Concepts

Concepts we will cover are:

- Using [language models](/docs/concepts/#chat-models), in particular their tool calling ability
- Creating a [Retriever](/docs/concepts/#retrievers) to expose specific information to our agent
- Using a Search [Tool](/docs/concepts/#tools) to look up things online
- Using [LangGraph Agents](/docs/concepts/#agents) which use an LLM to think about what to do and then execute upon that
- Debugging and tracing your application using [LangSmith](/docs/concepts/#langsmith)

## Setup: LangSmith

By definition, agents take a self-determined, input-dependent sequence of steps before returning a user-facing output. This makes debugging these systems particularly tricky, and observability particularly important.
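For orientation, the kind of agent this tutorial builds can be sketched as follows (assuming `@langchain/langgraph`, `@langchain/openai`, and the Tavily search tool from `@langchain/community`; the model name and question are illustrative):

```typescript
import { ChatOpenAI } from "@langchain/openai";
import { HumanMessage } from "@langchain/core/messages";
import { TavilySearchResults } from "@langchain/community/tools/tavily_search";
import { createReactAgent } from "@langchain/langgraph/prebuilt";

// The LLM acts as the reasoning engine: it decides which tool to call and
// with what input, and tool results are fed back in until it decides to finish
const agent = createReactAgent({
  llm: new ChatOpenAI({ model: "gpt-4o-mini" }),
  tools: [new TavilySearchResults({ maxResults: 2 })],
});

const result = await agent.invoke({
  messages: [new HumanMessage("What is the weather in San Francisco?")],
});
console.log(result.messages.at(-1)?.content);
```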
23 changes: 10 additions & 13 deletions docs/core_docs/docs/tutorials/chatbot.ipynb
@@ -22,10 +22,19 @@
"source": [
"## Overview\n",
"\n",
":::info Prerequisites\n",
"\n",
"This guide assumes familiarity with the following concepts:\n",
"\n",
"- [Chat Models](/docs/concepts/#chat-models)\n",
"- [Prompt Templates](/docs/concepts/#prompt-templates)\n",
"- [Chat History](/docs/concepts/#chat-history)\n",
"\n",
":::\n",
"\n",
"We'll go over an example of how to design and implement an LLM-powered chatbot. \n",
"This chatbot will be able to have a conversation and remember previous interactions.\n",
"\n",
"\n",
"Note that this chatbot that we build will only use the language model to have a conversation.\n",
"There are several other related concepts that you may be looking for:\n",
"\n",
Expand All @@ -34,18 +43,6 @@
"\n",
"This tutorial will cover the basics which will be helpful for those two more advanced topics, but feel free to skip directly to there should you choose.\n",
"\n",
"\n",
"## Concepts\n",
"\n",
"Here are a few of the high-level components we'll be working with:\n",
"\n",
"- [`Chat Models`](/docs/concepts/#chat-models). The chatbot interface is based around messages rather than raw text, and therefore is best suited to Chat Models rather than text LLMs.\n",
"- [`Prompt Templates`](/docs/concepts/#prompt-templates), which simplify the process of assembling prompts that combine default messages, user input, chat history, and (optionally) additional retrieved context.\n",
"- [`Chat History`](/docs/concepts/#chat-history), which allows a chatbot to \"remember\" past interactions and take them into account when responding to followup questions. \n",
"- Debugging and tracing your application using [LangSmith](/docs/concepts/#langsmith)\n",
"\n",
"We'll cover how to fit the above components together to create a powerful conversational chatbot.\n",
"\n",
"## Setup\n",
"\n",
"### Installation\n",
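The memory pattern this tutorial builds toward can be sketched minimally (assuming `@langchain/openai` is installed; the model name and session id are illustrative):

```typescript
import { ChatOpenAI } from "@langchain/openai";
import { HumanMessage } from "@langchain/core/messages";
import { ChatMessageHistory } from "langchain/stores/message/in_memory";
import { RunnableWithMessageHistory } from "@langchain/core/runnables";

const model = new ChatOpenAI({ model: "gpt-4o-mini" });
const messageHistory = new ChatMessageHistory();

// Wrap the model so every call reads from and appends to stored history
const withHistory = new RunnableWithMessageHistory({
  runnable: model,
  getMessageHistory: () => messageHistory,
});

const config = { configurable: { sessionId: "demo" } };
await withHistory.invoke([new HumanMessage("Hi! I'm Bob.")], config);
// Because the first exchange is stored, this follow-up can be answered:
await withHistory.invoke([new HumanMessage("What's my name?")], config);
```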
19 changes: 11 additions & 8 deletions docs/core_docs/docs/tutorials/extraction.ipynb
@@ -17,18 +17,21 @@
"source": [
"# Build an Extraction Chain\n",
"\n",
"In this tutorial, we will build a chain to extract structured information from unstructured text. \n",
":::info Prerequisites\n",
"\n",
"This guide assumes familiarity with the following concepts:\n",
"\n",
"- [Chat Models](/docs/concepts/#chat-models)\n",
"- [Tools](/docs/concepts/#tools)\n",
"- [Tool calling](/docs/concepts/#function-tool-calling)\n",
"\n",
":::{.callout-important}\n",
"This tutorial will only work with models that support **function/tool calling**\n",
":::\n",
"\n",
"## Concepts\n",
"In this tutorial, we will build a chain to extract structured information from unstructured text. \n",
"\n",
"Concepts we will cover are:\n",
"- Using [language models](/docs/concepts/#chat-models)\n",
"- Using [function/tool calling](/docs/concepts/#function-tool-calling)\n",
"- Debugging and tracing your application using [LangSmith](/docs/concepts/#langsmith)\n"
":::{.callout-important}\n",
"This tutorial will only work with models that support **function/tool calling**\n",
":::"
]
},
{
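The core of such an extraction chain is a schema plus a tool-calling model, roughly like this (a sketch assuming `zod` and `@langchain/openai`; the schema, model, and input are illustrative):

```typescript
import { z } from "zod";
import { ChatOpenAI } from "@langchain/openai";

// Describe the structure to extract; field descriptions guide the model
const personSchema = z.object({
  name: z.string().describe("The person's name"),
  occupation: z.string().optional().describe("The person's job, if mentioned"),
});

const llm = new ChatOpenAI({ model: "gpt-4o-mini" });

// Under the hood this binds the schema as a tool and parses the tool call,
// which is why a model with function/tool calling support is required
const extractor = llm.withStructuredOutput(personSchema);

const person = await extractor.invoke(
  "Anna works as a gardener in the city park."
);
// person is a typed object, e.g. { name: "Anna", occupation: "gardener" }
```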
12 changes: 12 additions & 0 deletions docs/core_docs/docs/tutorials/pdf_qa.ipynb
@@ -19,6 +19,18 @@
"source": [
"# Build a PDF ingestion and Question/Answering system\n",
"\n",
":::info Prerequisites\n",
"\n",
"This guide assumes familiarity with the following concepts:\n",
"\n",
"- [Document loaders](/docs/concepts/#document-loaders)\n",
"- [Chat models](/docs/concepts/#chat-models)\n",
"- [Embeddings](/docs/concepts/#embedding-models)\n",
"- [Vector stores](/docs/concepts/#vector-stores)\n",
"- [Retrieval-augmented generation](/docs/tutorials/rag/)\n",
"\n",
":::\n",
"\n",
"PDF files often hold crucial unstructured data unavailable from other sources. They can be quite lengthy, and unlike plain text files, cannot generally be fed directly into the prompt of a language model.\n",
"\n",
"In this tutorial, you'll create a system that can answer questions about PDF files. More specifically, you'll use a [Document Loader](/docs/concepts/#document-loaders) to load text in a format usable by an LLM, then build a retrieval-augmented generation (RAG) pipeline to answer questions, including citations from the source material.\n",
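The loading step looks roughly like this (a sketch assuming `@langchain/community` and its `pdf-parse` peer dependency are installed; the file path is illustrative):

```typescript
import { PDFLoader } from "@langchain/community/document_loaders/fs/pdf";

const loader = new PDFLoader("data/example.pdf");
const docs = await loader.load();

// One Document per page, with text content plus metadata such as the
// page number, ready to be split, embedded, and retrieved over
console.log(docs.length);
console.log(docs[0].metadata);
```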
14 changes: 14 additions & 0 deletions docs/core_docs/docs/tutorials/qa_chat_history.ipynb
@@ -6,6 +6,20 @@
"source": [
"# Conversational RAG\n",
"\n",
":::info Prerequisites\n",
"\n",
"This guide assumes familiarity with the following concepts:\n",
"\n",
"- [Chat history](/docs/concepts/#chat-history)\n",
"- [Chat models](/docs/concepts/#chat-models)\n",
"- [Embeddings](/docs/concepts/#embedding-models)\n",
"- [Vector stores](/docs/concepts/#vector-stores)\n",
"- [Retrieval-augmented generation](/docs/tutorials/rag/)\n",
"- [Tools](/docs/concepts/#tools)\n",
"- [Agents](/docs/concepts/#agents)\n",
"\n",
":::\n",
"\n",
"In many Q&A applications we want to allow the user to have a back-and-forth conversation, meaning the application needs some sort of \"memory\" of past questions and answers, and some logic for incorporating those into its current thinking.\n",
"\n",
"In this guide we focus on **adding logic for incorporating historical messages.** Further details on chat history management is [covered here](/docs/how_to/message_history).\n",
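The central move, rewriting the latest question in light of the chat history before retrieval, can be sketched like this (assuming `llm` and `retriever` are defined as in the base RAG tutorial; the prompt wording is illustrative):

```typescript
import {
  ChatPromptTemplate,
  MessagesPlaceholder,
} from "@langchain/core/prompts";
import { createHistoryAwareRetriever } from "langchain/chains/history_aware_retriever";

const contextualizePrompt = ChatPromptTemplate.fromMessages([
  [
    "system",
    "Rephrase the latest user question as a standalone question, using the chat history for context.",
  ],
  new MessagesPlaceholder("chat_history"),
  ["human", "{input}"],
]);

// Given { chat_history, input }, this first rewrites the question,
// then runs the underlying retriever on the rewritten form
const historyAwareRetriever = await createHistoryAwareRetriever({
  llm,
  retriever,
  rephrasePrompt: contextualizePrompt,
});
```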
15 changes: 12 additions & 3 deletions docs/core_docs/docs/tutorials/query_analysis.ipynb
@@ -5,9 +5,6 @@
"id": "df7d42b9-58a6-434c-a2d7-0b61142f6d3e",
"metadata": {},
"source": [
"---\n",
"sidebar_position: 0\n",
"---\n",
"```{=mdx}\n",
"import CodeBlock from \"@theme/CodeBlock\";\n",
"```"
@@ -20,6 +17,18 @@
"source": [
"# Build a Query Analysis System\n",
"\n",
":::info Prerequisites\n",
"\n",
"This guide assumes familiarity with the following concepts:\n",
"\n",
"- [Document loaders](/docs/concepts/#document-loaders)\n",
"- [Chat models](/docs/concepts/#chat-models)\n",
"- [Embeddings](/docs/concepts/#embedding-models)\n",
"- [Vector stores](/docs/concepts/#vector-stores)\n",
"- [Retrieval](/docs/concepts/#retrieval)\n",
"\n",
":::\n",
"\n",
"This page will show how to use query analysis in a basic end-to-end example. This will cover creating a simple search engine, showing a failure mode that occurs when passing a raw user question to that search, and then an example of how query analysis can help address that issue. There are MANY different query analysis techniques and this end-to-end example will not show all of them.\n",
"\n",
"For the purpose of this example, we will do retrieval over the LangChain YouTube videos."
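The core query analysis step, turning a raw user question into a structured search object, can be sketched as follows (assuming `zod` and `@langchain/openai`; the schema and model are illustrative):

```typescript
import { z } from "zod";
import { ChatOpenAI } from "@langchain/openai";

// A structured representation of a search over video transcripts
const searchSchema = z.object({
  query: z.string().describe("Similarity search query applied to transcripts"),
  publishYear: z.number().optional().describe("Year the video was published"),
});

const llm = new ChatOpenAI({ model: "gpt-4o-mini" });
const queryAnalyzer = llm.withStructuredOutput(searchSchema);

// Searching on the analyzed form (short query plus optional year filter)
// avoids the failure mode of passing the raw question straight to search
const analyzed = await queryAnalyzer.invoke(
  "videos on RAG published in 2023"
);
```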
8 changes: 6 additions & 2 deletions docs/core_docs/docs/tutorials/rag.ipynb
@@ -15,6 +15,9 @@
"LangSmith will become increasingly helpful as our application grows in\n",
"complexity.\n",
"\n",
"If you're already familiar with basic retrieval, you might also be interested in\n",
"this [high-level overview of different retrieval techinques](/docs/concepts/#retrieval).\n",
"\n",
"## What is RAG?\n",
"\n",
"RAG is a technique for augmenting LLM knowledge with additional data.\n",
@@ -35,7 +38,7 @@
"The most common full sequence from raw data to answer looks like:\n",
"\n",
"### Indexing\n",
"1. **Load**: First we need to load our data. This is done with [DocumentLoaders](/docs/concepts/#document-loaders).\n",
"1. **Load**: First we need to load our data. This is done with [Document Loaders](/docs/concepts/#document-loaders).\n",
"2. **Split**: [Text splitters](/docs/concepts/#text-splitters) break large `Documents` into smaller chunks. This is useful both for indexing data and for passing it in to a model, since large chunks are harder to search over and won't fit in a model's finite context window.\n",
"3. **Store**: We need somewhere to store and index our splits, so that they can later be searched over. This is often done using a [VectorStore](/docs/concepts/#vectorstores) and [Embeddings](/docs/concepts/#embedding-models) model.\n",
"\n",
@@ -842,7 +845,8 @@
"\n",
"- [Return sources](/docs/how_to/qa_sources/): Learn how to return source documents\n",
"- [Streaming](/docs/how_to/qa_streaming/): Learn how to stream outputs and intermediate steps\n",
"- [Add chat history](/docs/how_to/qa_chat_history_how_to/): Learn how to add chat history to your app"
"- [Add chat history](/docs/how_to/qa_chat_history_how_to/): Learn how to add chat history to your app\n",
"- [Retrieval conceptual guide](/docs/concepts/#retrieval): A high-level overview of specific retrieval techniques"
]
}
],
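The Load, Split, and Store indexing steps described above map to code roughly as follows (a sketch assuming `cheerio` and `@langchain/openai` are installed; the URL and chunk sizes are illustrative):

```typescript
import { CheerioWebBaseLoader } from "@langchain/community/document_loaders/web/cheerio";
import { RecursiveCharacterTextSplitter } from "langchain/text_splitter";
import { MemoryVectorStore } from "langchain/vectorstores/memory";
import { OpenAIEmbeddings } from "@langchain/openai";

// Load: pull raw text out of a web page
const loader = new CheerioWebBaseLoader("https://example.com/post");
const docs = await loader.load();

// Split: break long documents into smaller, searchable chunks
const splitter = new RecursiveCharacterTextSplitter({
  chunkSize: 1000,
  chunkOverlap: 200,
});
const splits = await splitter.splitDocuments(docs);

// Store: embed the chunks and index them for similarity search
const vectorStore = await MemoryVectorStore.fromDocuments(
  splits,
  new OpenAIEmbeddings()
);
const retriever = vectorStore.asRetriever();
```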
11 changes: 11 additions & 0 deletions docs/core_docs/docs/tutorials/sql_qa.mdx
@@ -1,5 +1,16 @@
# Build a Question/Answering system over SQL data

:::info Prerequisites

This guide assumes familiarity with the following concepts:

- [Chaining runnables](/docs/how_to/sequence/)
- [Chat models](/docs/concepts/#chat-models)
- [Tools](/docs/concepts/#tools)
- [Agents](/docs/concepts/#agents)

:::

In this guide we'll go over the basic ways to create a Q&A chain and agent over a SQL database.
These systems will allow us to ask a question about the data in a SQL database and get back a natural language answer.
The main difference between the two is that our agent can query the database in a loop as many times as it needs to answer the question.
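The chain variant can be sketched as follows (a sketch assuming `typeorm` and a local `Chinook.db` SQLite file, the sample database commonly used in these docs; the model name is illustrative):

```typescript
import { DataSource } from "typeorm";
import { SqlDatabase } from "langchain/sql_db";
import { createSqlQueryChain } from "langchain/chains/sql_db";
import { ChatOpenAI } from "@langchain/openai";

const datasource = new DataSource({
  type: "sqlite",
  database: "Chinook.db",
});
const db = await SqlDatabase.fromDataSourceParams({
  appDataSource: datasource,
});

const llm = new ChatOpenAI({ model: "gpt-4o-mini" });
const chain = await createSqlQueryChain({ llm, db, dialect: "sqlite" });

// Produces a SQL query string for the question, which can then be
// executed against the database and summarized into a natural language answer
const query = await chain.invoke({ question: "How many employees are there?" });
```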
