Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

docs: Update AI21Embeddings Integration docs #25298

Merged
merged 3 commits into from
Aug 14, 2024
Merged
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
228 changes: 180 additions & 48 deletions docs/docs/integrations/text_embedding/ai21.ipynb
Original file line number Diff line number Diff line change
Expand Up @@ -2,116 +2,248 @@
"cells": [
{
"cell_type": "raw",
"id": "c2923bd1",
"id": "afaf8039",
"metadata": {},
"source": [
"---\n",
"sidebar_label: AI21 Labs\n",
"sidebar_label: AI21\n",
"---"
]
},
{
"cell_type": "markdown",
"id": "cc3c6ef6bbd57ce9",
"metadata": {
"collapsed": false
},
"id": "9a3d6f34",
"metadata": {},
"source": [
"# AI21Embeddings\n",
"\n",
"This notebook covers how to get started with AI21 embedding models.\n",
"This will help you get started with AI21 embedding models using LangChain. For detailed documentation on `AI21Embeddings` features and configuration options, please refer to the [API reference](https://api.python.langchain.com/en/latest/embeddings/langchain_ai21.embeddings.AI21Embeddings.html).\n",
"\n",
"## Overview\n",
"### Integration details\n",
"\n",
"import { ItemTable } from \"@theme/FeatureTables\";\n",
"\n",
"<ItemTable category=\"text_embedding\" item=\"AI21\" />\n",
"\n",
"## Installation"
"## Setup\n",
"\n",
"To access AI21 embedding models you'll need to create an AI21 account, get an API key, and install the `langchain-ai21` integration package.\n",
"\n",
"### Credentials\n",
"\n",
"Head to [https://docs.ai21.com/](https://docs.ai21.com/) to sign up to AI21 and generate an API key. Once you've done this set the `AI21_API_KEY` environment variable:"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "4c3bef91",
"execution_count": 2,
"id": "36521c2a",
"metadata": {},
"outputs": [],
"source": [
"import getpass\n",
"import os\n",
"\n",
"if not os.getenv(\"AI21_API_KEY\"):\n",
" os.environ[\"AI21_API_KEY\"] = getpass.getpass(\"Enter your AI21 API key: \")"
]
},
{
"cell_type": "markdown",
"id": "c84fb993",
"metadata": {},
"source": [
"If you want to get automated tracing of your model calls you can also set your [LangSmith](https://docs.smith.langchain.com/) API key by uncommenting below:"
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "39a4953b",
"metadata": {},
"outputs": [],
"source": [
"!pip install -qU langchain-ai21"
"# os.environ[\"LANGCHAIN_TRACING_V2\"] = \"true\"\n",
"# os.environ[\"LANGCHAIN_API_KEY\"] = getpass.getpass(\"Enter your LangSmith API key: \")"
]
},
{
"cell_type": "markdown",
"id": "2b4f3e15",
"id": "d9664366",
"metadata": {},
"source": [
"## Environment Setup\n",
"### Installation\n",
"\n",
"We'll need to get a [AI21 API key](https://docs.ai21.com/) and set the `AI21_API_KEY` environment variable:\n"
"The LangChain AI21 integration lives in the `langchain-ai21` package:"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "62e0dbc3",
"metadata": {
"tags": []
},
"id": "64853226",
"metadata": {},
"outputs": [],
"source": [
"import os\n",
"from getpass import getpass\n",
"\n",
"os.environ[\"AI21_API_KEY\"] = getpass()"
"%pip install -qU langchain-ai21"
]
},
{
"cell_type": "markdown",
"id": "74ef9d8b40a1319e",
"metadata": {
"collapsed": false
},
"id": "45dd1724",
"metadata": {},
"source": [
"## Usage"
"## Instantiation\n",
"\n",
"Now we can instantiate our model object and generate chat completions:"
]
},
{
"cell_type": "code",
"execution_count": 5,
"id": "12fcfb4b",
"execution_count": 7,
"id": "9ea7a09b",
"metadata": {},
"outputs": [],
"source": [
"from langchain_ai21 import AI21Embeddings\n",
"\n",
"embeddings = AI21Embeddings()"
"embeddings = AI21Embeddings(\n",
" # Can optionally increase or decrease the batch_size\n",
" # to improve latency.\n",
" # Use larger batch sizes with smaller documents, and\n",
" # smaller batch sizes with larger documents.\n",
" # batch_size=256,\n",
")"
]
},
{
"cell_type": "markdown",
"id": "77d271b6",
"metadata": {},
"source": [
"## Indexing and Retrieval\n",
"\n",
"Embedding models are often used in retrieval-augmented generation (RAG) flows, both as part of indexing data as well as later retrieving it. For more detailed instructions, please see our RAG tutorials under the [working with external knowledge tutorials](/docs/tutorials/#working-with-external-knowledge).\n",
"\n",
"Below, see how to index and retrieve data using the `embeddings` object we initialized above. In this example, we will index and retrieve a sample document in the `InMemoryVectorStore`."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "1f2e6104",
"execution_count": 8,
"id": "d817716b",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"'LangChain is the framework for building context-aware reasoning applications'"
]
},
"execution_count": 8,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Create a vector store with a sample text\n",
"from langchain_core.vectorstores import InMemoryVectorStore\n",
"\n",
"text = \"LangChain is the framework for building context-aware reasoning applications\"\n",
"\n",
"vectorstore = InMemoryVectorStore.from_texts(\n",
" [text],\n",
" embedding=embeddings,\n",
")\n",
"\n",
"# Use the vectorstore as a retriever\n",
"retriever = vectorstore.as_retriever()\n",
"\n",
"# Retrieve the most similar text\n",
"retrieved_documents = retriever.invoke(\"What is LangChain?\")\n",
"\n",
"# show the retrieved document's content\n",
"retrieved_documents[0].page_content"
]
},
{
"cell_type": "markdown",
"id": "e02b9855",
"metadata": {},
"outputs": [],
"source": [
"embeddings.embed_query(\"My query to look up\")"
"## Direct Usage\n",
"\n",
"Under the hood, the vectorstore and retriever implementations are calling `embeddings.embed_documents(...)` and `embeddings.embed_query(...)` to create embeddings for the text(s) used in `from_texts` and retrieval `invoke` operations, respectively.\n",
"\n",
"You can directly call these methods to get embeddings for your own use cases.\n",
"\n",
"### Embed single texts\n",
"\n",
"You can embed single texts or documents with `embed_query`:"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "a3465d7e63bfb3d1",
"metadata": {
"collapsed": false
},
"outputs": [],
"execution_count": 9,
"id": "0d2befcd",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"[0.01913362182676792, 0.004960147198289633, -0.01582135073840618, -0.042474791407585144, 0.040200788\n"
]
}
],
"source": [
"embeddings.embed_documents(\n",
" [\"This is a content of the document\", \"This is another document\"]\n",
")"
"single_vector = embeddings.embed_query(text)\n",
"print(str(single_vector)[:100]) # Show the first 100 characters of the vector"
]
},
{
"cell_type": "markdown",
"id": "1b5a7d03",
"metadata": {},
"source": [
"### Embed multiple texts\n",
"\n",
"You can embed multiple texts with `embed_documents`:"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "9d60af6d",
"execution_count": 10,
"id": "2f4d6e97",
"metadata": {},
"outputs": [],
"source": []
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"[0.03029559925198555, 0.002908500377088785, -0.02700909972190857, -0.04616579785943031, 0.0382771529\n",
"[0.018214847892522812, 0.011460083536803722, -0.03329407051205635, -0.04951060563325882, 0.032756105\n"
]
}
],
"source": [
"text2 = (\n",
" \"LangGraph is a library for building stateful, multi-actor applications with LLMs\"\n",
")\n",
"two_vectors = embeddings.embed_documents([text, text2])\n",
"for vector in two_vectors:\n",
" print(str(vector)[:100]) # Show the first 100 characters of the vector"
]
},
{
"cell_type": "markdown",
"id": "98785c12",
"metadata": {},
"source": [
"## API Reference\n",
"\n",
"For detailed documentation on `AI21Embeddings` features and configuration options, please refer to the [API reference](https://api.python.langchain.com/en/latest/embeddings/langchain_ai21.embeddings.AI21Embeddings.html).\n"
]
}
],
"metadata": {
Expand All @@ -130,7 +262,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.11.4"
"version": "3.9.6"
}
},
"nbformat": 4,
Expand Down
Loading