Commit

Update with a more concrete example
SkSirius committed Nov 12, 2024
1 parent 09d0e81 commit 4e3d9d5
Showing 2 changed files with 77 additions and 121 deletions.
6 changes: 4 additions & 2 deletions docs/core_docs/.gitignore
Expand Up @@ -218,14 +218,14 @@ docs/how_to/agent_executor.md
docs/how_to/agent_executor.mdx
docs/concepts/t.md
docs/concepts/t.mdx
docs/troubleshooting/errors/INVALID_TOOL_RESULTS.md
docs/troubleshooting/errors/INVALID_TOOL_RESULTS.mdx
docs/versions/migrating_memory/conversation_summary_memory.md
docs/versions/migrating_memory/conversation_summary_memory.mdx
docs/versions/migrating_memory/conversation_buffer_window_memory.md
docs/versions/migrating_memory/conversation_buffer_window_memory.mdx
docs/versions/migrating_memory/chat_history.md
docs/versions/migrating_memory/chat_history.mdx
docs/troubleshooting/errors/INVALID_TOOL_RESULTS.md
docs/troubleshooting/errors/INVALID_TOOL_RESULTS.mdx
docs/integrations/vectorstores/weaviate.md
docs/integrations/vectorstores/weaviate.mdx
docs/integrations/vectorstores/upstash.md
Expand Down Expand Up @@ -328,6 +328,8 @@ docs/integrations/llms/azure.md
docs/integrations/llms/azure.mdx
docs/integrations/llms/arcjet.md
docs/integrations/llms/arcjet.mdx
docs/integrations/chat/xai.md
docs/integrations/chat/xai.mdx
docs/integrations/chat/togetherai.md
docs/integrations/chat/togetherai.mdx
docs/integrations/chat/openai.md
192 changes: 73 additions & 119 deletions docs/core_docs/docs/tutorials/vectorstores_retrievers.mdx
Expand Up @@ -66,44 +66,33 @@ It has two attributes:
The `metadata` attribute can capture information about the source of the document, its relationship to other documents, and other information.
Note that an individual `Document` object often represents a chunk of a larger document.

Let's generate some sample documents:
To illustrate, we'll fetch product data from the Fake Store API and format each product as a `Document`:

```typescript
import { Document } from "@langchain/core/documents";

export default [
new Document({
pageContent:
"Dogs are great companions, known for their loyalty and friendliness.",
metadata: { source: "mammal-pets-doc" },
}),
new Document({
pageContent: "Cats are independent pets that often enjoy their own space.",
metadata: { source: "mammal-pets-doc" },
}),
new Document({
pageContent:
"Goldfish are popular pets for beginners, requiring relatively simple care.",
metadata: { source: "fish-pets-doc" },
}),
new Document({
pageContent:
"Parrots are intelligent birds capable of mimicking human speech.",
metadata: { source: "bird-pets-doc" },
}),
new Document({
pageContent:
"Rabbits are social animals that need plenty of space to hop around.",
metadata: { source: "mammal-pets-doc" },
}),
];
const fetchProducts = async () => {
const response = await fetch("https://fakestoreapi.com/products");
const data = await response.json();
return data.map((item: any) => {
return new Document({
pageContent: item.description,
metadata: {
title: item.title,
productId: item.id,
category: item.category,
rating: item.rating.rate,
},
});
});
};
```

**API Reference:** [Document](https://v03.api.js.langchain.com/classes/_langchain_core.documents.Document.html)

---

Here we've generated five documents, containing metadata indicating three distinct "sources".
Here, we've generated documents for products, adding metadata like title, product ID, category, and rating.
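
The mapping step in `fetchProducts` can also be written with an explicit type instead of `any`. The following is a minimal sketch, not the commit's code: the `Product` interface is an assumption inferred from the fields the snippet above reads (`id`, `title`, `description`, `category`, `rating.rate`), and `DocumentFields` and `toDocumentFields` are hypothetical names. The returned plain object has the same shape as the argument passed to `new Document(...)`.

```typescript
// Assumed shape of one product record, inferred from the fields used above.
interface Product {
  id: number;
  title: string;
  description: string;
  category: string;
  rating: { rate: number };
}

// Plain-object equivalent of the fields passed to `new Document(...)`.
interface DocumentFields {
  pageContent: string;
  metadata: {
    title: string;
    productId: number;
    category: string;
    rating: number;
  };
}

// Hypothetical helper: converts one product record into document fields.
function toDocumentFields(item: Product): DocumentFields {
  return {
    pageContent: item.description,
    metadata: {
      title: item.title,
      productId: item.id,
      category: item.category,
      rating: item.rating.rate,
    },
  };
}

// Made-up sample record, used only to exercise the helper.
const sample: Product = {
  id: 1,
  title: "Example drive",
  description: "A compact drive with fast transfer speeds.",
  category: "electronics",
  rating: { rate: 4.5 },
};

const fields = toDocumentFields(sample);
console.log(fields.metadata.productId, fields.pageContent);
```

Typing the record up front means a renamed or missing field is reported at compile time rather than surfacing as `undefined` metadata in search results.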

## Vector stores

Expand All @@ -122,16 +111,20 @@ To instantiate a vector store, we often need to provide an [embedding](../../doc
```typescript
import { OpenAIEmbeddings } from "@langchain/openai";
import { Chroma } from "@langchain/community/vectorstores/chroma";
import documents from "../data/documents";

const chromaConfig = {
url: "https://localhost:8000"
collectionName: "pets_collection",
};

const embeddings = new OpenAIEmbeddings()

const chroma = await Chroma.fromDocuments(documents, embeddings, chromaConfig);
const chromaConfig = {
collectionName: "product_reviews",
};

const run = async () => {
const documents = await fetchProducts();
const embeddings = new OpenAIEmbeddings();
const chroma = await Chroma.fromDocuments(
documents,
embeddings,
chromaConfig
);
};
```

**API Reference:** [OpenAIEmbeddings](https://v03.api.js.langchain.com/classes/_langchain_openai.OpenAIEmbeddings.html)
Expand All @@ -155,34 +148,26 @@ The methods will generally include a list of [Document](https://v03.api.js.langc
Return documents based on similarity to a string query:

```typescript
await chroma.similaritySearch("cat");
await chroma.similaritySearch("speed and durability");
```

```json
[
{
"pageContent": "Cats are independent pets that often enjoy their own space.",
"metadata": { "source": "mammal-pets-doc" }
},
{
"pageContent": "Dogs are great companions, known for their loyalty and friendliness.",
"metadata": { "source": "mammal-pets-doc" }
},
{
"pageContent": "Rabbits are social animals that need plenty of space to hop around.",
"metadata": { "source": "mammal-pets-doc" }
"pageContent": "3D NAND flash are applied to deliver high transfer speeds Remarkable transfer speeds that enable faster bootup and improved overall system performance. The advanced SLC Cache Technology allows performance boost and longer lifespan 7mm slim design suitable for Ultrabooks and Ultra-slim notebooks. Supports TRIM command, Garbage Collection technology, RAID, and ECC (Error Checking & Correction) to provide the optimized performance and enhanced reliability.",
"metadata": { "category": "electronics", "productId": 11, "rating": 4.8 }
},
{
"pageContent": "Parrots are intelligent birds capable of mimicking human speech.",
"metadata": { "source": "bird-pets-doc" }
"pageContent": "Expand your PS4 gaming experience, Play anywhere Fast and easy, setup Sleek design with high capacity, 3-year manufacturer's limited warranty",
"metadata": { "category": "electronics", "productId": 12, "rating": 4.8 }
}
]
```

Return scores:

```typescript
await chroma.similaritySearchWithScore("cat");
await chroma.similaritySearchWithScore("speed and durability");
```

```json
Expand All @@ -193,80 +178,37 @@ await chroma.similaritySearchWithScore("cat");
[
[
{
"pageContent": "Cats are independent pets that often enjoy their own space.",
"metadata": { "source": "mammal-pets-doc" }
},
0.3749317423763067
],
[
{
"pageContent": "Dogs are great companions, known for their loyalty and friendliness.",
"metadata": { "source": "mammal-pets-doc" }
},
0.483024756734972
],
[
{
"pageContent": "Rabbits are social animals that need plenty of space to hop around.",
"metadata": { "source": "mammal-pets-doc" }
"pageContent": "3D NAND flash are applied to deliver high transfer speeds Remarkable transfer speeds that enable faster bootup and improved overall system performance. The advanced SLC Cache Technology allows performance boost and longer lifespan 7mm slim design suitable for Ultrabooks and Ultra-slim notebooks. Supports TRIM command, Garbage Collection technology, RAID, and ECC (Error Checking & Correction) to provide the optimized performance and enhanced reliability.",
"metadata": { "category": "electronics", "productId": 11, "rating": 4.8 }
},
0.4958319828348823
0.3975359938454452
],
[
{
"pageContent": "Parrots are intelligent birds capable of mimicking human speech.",
"metadata": { "source": "bird-pets-doc" }
"pageContent": "Expand your PS4 gaming experience, Play anywhere Fast and easy, setup Sleek design with high capacity, 3-year manufacturer's limited warranty",
"metadata": { "category": "electronics", "productId": 12, "rating": 4.8 }
},
0.497523653562735
0.3983125833760192
]
]
```
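
To make the scores easier to read, here is a small self-contained sketch of a scored search over an in-memory list of vectors. It assumes the score is a cosine distance, where lower means more similar. The scores above increase down the list, which fits that reading, but the source does not state which metric the store uses. `cosineDistance`, `searchWithScore`, and the toy vectors are hypothetical and are not part of the Chroma API.

```typescript
// Cosine distance between two equal-length vectors: 1 - cosine similarity.
function cosineDistance(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error("Vectors must have the same length");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) {
    throw new Error("Cosine distance is undefined for a zero vector");
  }
  return 1 - dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

interface StoredVector {
  id: string;
  vector: number[];
}

// Brute-force search: score every entry, sort by ascending distance, keep k.
function searchWithScore(
  query: number[],
  store: StoredVector[],
  k: number
): [string, number][] {
  return store
    .map((entry): [string, number] => [
      entry.id,
      cosineDistance(query, entry.vector),
    ])
    .sort((x, y) => x[1] - y[1])
    .slice(0, k);
}

// Toy two-dimensional "embeddings", made up for illustration.
const toyStore: StoredVector[] = [
  { id: "ssd", vector: [1, 0] },
  { id: "jacket", vector: [0, 1] },
  { id: "hard-drive", vector: [1, 1] },
];

const ranked = searchWithScore([1, 0], toyStore, 2);
console.log(ranked);
```

A real vector store typically replaces the brute-force scan with an index, but under this assumption the ordering rule is the same: the closest vectors come first.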

Return documents based on similarity to an embedded query:

```typescript
const embedding = await new OpenAIEmbeddings().embedQuery("cat");
const embedding = await new OpenAIEmbeddings().embedQuery("durable drive");
const k = 1; // number of similar vectors to return

await chroma.similaritySearchVectorWithScore(embedding, k);
```

```json
[
[
{
"pageContent": "Cats are independent pets that often enjoy their own space.",
"metadata": { "source": "mammal-pets-doc" }
},
0.3749317423763067
],
[
{
"pageContent": "Dogs are great companions, known for their loyalty and friendliness.",
"metadata": { "source": "mammal-pets-doc" }
},
0.483024756734972
],
[
{
"pageContent": "Rabbits are social animals that need plenty of space to hop around.",
"metadata": { "source": "mammal-pets-doc" }
"pageContent": "3D NAND flash are applied to deliver high transfer speeds Remarkable transfer speeds that enable faster bootup and improved overall system performance. The advanced SLC Cache Technology allows performance boost and longer lifespan 7mm slim design suitable for Ultrabooks and Ultra-slim notebooks. Supports TRIM command, Garbage Collection technology, RAID, and ECC (Error Checking & Correction) to provide the optimized performance and enhanced reliability.",
"metadata": { "category": "electronics", "productId": 11, "rating": 4.8 }
},
0.4958319828348823
],
[
{
"pageContent": "Parrots are intelligent birds capable of mimicking human speech.",
"metadata": { "source": "bird-pets-doc" }
},
0.497523653562735
],
[
{
"pageContent": "Goldfish are popular pets for beginners, requiring relatively simple care.",
"metadata": { "source": "fish-pets-doc" }
},
0.5030592849950544
0.41179141608409436
]
]
```
Expand All @@ -292,7 +234,7 @@ const retriever = RunnableLambda.from(({ query, k }) =>
chroma.similaritySearch(query, k)
);

await retriever.invoke({ query: "cat", k: 1 });
await retriever.invoke({ query: "durable drive", k: 1 });
```

**API Reference:** [RunnableLambda](https://v03.api.js.langchain.com/classes/_langchain_core.runnables.RunnableLambda.html)
Expand All @@ -301,10 +243,20 @@ await retriever.invoke({ query: "cat", k: 1 });

```json
[
{
"pageContent": "Cats are independent pets that often enjoy their own space.",
"metadata": { "source": "mammal-pets-doc" }
}
[
{
"pageContent": "3D NAND flash are applied to deliver high transfer speeds Remarkable transfer speeds that enable faster bootup and improved overall system performance. The advanced SLC Cache Technology allows performance boost and longer lifespan 7mm slim design suitable for Ultrabooks and Ultra-slim notebooks. Supports TRIM command, Garbage Collection technology, RAID, and ECC (Error Checking & Correction) to provide the optimized performance and enhanced reliability.",
"metadata": { "category": "electronics", "productId": 11, "rating": 4.8 }
},
0.4118778959137068
],
[
{
"pageContent": "Expand your PS4 gaming experience, Play anywhere Fast and easy, setup Sleek design with high capacity, 3-year manufacturer's limited warranty",
"metadata": { "category": "electronics", "productId": 12, "rating": 4.8 }
},
0.41868317664456617
]
]
```

Expand All @@ -314,14 +266,14 @@ For instance, we can replicate the above with the following:

```typescript
const retriever = chroma.asRetriever({ searchType: "similarity", k: 1 });
await retriever.invoke("cat");
await retriever.invoke("durable laptop");
```

```json
[
{
"pageContent": "Cats are independent pets that often enjoy their own space.",
"metadata": { "source": "mammal-pets-doc" }
"pageContent": "3D NAND flash are applied to deliver high transfer speeds Remarkable transfer speeds that enable faster bootup and improved overall system performance. The advanced SLC Cache Technology allows performance boost and longer lifespan 7mm slim design suitable for Ultrabooks and Ultra-slim notebooks. Supports TRIM command, Garbage Collection technology, RAID, and ECC (Error Checking & Correction) to provide the optimized performance and enhanced reliability.",
"metadata": { "category": "electronics", "productId": 11, "rating": 4.8 }
}
]
```
Expand All @@ -337,10 +289,11 @@ import ChatModelTabs from "@theme/ChatModelTabs";
```

```typescript
import { ChatOpenAI } from "@langchain/openai";
import { ChatPromptTemplate } from "@langchain/core/prompts";
import {
RunnablePassthrough,
RunnableSequence,
RunnablePassthrough,
} from "@langchain/core/runnables";
import { StringOutputParser } from "@langchain/core/output_parsers";

Expand All @@ -350,17 +303,18 @@ const chatPrompt = ChatPromptTemplate.fromTemplate(`
Context: {context}
`);

const retriever = chroma.asRetriever();

const ragChain = RunnableSequence.from([
{
context: retriever,
question: new RunnablePassthrough(),
},
{ context: retriever, question: new RunnablePassthrough() },
chatPrompt,
llm,
new StringOutputParser(),
]);

const result = await ragChain.invoke("tell me about cats");
const result = await ragChain.invoke(
"I'm looking for a drive with remarkable transfer speeds that enable faster bootup."
);
console.log(result);
```

Expand All @@ -369,7 +323,7 @@ console.log(result);
---

```
Cats are independent pets that often enjoy their own space.
Based on the provided context, the drive you are looking for has read/write speeds of up to 535MB/s/450MB/s, which will enable faster bootup, shutdown, application load, and response compared to a 5400 RPM SATA 2.5" hard drive.
```
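
Conceptually, the chain above does three things: it retrieves documents for the question, fills the prompt with the question and the retrieved context, and passes the result to the model. The sketch below mirrors that flow with plain functions so it runs without any dependencies. It is an illustration under assumptions, not LangChain's implementation: `Doc`, `Retriever`, `Model`, `buildPrompt`, `runRag`, the prompt wording, and both stubs are hypothetical.

```typescript
interface Doc {
  pageContent: string;
  metadata: Record<string, unknown>;
}

// Hypothetical stand-ins for the retriever and the chat model.
type Retriever = (query: string) => Promise<Doc[]>;
type Model = (prompt: string) => Promise<string>;

// Join the retrieved documents into one context string and fill the prompt.
function buildPrompt(question: string, docs: Doc[]): string {
  const context = docs.map((doc) => doc.pageContent).join("\n\n");
  return `Question: ${question}\n\nContext: ${context}`;
}

// Retrieve, format, then call the model: the same order as the chain.
async function runRag(
  question: string,
  retriever: Retriever,
  model: Model
): Promise<string> {
  const docs = await retriever(question);
  return model(buildPrompt(question, docs));
}

// Stub retriever that always returns one made-up document.
const stubRetriever: Retriever = async () => [
  {
    pageContent: "Drive A offers fast transfer speeds.",
    metadata: { productId: 1 },
  },
];

// Stub model that echoes its prompt, so the assembled input is visible.
const echoModel: Model = async (prompt) => prompt;

runRag("Which drive is fast?", stubRetriever, echoModel).then((answer) =>
  console.log(answer)
);
```

Replacing `echoModel` with a real model call shows why the answer quality depends on retrieval: the model only sees the context that `buildPrompt` puts in front of it.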

## Learn more: