DOC: example of using a deprecated package in the "Retriever" tutorial #5743

Closed
2 tasks done
well-balanced opened this issue Jun 12, 2024 · 3 comments · Fixed by #5749
Labels
auto:documentation Changes to documentation and examples, like .md, .rst, .ipynb files. Changes to the docs/ folder

Comments

@well-balanced

Checklist

  • I added a very descriptive title to this issue.
  • I included a link to the documentation page I am referring to (if applicable).

Issue with current documentation:

The Retriever tutorial contains a code example that imports from a deprecated package path (langchain/document_loaders/web/cheerio).

Below is the current version of the document:

import { RecursiveCharacterTextSplitter } from "langchain/text_splitter";
import { CheerioWebBaseLoader } from "langchain/document_loaders/web/cheerio";
import { MemoryVectorStore } from "langchain/vectorstores/memory";
import { OpenAIEmbeddings } from "@langchain/openai";

const loader = new CheerioWebBaseLoader(
  "https://docs.smith.langchain.com/user_guide"
);
const rawDocs = await loader.load();

const splitter = new RecursiveCharacterTextSplitter({
  chunkSize: 1000,
  chunkOverlap: 200,
});
const docs = await splitter.splitDocuments(rawDocs);

const vectorstore = await MemoryVectorStore.fromDocuments(
  docs,
  new OpenAIEmbeddings()
);
const retriever = vectorstore.asRetriever();

const retrieverResult = await retriever.getRelevantDocuments(
  "how to upload a dataset"
);
console.log(retrieverResult[0]);

/*
  Document {
    pageContent: "your application progresses through the beta testing phase, it's essential to continue collecting data to refine and improve its performance. LangSmith enables you to add runs as examples to datasets (from both the project page and within an annotation queue), expanding your test coverage on real-world scenarios. This is a key benefit in having your logging system and your evaluation/testing system in the same platform.Production​Closely inspecting key data points, growing benchmarking datasets, annotating traces, and drilling down into important data in trace view are workflows you’ll also want to do once your app hits production. However, especially at the production stage, it’s crucial to get a high-level overview of application performance with respect to latency, cost, and feedback scores. This ensures that it's delivering desirable results at scale.Monitoring and A/B Testing​LangSmith provides monitoring charts that allow you to track key metrics over time. You can expand to",
    metadata: {
      source: 'https://docs.smith.langchain.com/user_guide',
      loc: { lines: [Object] }
    }
  }
*/

Idea or request for content:

Could the document be updated as follows?

New package path:

As it is now:

import { CheerioWebBaseLoader } from "langchain/document_loaders/web/cheerio";

The updated version should be:

import { CheerioWebBaseLoader } from '@langchain/community/document_loaders/web/cheerio';

New retriever method:

As it is now:

const retrieverResult = await retriever.getRelevantDocuments(
  "how to upload a dataset"
);

The updated version should be:

const retrieverResult = await retriever.invoke('how to upload a dataset');
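
For anyone checking other doc pages for the same two patterns, the replacements requested above can be sketched as a small text rewrite. This is only an illustration: `migrateSnippet` and `IMPORT_REWRITES` are hypothetical names, not part of LangChain, and the sketch assumes the only changes needed are the import path and the `getRelevantDocuments` to `invoke` rename described in this issue.

```typescript
// Hypothetical helper (not a LangChain API): applies the two replacements
// requested in this issue to a string of example source code.
const IMPORT_REWRITES: Array<[string, string]> = [
  [
    "langchain/document_loaders/web/cheerio",
    "@langchain/community/document_loaders/web/cheerio",
  ],
];

function migrateSnippet(source: string): string {
  let out = source;
  for (const [oldPath, newPath] of IMPORT_REWRITES) {
    // Only rewrite the old path when it is a complete quoted module
    // specifier, so already-migrated imports are left untouched.
    for (const quote of ['"', "'"]) {
      out = out
        .split(`${quote}${oldPath}${quote}`)
        .join(`${quote}${newPath}${quote}`);
    }
  }
  // Rename the deprecated retriever call to invoke().
  // Assumption: every .getRelevantDocuments( call in the text is a retriever call.
  out = out.replace(/\.getRelevantDocuments\(/g, ".invoke(");
  return out;
}

// Usage: run the rewrite over the two lines quoted in this issue.
const exampleBefore = [
  'import { CheerioWebBaseLoader } from "langchain/document_loaders/web/cheerio";',
  'const retrieverResult = await retriever.getRelevantDocuments("how to upload a dataset");',
].join("\n");
console.log(migrateSnippet(exampleBefore));
```

Because the rename is purely textual, the result should still be reviewed by hand before it goes into a PR.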
@dosubot dosubot bot added the auto:documentation Changes to documentation and examples, like .md, .rst, .ipynb files. Changes to the docs/ folder label Jun 12, 2024

dosubot bot commented Jun 12, 2024

Hello, @well-balanced! I'm here to assist you with any issues you may have. If you need help with bugs, have questions, or want to become a contributor, feel free to ask!

The "Retriever" tutorial documentation can indeed be updated to reflect the correct package path and method. Here is the revised version of the document:

Updated Code Example:

import { RecursiveCharacterTextSplitter } from "langchain/text_splitter";
import { CheerioWebBaseLoader } from "@langchain/community/document_loaders/web/cheerio";
import { MemoryVectorStore } from "langchain/vectorstores/memory";
import { OpenAIEmbeddings } from "@langchain/openai";

const loader = new CheerioWebBaseLoader(
  "https://docs.smith.langchain.com/user_guide"
);
const rawDocs = await loader.load();

const splitter = new RecursiveCharacterTextSplitter({
  chunkSize: 1000,
  chunkOverlap: 200,
});
const docs = await splitter.splitDocuments(rawDocs);

const vectorstore = await MemoryVectorStore.fromDocuments(
  docs,
  new OpenAIEmbeddings()
);
const retriever = vectorstore.asRetriever();

const retrieverResult = await retriever.invoke('how to upload a dataset');
console.log(retrieverResult[0]);

/*
  Document {
    pageContent: "your application progresses through the beta testing phase, it's essential to continue collecting data to refine and improve its performance. LangSmith enables you to add runs as examples to datasets (from both the project page and within an annotation queue), expanding your test coverage on real-world scenarios. This is a key benefit in having your logging system and your evaluation/testing system in the same platform.Production​Closely inspecting key data points, growing benchmarking datasets, annotating traces, and drilling down into important data in trace view are workflows you’ll also want to do once your app hits production. However, especially at the production stage, it’s crucial to get a high-level overview of application performance with respect to latency, cost, and feedback scores. This ensures that it's delivering desirable results at scale.Monitoring and A/B Testing​LangSmith provides monitoring charts that allow you to track key metrics over time. You can expand to",
    metadata: {
      source: 'https://docs.smith.langchain.com/user_guide',
      loc: { lines: [Object] }
    }
  }
*/

This update ensures the correct package path and method are used, aligning with the latest changes in the LangChain JS repository [1][2].

To continue talking to Dosu, mention @dosu.

@well-balanced
Author

@dosu thanks for the updated code example! Can I go ahead and open a PR to update the documentation with this revised version?


dosubot bot commented Jun 12, 2024

Absolutely, feel free to open a PR with the updated documentation. Your contribution will be very helpful! If you need any further assistance, don't hesitate to ask.

