Standardize retriever integration docs #24908

Closed as not planned
@ccurme

Privileged issue

  • I am a LangChain maintainer, or was asked directly by a LangChain maintainer to create an issue here.

Issue Content

Issue

To make our retriever integrations as easy to use as possible, we need to make sure their docs are thorough and standardized. There are two parts to this: updating the retriever docstrings and updating the integration docs pages.

This needs to be done for each retriever integration, ideally with one PR per retriever.

Related to broader issues #21983 and #22005.

Docstrings

Each retriever class docstring should have the sections shown in the Appendix below. The sections should have input and output code blocks when relevant.

To build a preview of the API docs for the package you're working on, run (from the root of the repo):

make api_docs_clean; make api_docs_quick_preview API_PKG=community

where API_PKG should be set to the parent directory that houses the edited package (e.g. "community" for langchain-community).

Doc pages

Each retriever docs page should follow this template.

See example here.

You can use the langchain-cli to quickly get started with a new integration docs page (run from root of repo):

poetry run pip install -e libs/cli
poetry run langchain-cli integration create-doc --name "foo-bar" --name-class FooBar --component-type Retriever --destination-dir ./docs/docs/integrations/retrievers/

where --name is the integration package name without the "langchain-" prefix and --name-class is the class name without the "Retriever" suffix. This will create a template doc with some autopopulated fields at docs/docs/integrations/retrievers/foo_bar.ipynb.
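As a rough illustration of the naming rule above, the sketch below derives the two CLI arguments and the expected notebook path from a package name and a class name. The helper `cli_args` is hypothetical and not part of langchain-cli; the hyphen-to-underscore conversion in the file name is inferred only from the single "foo-bar" to foo_bar.ipynb example in this issue.

```python
def cli_args(package_name: str, class_name: str) -> dict:
    """Derive --name, --name-class, and the expected doc path (hypothetical helper)."""
    # --name: package name without the "langchain-" prefix
    name = package_name.removeprefix("langchain-")
    # --name-class: class name without the "Retriever" suffix
    name_class = class_name.removesuffix("Retriever")
    # Assumption: the CLI replaces hyphens with underscores in the file name,
    # as in the "foo-bar" -> foo_bar.ipynb example above.
    doc_path = f"docs/docs/integrations/retrievers/{name.replace('-', '_')}.ipynb"
    return {"name": name, "name_class": name_class, "doc_path": doc_path}


args = cli_args("langchain-foo-bar", "FooBarRetriever")
print(args["name"], args["name_class"], args["doc_path"])
```

This requires Python 3.9 or later for `str.removeprefix` and `str.removesuffix`.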

To build a preview of the docs you can run (from root):

make docs_clean
make docs_build
cd docs/build/output-new
yarn
yarn start

Appendix

Expected sections for the retriever class docstring.

    """__ModuleName__ retriever.

    # TODO: Replace with relevant packages, env vars, etc.
    Setup:
        Install ``__package_name__`` and set environment variable ``__MODULE_NAME___API_KEY``.

        .. code-block:: bash

            pip install -U __package_name__
            export __MODULE_NAME___API_KEY="your-api-key"

    # TODO: Populate with relevant params.
    Key init args:
        arg 1: type
            description
        arg 2: type
            description

    # TODO: Replace with relevant init params.
    Instantiate:
        .. code-block:: python

            from __package_name__ import __ModuleName__Retriever

            retriever = __ModuleName__Retriever(
                # ...
            )

    Usage:
        .. code-block:: python

            query = "..."

            retriever.invoke(query)

        .. code-block:: python

            # TODO: Example output.

    Use within a chain:
        .. code-block:: python

            from langchain_core.output_parsers import StrOutputParser
            from langchain_core.prompts import ChatPromptTemplate
            from langchain_core.runnables import RunnablePassthrough
            from langchain_openai import ChatOpenAI

            prompt = ChatPromptTemplate.from_template(
                \"\"\"Answer the question based only on the context provided.

            Context: {context}

            Question: {question}\"\"\"
            )

            llm = ChatOpenAI(model="gpt-3.5-turbo-0125")

            def format_docs(docs):
                return "\n\n".join(doc.page_content for doc in docs)

            chain = (
                {"context": retriever | format_docs, "question": RunnablePassthrough()}
                | prompt
                | llm
                | StrOutputParser()
            )

            chain.invoke("...")

        .. code-block:: python

            # TODO: Example output.

    """  # noqa: E501

See example here.
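When reviewing a PR, one quick way to confirm a docstring covers the sections in the template above is a small check like the following. This is only a sketch: `missing_sections` and `DemoRetriever` are hypothetical names used for illustration, not part of LangChain, and the check looks only for the section headers, not for their content.

```python
import inspect

# Section headers taken from the docstring template in this Appendix.
EXPECTED_SECTIONS = (
    "Setup:",
    "Key init args:",
    "Instantiate:",
    "Usage:",
    "Use within a chain:",
)


def missing_sections(obj) -> list:
    """Return the expected section headers absent from obj's docstring."""
    doc = inspect.getdoc(obj) or ""
    # Compare against whole stripped lines so "Usage:" inside prose does not count.
    lines = {line.strip() for line in doc.splitlines()}
    return [section for section in EXPECTED_SECTIONS if section not in lines]


class DemoRetriever:
    """Demo retriever (hypothetical, deliberately incomplete).

    Setup:
        Nothing to install.

    Usage:
        Call ``invoke`` with a query string.
    """


print(missing_sections(DemoRetriever))
```

For a real integration, the same function could be pointed at the imported retriever class instead of the demo class.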

Metadata

Labels: help wanted, integration-docs, Ɑ: retriever, 🤖:docs