Standardize retriever integration docs #24908

Open
1 task done
ccurme opened this issue Jul 31, 2024 · 1 comment
Labels
🤖:docs Changes to documentation and examples, like .md, .rst, .ipynb files. Changes to the docs/ folder help wanted Good issue for contributors integration-docs Ɑ: retriever Related to retriever module


ccurme commented Jul 31, 2024

Privileged issue

  • I am a LangChain maintainer, or was asked directly by a LangChain maintainer to create an issue here.

Issue Content

Issue

To make our retriever integrations as easy to use as possible, we need to make sure their docs are thorough and standardized. There are two parts to this: updating the retriever docstrings and updating the integration doc pages.

This needs to be done for each retriever integration, ideally with one PR per retriever.

Related to broader issues #21983 and #22005.

Docstrings

Each retriever class docstring should have the sections shown in the Appendix below. The sections should have input and output code blocks when relevant.
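As a rough aid when reviewing a docstring against the template, the sketch below checks for the section headers listed in the Appendix. It is only an illustration, not part of the LangChain tooling: `missing_sections`, `FooBarRetriever`, and the `langchain-foo-bar` package name are hypothetical, and the check only looks for header lines, not for the quality of what is under them.

```python
import re

# Section headers taken from the docstring template in the Appendix.
EXPECTED_SECTIONS = [
    "Setup",
    "Key init args",
    "Instantiate",
    "Usage",
    "Use within a chain",
]


def missing_sections(docstring: str) -> list[str]:
    """Return the expected section headers that do not appear in ``docstring``."""
    missing = []
    for section in EXPECTED_SECTIONS:
        # A header is the section name followed by a colon, alone on its line.
        pattern = rf"^\s*{re.escape(section)}:\s*$"
        if not re.search(pattern, docstring, flags=re.MULTILINE):
            missing.append(section)
    return missing


class FooBarRetriever:  # hypothetical class, for illustration only
    """FooBar retriever.

    Setup:
        Install ``langchain-foo-bar``.

    Instantiate:
        .. code-block:: python

            retriever = FooBarRetriever()
    """


# Lists the template sections this incomplete docstring still lacks.
print(missing_sections(FooBarRetriever.__doc__))
```

Running something like this over a class before opening a PR gives a quick list of sections still to write.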

To build a preview of the API docs for the package you're working on, run (from the root of the repo):

make api_docs_clean; make api_docs_quick_preview API_PKG=community

where API_PKG= should be the parent directory that houses the edited package (e.g. "community" for langchain-community).

Doc pages

Each retriever docs page should follow this template.

See example here.

You can use the langchain-cli to quickly get started with a new integration docs page (run from root of repo):

poetry run pip install -e libs/cli
poetry run langchain-cli integration create-doc --name "foo-bar" --name-class FooBar --component-type Retriever --destination-dir ./docs/docs/integrations/retrievers/

where --name is the integration package name without the "langchain-" prefix and --name-class is the class name without the "Retriever" postfix. This will create a template doc with some autopopulated fields at docs/docs/integrations/retrievers/foo_bar.ipynb.
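The naming rules above can be sketched as a small helper that derives the `--name` and `--name-class` values from a package name and a class name. This is an illustration only: `create_doc_command` is a hypothetical function, and the hyphen-to-underscore notebook filename is an assumption inferred from the `foo-bar` to `foo_bar.ipynb` example rather than from the CLI source.

```python
import posixpath
import shlex

DEFAULT_DESTINATION = "./docs/docs/integrations/retrievers/"


def create_doc_command(package_name, class_name, destination_dir=DEFAULT_DESTINATION):
    """Build the create-doc command and the notebook path it is expected to produce."""
    # --name: the package name without the "langchain-" prefix.
    name = package_name.removeprefix("langchain-")
    # --name-class: the class name without the "Retriever" postfix.
    name_class = class_name.removesuffix("Retriever")

    command = shlex.join([
        "poetry", "run", "langchain-cli", "integration", "create-doc",
        "--name", name,
        "--name-class", name_class,
        "--component-type", "Retriever",
        "--destination-dir", destination_dir,
    ])

    # Assumption: hyphens in the name become underscores in the notebook filename.
    notebook = posixpath.normpath(
        posixpath.join(destination_dir, name.replace("-", "_") + ".ipynb")
    )
    return command, notebook


command, notebook = create_doc_command("langchain-foo-bar", "FooBarRetriever")
print(command)
print(notebook)
```

The printed command should match the one shown above apart from the quoting of `foo-bar`, which the shell treats identically.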

To build a preview of the docs you can run (from root):

make docs_clean
make docs_build
cd docs/build/output-new
yarn
yarn start

Appendix

Expected sections for the retriever class docstring.

    """__ModuleName__ retriever.

    # TODO: Replace with relevant packages, env vars, etc.
    Setup:
        Install ``__package_name__`` and set environment variable ``__MODULE_NAME___API_KEY``.

        .. code-block:: bash

            pip install -U __package_name__
            export __MODULE_NAME___API_KEY="your-api-key"

    # TODO: Populate with relevant params.
    Key init args:
        arg 1: type
            description
        arg 2: type
            description

    # TODO: Replace with relevant init params.
    Instantiate:
        .. code-block:: python

            from __package_name__ import __ModuleName__Retriever

            retriever = __ModuleName__Retriever(
                # ...
            )

    Usage:
        .. code-block:: python

            query = "..."

            retriever.invoke(query)

        .. code-block:: python

            # TODO: Example output.

    Use within a chain:
        .. code-block:: python

            from langchain_core.output_parsers import StrOutputParser
            from langchain_core.prompts import ChatPromptTemplate
            from langchain_core.runnables import RunnablePassthrough
            from langchain_openai import ChatOpenAI

            prompt = ChatPromptTemplate.from_template(
                \"\"\"Answer the question based only on the context provided.

            Context: {context}

            Question: {question}\"\"\"
            )

            llm = ChatOpenAI(model="gpt-3.5-turbo-0125")

            def format_docs(docs):
                return "\n\n".join(doc.page_content for doc in docs)

            chain = (
                {"context": retriever | format_docs, "question": RunnablePassthrough()}
                | prompt
                | llm
                | StrOutputParser()
            )

            chain.invoke("...")

        .. code-block:: python

            # TODO: Example output.

    """  # noqa: E501

See example here.

@ccurme ccurme added the help wanted Good issue for contributors label Jul 31, 2024
@dosubot dosubot bot added Ɑ: retriever Related to retriever module 🤖:docs Changes to documentation and examples, like .md, .rst, .ipynb files. Changes to the docs/ folder labels Jul 31, 2024
ccurme added a commit that referenced this issue Aug 2, 2024
Dear LangChain maintainers,

I added the Wikipedia integration docs based on the [web
docs](https://python.langchain.com/v0.2/docs/integrations/retrievers/wikipedia/),
following the format of the [Tavily
example](https://github.com/langchain-ai/langchain/blob/master/docs/docs/integrations/retrievers/tavily.ipynb)
and the [retriever
template](https://github.com/langchain-ai/langchain/blob/master/libs/cli/langchain_cli/integration_template/docs/retrievers.ipynb).
This is my first time contributing to a large repo. Please let me know if I'm
doing anything wrong. Thank you!

Topic related: #24908

---------

Co-authored-by: Chester Curme <chester.curme@gmail.com>
olgamurraft pushed a commit to olgamurraft/langchain that referenced this issue Aug 16, 2024
@ccurme ccurme added investigate Flagged for investigation. integration-docs and removed investigate Flagged for investigation. labels Aug 16, 2024
@ENUMERA8OR

@ccurme I would like to contribute to this issue.

cjumel added a commit to LinkupPlatform/langchain-linkup that referenced this issue Nov 27, 2024