Description
Privileged issue
- I am a LangChain maintainer, or was asked directly by a LangChain maintainer to create an issue here.
Issue Content
Issue
To make our retriever integrations as easy to use as possible, we need to make sure the docs for them are thorough and standardized. There are two parts to this: updating the retriever docstrings and updating the actual integration docs.
This needs to be done for each retriever integration, ideally with one PR per retriever.
Related to broader issues #21983 and #22005.
Docstrings
Each retriever class docstring should have the sections shown in the Appendix below. The sections should have input and output code blocks when relevant.
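As a rough aid when reviewing a PR, the section check can be sketched in a few lines of standard-library Python. This is only a sketch: the section names are copied from the Appendix template, `missing_sections` is a hypothetical helper (not part of LangChain), and `FooBarRetriever` is a placeholder class used for illustration.

```python
import inspect

# Section headers taken from the Appendix template in this issue.
EXPECTED_SECTIONS = [
    "Setup:",
    "Key init args:",
    "Instantiate:",
    "Usage:",
    "Use within a chain:",
]


def missing_sections(cls):
    """Return the expected section headers absent from the class docstring."""
    doc = inspect.getdoc(cls) or ""
    # Compare against stripped lines so indentation does not matter.
    lines = {line.strip() for line in doc.splitlines()}
    return [section for section in EXPECTED_SECTIONS if section not in lines]


class FooBarRetriever:  # hypothetical class, used only for illustration
    """FooBar retriever.

    Setup:
        Install ``langchain-foo-bar``.

    Instantiate:
        .. code-block:: python

            retriever = FooBarRetriever()
    """


print(missing_sections(FooBarRetriever))
```

For the placeholder class this reports the three sections that still need to be written. It only checks that the headers exist, not that each section has the input and output code blocks.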
To build a preview of the API docs for the package you're working on, run (from the root of the repo):
make api_docs_clean; make api_docs_quick_preview API_PKG=community
where API_PKG= should be the parent directory that houses the edited package (e.g. "community" for langchain-community).
Doc pages
Each retriever docs page should follow this template.
See example here.
You can use the langchain-cli to quickly get started with a new integration docs page (run from root of repo):
poetry run pip install -e libs/cli
poetry run langchain-cli integration create-doc --name "foo-bar" --name-class FooBar --component-type Retriever --destination-dir ./docs/docs/integrations/retrievers/
where --name is the integration package name without the "langchain-" prefix and --name-class is the class name without the "Retriever" suffix. This will create a template doc with some autopopulated fields at docs/docs/integrations/retrievers/foo_bar.ipynb.
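The naming rules for the two flags can be illustrated with a small sketch. `cli_args` is a hypothetical helper, not part of langchain-cli, and the hyphen-to-underscore rule for the notebook file name is an assumption inferred from the foo-bar / foo_bar.ipynb example above. It requires Python 3.9+ for `str.removeprefix` and `str.removesuffix`.

```python
def cli_args(package_name: str, class_name: str) -> dict:
    """Derive langchain-cli arguments from a package name and a retriever class name."""
    # --name: package name without the "langchain-" prefix.
    name = package_name.removeprefix("langchain-")
    # --name-class: class name without the "Retriever" suffix.
    name_class = class_name.removesuffix("Retriever")
    # Assumption: the notebook file name is the --name value with hyphens
    # replaced by underscores (inferred from foo-bar -> foo_bar.ipynb).
    doc_path = f"docs/docs/integrations/retrievers/{name.replace('-', '_')}.ipynb"
    return {"name": name, "name_class": name_class, "doc_path": doc_path}


args = cli_args("langchain-foo-bar", "FooBarRetriever")
print(args)
```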
To build a preview of the docs, you can run (from the root of the repo):
make docs_clean
make docs_build
cd docs/build/output-new
yarn
yarn start
Appendix
Expected sections for the retriever class docstring.
"""__ModuleName__ retriever.

# TODO: Replace with relevant packages, env vars, etc.
Setup:
    Install ``__package_name__`` and set environment variable ``__MODULE_NAME___API_KEY``.

    .. code-block:: bash

        pip install -U __package_name__
        export __MODULE_NAME___API_KEY="your-api-key"

# TODO: Populate with relevant params.
Key init args:
    arg 1: type
        description
    arg 2: type
        description

# TODO: Replace with relevant init params.
Instantiate:
    .. code-block:: python

        from __package_name__ import __ModuleName__Retriever

        retriever = __ModuleName__Retriever(
            # ...
        )

Usage:
    .. code-block:: python

        query = "..."

        retriever.invoke(query)

    .. code-block:: python

        # TODO: Example output.

Use within a chain:
    .. code-block:: python

        from langchain_core.output_parsers import StrOutputParser
        from langchain_core.prompts import ChatPromptTemplate
        from langchain_core.runnables import RunnablePassthrough
        from langchain_openai import ChatOpenAI

        prompt = ChatPromptTemplate.from_template(
            \"\"\"Answer the question based only on the context provided.
        Context: {context}
        Question: {question}\"\"\"
        )

        llm = ChatOpenAI(model="gpt-3.5-turbo-0125")

        def format_docs(docs):
            return "\n\n".join(doc.page_content for doc in docs)

        chain = (
            {"context": retriever | format_docs, "question": RunnablePassthrough()}
            | prompt
            | llm
            | StrOutputParser()
        )

        chain.invoke("...")

    .. code-block:: python

        # TODO: Example output.

"""  # noqa: E501
See example here.
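To make the chain example easier to follow, here is a minimal sketch of what the `format_docs` helper does to the retriever output before it reaches the prompt. `Doc` is a hypothetical stand-in with only a `page_content` field, used so the snippet runs without LangChain installed; real retrievers return LangChain document objects.

```python
from dataclasses import dataclass


@dataclass
class Doc:
    """Hypothetical stand-in for a retrieved document."""

    page_content: str


def format_docs(docs):
    # Join the text of each document, separated by a blank line.
    return "\n\n".join(doc.page_content for doc in docs)


docs = [Doc("first passage"), Doc("second passage")]
context = format_docs(docs)
print(context)
```

The resulting string is what fills the `{context}` placeholder in the prompt template.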