Evaluate LangChain MarkdownHeaderTextSplitter #36

Open
@bdb-dd

Description

A significant number of queries are still affected by the per-document context length limit.

LangChain has addressed this with a header-preserving chunking implementation, MarkdownHeaderTextSplitter:
https://github.com/finnless/langchain/blob/master/libs/langchain/langchain/text_splitter.py#L332

Test with and without a nested header line (e.g. "Frontend v4 > Upgrade notes > Breaking changes") in each chunk.
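
To make the two test variants concrete, here is a minimal standard-library sketch of header-preserving chunking. It is not LangChain's implementation. The function name `split_by_headers` and the `with_breadcrumb` flag are hypothetical, and the sketch assumes ATX-style (`#`) headers and does not skip `#` lines inside fenced code blocks.

```python
import re
from typing import Dict, List, Tuple

# ATX headers only: one to six '#' characters, whitespace, then the title.
HEADER_RE = re.compile(r"^(#{1,6})\s+(.*\S)\s*$")


def split_by_headers(markdown: str, with_breadcrumb: bool = True) -> List[Dict[str, object]]:
    """Split Markdown into one chunk per header section (hypothetical helper).

    Each chunk records the path of enclosing headers. When with_breadcrumb
    is True, that path is also prepended to the chunk text as a single
    "A > B > C" line.
    """
    chunks: List[Dict[str, object]] = []
    stack: List[Tuple[int, str]] = []  # (level, title) of the headers currently open
    body: List[str] = []

    def flush() -> None:
        # Emit the accumulated body under the header path that was active for it.
        text = "\n".join(body).strip()
        if text:
            path = [title for _, title in stack]
            if with_breadcrumb and path:
                text = " > ".join(path) + "\n\n" + text
            chunks.append({"headers": path, "text": text})
        body.clear()

    for line in markdown.splitlines():
        match = HEADER_RE.match(line)
        if match:
            flush()
            level = len(match.group(1))
            # A new header closes any open header at the same or a deeper level.
            while stack and stack[-1][0] >= level:
                stack.pop()
            stack.append((level, match.group(2)))
        else:
            body.append(line)
    flush()
    return chunks
```

Indexing the same documents once with `with_breadcrumb=True` and once with `with_breadcrumb=False` gives the two variants to compare on the affected queries.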

Additional Information

No response
