A fast, lightweight and easy-to-use Python library for splitting text into semantically meaningful chunks.
Updated Oct 28, 2025 - Python
Chunk smarter, not harder — built for LLMs, RAG pipelines, and beyond.
An intelligent chatbot that lets users upload text-based Ayurveda PDFs and ask questions about their content, using RAG (Retrieval-Augmented Generation) to combine semantic search with LLM-based responses.
Text splitting example using Tiktoken
Specialized Markdown text splitter - part of the LEDAA project's data ingestion pipeline for RAG.
An exploration of advanced text splitting strategies in LangChain for RAG, from basic character splitting to state-of-the-art semantic chunking.
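For readers new to the topic, the "basic character splitting" end of that spectrum can be sketched in a few lines of plain Python. This is only an illustration and is not taken from any of the repositories listed here: the function name `split_by_characters` and the `chunk_size` and `overlap` parameters are assumptions chosen for the example.

```python
def split_by_characters(text: str, chunk_size: int = 200, overlap: int = 20) -> list[str]:
    """Split text into fixed-size character windows that overlap by `overlap` characters.

    Hypothetical helper for illustration; real splitters usually also respect
    sentence, paragraph, or token boundaries.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")

    step = chunk_size - overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once the current window has reached the end of the text,
        # so no trailing chunk made purely of overlap is emitted.
        if start + chunk_size >= len(text):
            break
    return chunks


if __name__ == "__main__":
    for chunk in split_by_characters("abcdefghij", chunk_size=4, overlap=1):
        print(chunk)
```

The overlap keeps a little shared context between neighbouring chunks, which is the usual motivation in RAG pipelines. Semantic chunkers replace the fixed window with boundaries chosen from the meaning of the text.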