Production-ready API service for document layout analysis, OCR, and semantic chunking.
Convert PDFs, PPTs, Word docs & images into RAG/LLM-ready chunks.
Layout Analysis | OCR + Bounding Boxes | Structured HTML and Markdown | VLM Processing Controls
Try it out! · Report Bug · Contact
- Table of Contents
- (Super) Quick Start
- Documentation
- Self-Hosted Deployment Options
- Licensing
- Connect With Us
- Go to chunkr.ai
- Make an account and copy your API key
- Install our Python SDK:
pip install chunkr-ai
- Use the SDK to process your documents:
from chunkr_ai import Chunkr

# Initialize with your API key from chunkr.ai
chunkr = Chunkr(api_key="your_api_key")

# Upload a document (URL or local file path)
url = "https://chunkr-web.s3.us-east-1.amazonaws.com/landing_page/input/science.pdf"
task = chunkr.upload(url)

# Export results in various formats
task.html(output_file="output.html")
task.markdown(output_file="output.md")
task.content(output_file="output.txt")
task.json(output_file="output.json")

# Clean up
chunkr.close()
Visit our docs for more information and examples.
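The quick-start calls above can be wrapped in a small helper so the client is always closed, even if an export fails. This is a minimal sketch, not part of the SDK: `export_task` and `process` are hypothetical helper names, and the only SDK calls used are the ones shown in the quick start (`upload`, `html`, `markdown`, `content`, `json`, `close`). The `CHUNKR_API_KEY` environment variable in the usage comment is likewise an assumed name.

```python
from pathlib import Path


def export_task(task, out_dir, stem):
    """Write all four export formats for one task into out_dir."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    paths = {
        "html": out / f"{stem}.html",
        "markdown": out / f"{stem}.md",
        "content": out / f"{stem}.txt",
        "json": out / f"{stem}.json",
    }
    # Same export calls as in the quick start, one output file per format.
    task.html(output_file=str(paths["html"]))
    task.markdown(output_file=str(paths["markdown"]))
    task.content(output_file=str(paths["content"]))
    task.json(output_file=str(paths["json"]))
    return paths


def process(client, source, out_dir="output", stem="document"):
    """Upload one document (URL or local path), export it, then close the client."""
    try:
        task = client.upload(source)
        return export_task(task, out_dir, stem)
    finally:
        # Runs whether or not the upload or an export raised.
        client.close()


# Usage with the real SDK (requires `pip install chunkr-ai` and an API key):
#
#   import os
#   from chunkr_ai import Chunkr
#
#   client = Chunkr(api_key=os.environ["CHUNKR_API_KEY"])  # assumed variable name
#   process(client, "path/to/document.pdf", out_dir="output", stem="document")
```

Passing the client in as an argument keeps the helper independent of how the API key is loaded.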
- Prerequisites:
- Docker and Docker Compose
- NVIDIA Container Toolkit (for GPU support, optional)
- Clone the repo:
git clone https://github.com/lumina-ai-inc/chunkr
cd chunkr
- Set up environment variables:
# Copy the example environment file
cp .env.example .env
# Configure your environment variables
# Required: set LLM_KEY to your OpenAI API key
- Start the services:
With GPU:
docker compose up -d
- Access the services:
- Web UI: http://localhost:5173
- API: http://localhost:8000
Note: Requires an NVIDIA CUDA GPU
- Stop the services when done:
docker compose down
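Two of the steps above are easy to get wrong silently: leaving `LLM_KEY` empty in `.env`, and assuming the services are up right after `docker compose up -d`. The sketch below checks both using only the Python standard library. `missing_env_keys` and `is_reachable` are hypothetical helper names, the parser handles only plain `KEY=value` lines, and the reachability check treats any HTTP response as "up" because no health endpoint is documented here.

```python
import urllib.error
import urllib.request


def missing_env_keys(env_text, required=("LLM_KEY",)):
    """Return the required keys that are absent or empty in .env-style text."""
    values = {}
    for line in env_text.splitlines():
        line = line.strip()
        # Skip blanks, comments, and anything that is not KEY=value.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip("\"'")
    return [key for key in required if not values.get(key)]


def is_reachable(url, timeout=3.0):
    """Return True if anything answers HTTP at url, False if the connection fails."""
    try:
        with urllib.request.urlopen(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        # The server answered with an error status, so it is running.
        return True
    except (urllib.error.URLError, OSError):
        return False


# Usage after `docker compose up -d`, run from the repository root:
#
#   with open(".env") as f:
#       print("missing:", missing_env_keys(f.read()))
#   for name, url in [("Web UI", "http://localhost:5173"),
#                     ("API", "http://localhost:8000")]:
#       print(name, "up" if is_reachable(url) else "not reachable yet")
```

Containers can take a while to start, so a "not reachable yet" result shortly after startup may only mean the check should be repeated.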
For production environments, we provide a Helm chart and detailed deployment instructions:
- See our detailed guide at kube/README.md
- Includes configurations for high availability and scaling
For enterprise support and deployment assistance, contact us.
The core of this project is dual-licensed:
- GNU Affero General Public License v3.0 (AGPL-3.0)
- Commercial License
To use Chunkr without complying with the AGPL-3.0 license terms, contact us or visit our website.
- 📧 Email: mehul@lumina.sh
- 📅 Schedule a call: Book a 30-minute meeting
- 🌐 Visit our website: chunkr.ai