docling

Here are 176 public repositories matching this topic...

ggozad / haiku.rag

Opinionated agentic RAG powered by LanceDB, Pydantic AI, and Docling

ai mcp ml rag lancedb docling mcp-server pydantic-ai

Updated Jun 29, 2026
Python

LeDat98 / NexusRAG

Hybrid RAG system combining vector search, knowledge graph (LightRAG), and cross-encoder reranking — with Docling document parsing, visual intelligence (image/table captioning), agentic streaming chat, and inline citations. Powered by Gemini or local Ollama models.

react streaming citation gemini knowledge-graph knowledge-base reranking rag fastapi vector-search document-parsing chromadb retrieval-augmented-generation ollama docling lightrag

Updated Apr 20, 2026
Python

shoryasethia / markdrop

A Python package for converting PDFs to Markdown while extracting images and tables and generating text descriptions of the extracted tables and images using several LLM clients, among other features. Markdrop is available on PyPI.

open-source pdf-to-text image-to-text marker agents pypi-package table-to-text markitdown llm pdf-to-markdown docling markdrop

Updated Mar 18, 2026
Python

genieincodebottle / parsemypdf

A collection of PDF parsing libraries and services, such as the AI-based Docling, Claude, OpenAI, Gemini, Meta's Llama Vision, and unstructured-io, as well as pdfminer, PyMuPDF, and pdfplumber, for efficient snapshot, text, table, and metadata extraction.

ocr openai claude camelot pymupdf pypdf ocr-python markitdown gemini-pro gemini-ai llama-parse omniai unstructured-io docling llama-vision mistral-ocr smoldocling llama4

Updated Apr 21, 2026
Python

docling-project / docling-graph

Transform unstructured documents into validated, rich and queryable knowledge graphs.

ai convert knowledge-graph document-processing docling

Updated Jun 23, 2026
Python

stevereiner / flexible-graphrag

Python, LlamaIndex, LangChain, Docker Compose: 15 Property Graph, 4 RDF, 10 Vector, OpenSearch, Elasticsearch, Alfresco DBs. 13 data sources (9 auto-sync), KG auto-building, Ontologies, LLMs, Docling or LlamaParse doc processing, GraphRAG, RAG only, Hybrid Search, AI Chat. TypeScript React, Vue, Angular frontends, FastAPI REST backend, MCP Server.

Updated Jun 20, 2026
Python

AKSarav / pdfstract

PDFStract: an extraction, chunking, and embedding layer for your RAG pipeline. Available as a CLI, web UI, and API.

pdf ocr ai knowledgebase data-extraction chunking dataengineering unstructured rag rag-pipeline docling pdfconversion

Updated Mar 18, 2026
Python

GiovanniPasq / chunky

Open-source toolkit for reliable RAG pipelines: convert PDFs to Markdown, clean documents, inspect chunks, compare chunking strategies, and enrich metadata for LLM applications.

Updated Jun 6, 2026
Python

fahdmirza / doclingwithollama

Docling with Ollama - RAG on Local Files with Local Models

pdf-converter retrieval-augmented-generation ollama docling

Updated Jan 1, 2025
Python

NameetP / pdfmux

PDF extraction that checks its own work. #2 reading order accuracy — zero AI, zero GPU, zero cost.

python pdf ocr mcp self-healing structured-extraction rag pdf-to-json pdf-extraction ai-agent llm document-parsing pdf-to-markdown docling opendataloader

Updated Jun 29, 2026
Python

ghodsizadeh / pdf2csv

A python library and CLI tool to convert PDF files to CSV files.

pdf-converter pandas csv-export docling

Updated Jan 6, 2025
Python

HaileyTQuach / docchat-docling

DocChat is an AI-powered Multi-Agent RAG system using Docling for structured document parsing and BM25 + vector search retrievers to retrieve fact-checked answers from PDFs, DOCX, and text files, preventing hallucinations. 🚀

multiagent-systems gradio bm25 ai-agents rag llms chromadb react-agents langgraph docling

Updated Oct 6, 2025
Python

Harmeet10000 / AgentNexus-LangChain-FastAPI

Langchain FastAPI server

docker memory mcp gemini celery workflows rag pydantic fastapi tenacity langchain langsmith langgraph crawl4ai docling cognee

Updated Jun 22, 2026
Python

versionHQ / multi-agent-system

Autonomous agent networks for task automation that requires multi-step reasoning

orchestration-framework python3 networkx graph-theory matplotlib multi-agent-systems autonomous-agents pygraphviz self-directed-learning rag pydantic multi-step-reasoning langchain litellm agentic-ai composiotool mem0ai docling

Updated Sep 1, 2025
Python

ENDEVSOLS / LongParser

Privacy-first document intelligence engine — parse PDFs, DOCX, PPTX, XLSX & CSV into AI-ready chunks for RAG pipelines. Includes HITL review, 3-layer memory chat, and a production FastAPI server.

python ocr parsing openai chunking human-in-the-loop pdf-parser rag fastapi vector-database document-intelligence llm document-parsing langchain retrieval-augmented-generation docling

Updated May 5, 2026
Python

serkanyasr / agentic_rag_project

Scalable Agentic RAG system using Pydantic AI, FastAPI & pgvector. Modular, production-ready foundation for document-based AI apps

openai agents rag fastapi agentic-rag docling

Updated Nov 3, 2025
Python

wzdavid / ThinkParse

Enterprise-grade document parsing service with asynchronous queue processing based on MinerU, Celery and Docker.

pdf document-parser markitdown mineru docling

Updated Jun 24, 2026
Python

daxueren666 / exam-review-helper

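A Claude Code / Codex CLI Skill that condenses PDF/Word/TXT/Markdown textbooks into interactive HTML review documents. It automatically detects liberal-arts or science/engineering mode, runs a 5-pass deep extraction, and natively supports OCR for scanned PDFs. v1.2 uses the pdfium backend to fix the docling std::bad_alloc crash, so large PDFs are extracted reliably.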

codex exam-review study-helper pdf-ocr docling agent-skills claude-code

Updated Jun 27, 2026
Python

felixdittrich92 / docling-OCR-OnnxTR

OnnxTR OCR plugin for Docling

ocr deep-learning text-recognition text-detection document-processing onnx onnxruntime docling onnxtr

Updated Jun 28, 2026
Python

gyunggyung / docling-translate

Advanced PDF/Document Translator with interactive comparison. Built on IBM Docling.

Updated Jan 5, 2026
Python
