Turn any PDF or image document into structured data for your AI. A powerful, lightweight OCR toolkit that bridges the gap between images/PDFs and LLMs. Supports 100+ languages.
-
Updated
Dec 8, 2025 - Python
Turn any PDF or image document into structured data for your AI. A powerful, lightweight OCR toolkit that bridges the gap between images/PDFs and LLMs. Supports 100+ languages.
天枢 - 企业级 AI 一站式数据预处理平台 | PDF/Office转Markdown | 支持MCP协议AI助手集成 | Vue3+FastAPI全栈方案 | 文档解析 | 多模态信息提取
Advanced PDF processing, OCR, and AI vision analysis nodes for ComfyUI. Extract images from PDFs, perform multilingual OCR with Surya, detect objects with Florence-2, and analyze document layouts.
Python data pipeline for arXiv metadata: SQLAlchemy + Alembic schema, PostgreSQL storage, PDF download tracking, and optional PaddleOCR processing.
Истории автомобилей
Native Swift + MLX port of Paddle0CR-VL, a 0.9B doc parsing VLM for OCR, tables, formulas, and charts on Apple Silicon
Add a description, image, and links to the paddleocr-vl topic page so that developers can more easily learn about it.
To associate your repository with the paddleocr-vl topic, visit your repo's landing page and select "manage topics."