OCRmyPDF adds an OCR text layer to scanned PDF files, allowing them to be searched
Updated Jun 23, 2025 - Python
PyMuPDF is a high-performance Python library for data extraction, analysis, conversion, and manipulation of PDF (and other) documents.
A Python wrapper for the tesseract-ocr API
Python tool for grabbing text via screenshot
Personal assistant built with Python libraries. Its features include sending emails, optical text recognition, dynamic news reporting at any time via API integration, a to-do list generator, opening any website by voice command, playing music, Wikipedia search, and a dictionary with intelligent sensing (i.e. automatic spell checking)…
🔍 Better text detection by combining multiple OCR engines (EasyOCR, Tesseract, and Pororo) with 🧠 LLM.
📋 Python wrapper to grab text from images and save as text files using Tesseract Engine
Extract tables from scanned image PDFs using Optical Character Recognition.
Table detection and extraction using deep learning (built in Python using Luminoth, TensorFlow<2.0, and Sonnet).
AWS Lambda functions to extract text from various binary formats.
📲 Bot to help solve HQ trivia
Extract text information from Aadhaar Card using tesseract-ocr 😎