A pure-Python PDF viewer that provides the same functionality as other well-known PDF viewers.
Updated Jul 14, 2023 - Python
This Python-based tool allows for efficient comparison of two or more PDF documents, highlighting the differences between them. It extracts and compares the words in the PDFs, ignoring whitespace differences, and highlights the changed, added, or missing words.
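The word-level comparison described above can be sketched with the standard library alone. This is a minimal illustration, not the tool's actual implementation: the function name `diff_words` is hypothetical, the inputs are assumed to be text already extracted from each PDF, and `difflib.SequenceMatcher` is a stand-in for whatever diffing method the tool really uses.

```python
import difflib


def diff_words(old_text: str, new_text: str):
    """Return word-level changes between two texts, ignoring whitespace.

    Hypothetical helper: assumes the text was already extracted from the PDFs.
    """
    # str.split() with no arguments collapses any run of whitespace,
    # so spacing and line-break differences never show up as changes.
    old_words = old_text.split()
    new_words = new_text.split()

    matcher = difflib.SequenceMatcher(a=old_words, b=new_words, autojunk=False)
    changes = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            continue
        # tag is "replace" (changed), "insert" (added), or "delete" (missing)
        changes.append((tag, old_words[i1:i2], new_words[j1:j2]))
    return changes


if __name__ == "__main__":
    for change in diff_words("the quick  brown fox", "the quick\nred fox jumps"):
        print(change)
```

Each returned tuple names the kind of change and the words involved on both sides, which is the information a highlighter would need to mark up the pages.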
PDF Viewer with Dark Mode (fitz + PyQt6)
Automated PDF translation & redaction with OCR, PyMuPDF, and AI translation. Preserves layout, font, and colors while supporting selective redaction/masking. English ↔ Hindi supported out-of-the-box. Docker-ready.
This Python script provides a graphical user interface (GUI) for extracting a custom polygonal area from every page of a PDF document.
Converts messy PDF documents into clean, hierarchical outlines with headings (H1, H2, H3) and their page numbers in beautiful JSON format.
A chatbot built with RAG (retrieval-augmented generation) to answer questions about data engineering, SQL, and MLflow.
Our streamlined tool extracts high-quality images from your PDFs and compiles them into a convenient ZIP file. Get started today and bring your documents to life with ease!
Python PDF-to-HTML Converter: Transforming PDF Documents into Structured HTML Tags. - Feb 2022 - Jun 2023
This application facilitates the comparison of two PDF files. Differences are presented in a table, color-coded as red (deletions), green (additions), and orange (moved text). Users can save the results in Excel format. It is designed to check whether annotations have been taken into account during the comparison process.
Automatically create bookmarks from the table of contents of PDF books.
I use spaCy, pandas, fitz, and re packages to extract key insights such as chunks of sentences with percentages and context.
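As a rough sketch of the percentage-extraction idea above, the following uses only the `re` module. It is an assumption-laden simplification: sentence splitting is done with a plain regex instead of spaCy, the input is assumed to be text already extracted from a PDF, and the names `PERCENT` and `sentences_with_percentages` are hypothetical.

```python
import re

# Matches values such as "12%", "12.5%", or "3 %".
PERCENT = re.compile(r"\d+(?:\.\d+)?\s?%")


def sentences_with_percentages(text: str):
    """Return the sentences of `text` that mention a percentage."""
    # Naive sentence split: break after ., ! or ? when followed by whitespace.
    # A decimal point such as the one in "12.5%" is not followed by
    # whitespace, so it does not trigger a split.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    return [s for s in sentences if PERCENT.search(s)]


if __name__ == "__main__":
    sample = "Revenue grew 12.5% last year. Costs were flat. Margin fell by 3 %."
    for sentence in sentences_with_percentages(sample):
        print(sentence)
```

A regex split will mishandle abbreviations such as "e.g." or "Dr.", which is one reason a proper sentence segmenter like spaCy's is the more robust choice for real documents.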