A Swiss Army knife for working with PDF files - your all-in-one solution for PDF manipulation and analysis.
PDFQuest is a PDF toolkit designed to simplify common PDF operations. It is built modularly so that new features can be added easily. Currently, it provides a tool for splitting PDF documents into parts; the remaining features listed below are planned.
- PDF Splitting (✓ Available) - Split PDF files into multiple parts
- Image Extraction (Planned) - Extract images from PDF documents
- Metadata Management (Planned) - Read, edit, and modify PDF metadata
- Resolution Control (Planned) - Change and optimize PDF resolution
- Page Manipulation (Planned) - Rotate, crop, merge, and reorder pages
- Text Extraction (Planned) - Extract text content from PDFs
- Batch Processing (Planned) - Process multiple files at once
- Python 3.x
- PyPDF2
pip install PyPDF2

python split2pages.py --file <path_to_file.pdf> [--parts <number_of_parts>]

- --file (required) - Path to the PDF file to split
- --parts (optional) - Number of parts to split the document into (default: 2)
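
For readers curious how these two options could be parsed, here is a minimal sketch using Python's standard argparse module. It is an illustration only and not necessarily how split2pages.py is written; the function name build_parser is hypothetical.

```python
import argparse


def build_parser():
    """Build a parser mirroring the documented options (hypothetical helper)."""
    parser = argparse.ArgumentParser(
        prog="split2pages.py",
        description="Split a PDF file into several parts.",
    )
    # --file is documented as required.
    parser.add_argument("--file", required=True,
                        help="Path to the PDF file to split")
    # --parts is documented as optional with a default of 2.
    parser.add_argument("--parts", type=int, default=2,
                        help="Number of parts to split the document into")
    return parser


# Parse an explicit argument list instead of sys.argv for demonstration.
args = build_parser().parse_args(["--file", "document.pdf", "--parts", "4"])
print(args.file, args.parts)
```

Passing only --file leaves args.parts at its default of 2, matching the documented behavior.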
Split a document into 2 parts:
python split2pages.py --file document.pdf

Split a document into 4 parts:
python split2pages.py --file document.pdf --parts 4

- Opens the specified PDF file
- Determines the total number of pages
- Calculates the size of each part
- Distributes pages evenly across parts
- Creates new PDF files named <original_name>-<part_number>.pdf
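
The steps above can be sketched roughly as follows. This is an illustrative sketch under stated assumptions, not the actual contents of split2pages.py: the function names part_ranges and split_pdf are hypothetical, the rule that earlier parts receive any leftover pages is one possible reading of "evenly", part numbers are assumed to start at 1, and the PdfReader/PdfWriter classes assume a recent PyPDF2 release (2.x or later).

```python
from pathlib import Path


def part_ranges(total_pages, parts):
    """Return (start, stop) page index ranges, one per part.

    Assumption: leftover pages go to the earliest parts, so part sizes
    differ by at most one page.
    """
    if parts < 1:
        raise ValueError("parts must be at least 1")
    base, extra = divmod(total_pages, parts)
    ranges = []
    start = 0
    for index in range(parts):
        size = base + (1 if index < extra else 0)
        ranges.append((start, start + size))
        start += size
    return ranges


def split_pdf(path, parts=2):
    """Write <original_name>-<part_number>.pdf files next to the source."""
    # Imported here so the page arithmetic above works without PyPDF2.
    from PyPDF2 import PdfReader, PdfWriter

    source = Path(path)
    reader = PdfReader(str(source))
    outputs = []
    ranges = part_ranges(len(reader.pages), parts)
    # Assumption: part numbers in file names start at 1.
    for number, (start, stop) in enumerate(ranges, start=1):
        writer = PdfWriter()
        for page_index in range(start, stop):
            writer.add_page(reader.pages[page_index])
        out_path = source.with_name(f"{source.stem}-{number}.pdf")
        with open(out_path, "wb") as handle:
            writer.write(handle)
        outputs.append(out_path)
    return outputs


# Page arithmetic only: a 10-page document split into 4 parts.
print(part_ranges(10, 4))  # [(0, 3), (3, 6), (6, 8), (8, 10)]
```

Keeping the page arithmetic in its own function makes the distribution rule easy to test without touching any PDF file.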
- Implement image extraction module
- Implement metadata management module
- Implement resolution control/optimization
- Add page manipulation features (rotate, crop, merge, reorder)
- Implement text extraction capability
- Add batch processing support
- Create unified CLI interface for all modules
- Add comprehensive error handling
- Write unit tests
- Optimize performance for large PDF files
- Create comprehensive documentation
- Add support for PDF compression
- Implement watermark functionality
BSD 2-Clause License
Copyright (c) 2022, Konstantin Parashchevin
See LICENSE for details.