kpstsp/pdfquest

PDFQuest - Comprehensive PDF Toolkit

A Swiss Army knife for working with PDF files - your all-in-one solution for PDF manipulation and analysis.

Overview

PDFQuest is a PDF toolkit designed to simplify common PDF operations. It is built from independent modules so that new features can be added easily. Currently, it provides one module, a PDF splitter; the other features listed below are planned.

Current & Planned Features

  • PDF Splitting (✓ Available) - Split PDF files into multiple parts
  • Image Extraction (Planned) - Extract images from PDF documents
  • Metadata Management (Planned) - Read and edit PDF metadata
  • Resolution Control (Planned) - Change and optimize PDF resolution
  • Page Manipulation (Planned) - Rotate, crop, merge, and reorder pages
  • Text Extraction (Planned) - Extract text content from PDFs
  • Batch Processing (Planned) - Process multiple files at once

Requirements

  • Python 3.x
  • PyPDF2

Installation

pip install PyPDF2

Current Module: PDF Splitter

Usage

python split2pages.py --file <path_to_file.pdf> [--parts <number_of_parts>]

Arguments

  • --file (required) - Path to the PDF file to split
  • --parts (optional) - Number of parts to split the document into (default: 2)
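
The two arguments above could be declared with `argparse` roughly as follows. This is a minimal sketch and not the actual contents of split2pages.py; the function name `build_parser` and the help strings are assumptions made for illustration.

```python
import argparse


def build_parser():
    """Build a parser matching the documented --file and --parts arguments.

    Hypothetical helper: the real script may structure this differently.
    """
    parser = argparse.ArgumentParser(
        description="Split a PDF file into several parts."
    )
    # --file is documented as required.
    parser.add_argument("--file", required=True,
                        help="Path to the PDF file to split")
    # --parts is documented as optional with a default of 2.
    parser.add_argument("--parts", type=int, default=2,
                        help="Number of parts to split the document into")
    return parser


# Parsing an explicit argument list instead of sys.argv keeps the example
# self-contained.
args = build_parser().parse_args(["--file", "document.pdf", "--parts", "4"])
print(args.file, args.parts)
```

Using `type=int` lets `argparse` reject non-numeric values for `--parts` before any PDF work starts.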

Examples

Split a document into 2 parts:

python split2pages.py --file document.pdf

Split a document into 4 parts:

python split2pages.py --file document.pdf --parts 4

How It Works

  1. Opens the specified PDF file
  2. Determines the total number of pages
  3. Calculates the size of each part
  4. Distributes pages evenly across parts
  5. Creates new PDF files named <original_name>-<part_number>.pdf
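
The steps above can be sketched in Python as shown below. This is an illustration under stated assumptions, not the code in split2pages.py: the helper names (`page_ranges`, `output_name`, `split_pdf`) are hypothetical, the sketch assumes any leftover pages go to the earliest parts, that part numbers start at 1, and that the installed PyPDF2 version exposes `PdfReader` and `PdfWriter` (2.0 or later).

```python
from pathlib import Path


def page_ranges(total_pages, parts):
    """Return one range of zero-based page indices per part.

    Assumption: when pages do not divide evenly, the first parts each
    receive one extra page. The real script may distribute them differently.
    """
    if parts < 1:
        raise ValueError("parts must be at least 1")
    base, extra = divmod(total_pages, parts)
    ranges = []
    start = 0
    for index in range(parts):
        size = base + (1 if index < extra else 0)
        ranges.append(range(start, start + size))
        start += size
    return ranges


def output_name(path, part_number):
    """Build <original_name>-<part_number>.pdf next to the source file."""
    source = Path(path)
    return source.with_name(f"{source.stem}-{part_number}.pdf")


def split_pdf(path, parts=2):
    """Write each part to its own PDF and return the paths written."""
    # Imported here so the helpers above work without PyPDF2 installed.
    from PyPDF2 import PdfReader, PdfWriter

    reader = PdfReader(str(path))
    written = []
    numbered = enumerate(page_ranges(len(reader.pages), parts), start=1)
    for number, pages in numbered:
        if not pages:
            # More parts than pages: skip parts that would be empty.
            continue
        writer = PdfWriter()
        for page_index in pages:
            writer.add_page(reader.pages[page_index])
        target = output_name(path, number)
        with open(target, "wb") as handle:
            writer.write(handle)
        written.append(target)
    return written


# A 10-page document split into 4 parts under the assumption above.
print([len(r) for r in page_ranges(10, 4)])
```

Keeping the page arithmetic in its own function means it can be checked without a PDF file on hand.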

TODO

  • Implement image extraction module
  • Implement metadata management module
  • Implement resolution control/optimization
  • Add page manipulation features (rotate, crop, merge, reorder)
  • Implement text extraction capability
  • Add batch processing support
  • Create a unified CLI for all modules
  • Add comprehensive error handling
  • Write unit tests
  • Optimize performance for large PDF files
  • Create comprehensive documentation
  • Add support for PDF compression
  • Implement watermark functionality

License

BSD 2-Clause License

Copyright (c) 2022, Konstantin Parashchevin

See LICENSE for details.
