Automated tool for filling insurance claim forms and other PDF documents. This tool analyzes PDF forms, detects field locations, and fills them with provided data.
- Automatic Field Detection: Uses computer vision to detect form fields in PDFs
- Template Management: Save and reuse field templates for consistent form filling
- Multiple Filling Methods: Choose between the image overlay and ReportLab methods
- Google Drive Integration: Store and retrieve forms from Google Drive
- Batch Processing: Fill multiple forms with different data sets
- Command Line Interface: Easy-to-use CLI for all operations
- Install system dependencies:
sudo apt-get update
sudo apt-get install poppler-utils tesseract-ocr
- Install Python packages:
cd pdfform
pip install -r requirements.txt
- Make the CLI executable:
chmod +x pdfform.py
First, analyze your PDF to detect fields and create a template:
./pdfform.py analyze path/to/form.pdf -o templates/
This creates a template file with detected field coordinates.
Review and name the detected fields:
./pdfform.py edit-template templates/form_template.json
Generate a data template from your form template:
./pdfform.py create-data templates/form_template.json -o data/my_data.json
Edit the JSON file to add your actual data.
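To make the create-data step more concrete, here is a minimal Python sketch that derives a blank data file from a field template. The template layout (a top-level "fields" list whose entries carry "name", "type", "page", "x", and "y") and the function name are assumptions for illustration only; the JSON actually written by analyze may be structured differently.

```python
import json

# Hypothetical template layout; the real output of `analyze` may differ.
SAMPLE_TEMPLATE = {
    "fields": [
        {"name": "patient_name", "type": "text", "page": 0, "x": 120, "y": 640},
        {"name": "is_emergency", "type": "checkbox", "page": 0, "x": 88, "y": 512},
    ]
}


def blank_data_from_template(template: dict) -> dict:
    """Return a dict with one empty entry per named field in the template."""
    data = {}
    for field in template.get("fields", []):
        # Checkboxes default to False, every other type to an empty string.
        data[field["name"]] = False if field.get("type") == "checkbox" else ""
    return data


if __name__ == "__main__":
    print(json.dumps(blank_data_from_template(SAMPLE_TEMPLATE), indent=2))
```

Keeping the data file keyed by field name means the same template can be reused with many data files, which is what the fill step relies on.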
Fill the PDF with your data:
./pdfform.py fill path/to/form.pdf templates/form_template.json data/my_data.json -o output/filled_form.pdf
Quick fill with automatic field detection:
./pdfform.py quick-fill form.pdf data.json --auto-detect
Batch processing, filling the same form with several data sets:
for data in data/*.json; do
  output="output/$(basename "$data" .json)_filled.pdf"
  ./pdfform.py fill form.pdf template.json "$data" -o "$output"
done
Upload forms to Google Drive:
./pdfform.py drive upload output/filled_form.pdf --folder "Insurance_Claims"
List forms in Drive:
./pdfform.py drive list --folder "Insurance_Claims"
Download a form:
./pdfform.py drive download FILE_ID output/downloaded_form.pdf
For Google Drive integration, place your service account credentials in one of:
- ~/.config/gcloud/application_default_credentials.json
- ~/google_credentials.json
- config/google_credentials.json
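A quick way to check which of these credential files is present is sketched below in Python. The lookup order (the order in which the paths are listed) and the helper name find_credentials are assumptions; the tool itself may search in a different order or apply additional checks.

```python
from pathlib import Path
from typing import Optional

# The three documented locations, in the order they are listed.
CANDIDATE_PATHS = (
    "~/.config/gcloud/application_default_credentials.json",
    "~/google_credentials.json",
    "config/google_credentials.json",
)


def find_credentials(base_dir: Path = Path(".")) -> Optional[Path]:
    """Return the first candidate path that exists as a file, or None."""
    for raw in CANDIDATE_PATHS:
        path = Path(raw).expanduser()
        if not path.is_absolute():
            # Relative entries are resolved against the project directory.
            path = base_dir / path
        if path.is_file():
            return path
    return None


if __name__ == "__main__":
    found = find_credentials()
    print(found if found else "No credentials file found")
```

If this prints no path, the Drive commands are unlikely to authenticate; running ./pdfform.py setup is the documented way to check the configuration.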
- analyze: Analyze PDF and extract field coordinates
- fill: Fill PDF using template and data
- quick-fill: Quick fill with optional auto-detection
- create-data: Create data template from form template
- edit-template: Interactively edit field template
- drive upload: Upload PDF to Google Drive
- drive download: Download PDF from Google Drive
- drive list: List PDFs in Google Drive
- setup: Check dependencies and configuration
- Initial Setup (one-time per form type):
./pdfform.py analyze insurance_claim_form.pdf
./pdfform.py edit-template templates/insurance_claim_form_template.json
- Monthly Claims:
# Create data file for this month
cp data/sample_insurance_data.json data/claim_2024_12.json
# Edit with current month's data
nano data/claim_2024_12.json
# Fill the form
./pdfform.py fill insurance_claim_form.pdf templates/insurance_claim_form_template.json data/claim_2024_12.json -o output/claim_2024_12_filled.pdf
# Upload to Drive
./pdfform.py drive upload output/claim_2024_12_filled.pdf
The tool supports different field types:
- text: Regular text fields (default)
- checkbox: For checkboxes (places X or checkmark)
- signature: For signature fields
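As an illustration of how the three field types might be handled when a value is written onto the page, consider the Python sketch below. The function render_value and its rules are hypothetical: the source only states that checkboxes receive an X or checkmark, so the handling of signature values in particular (treating them as typed text) is an assumption.

```python
def render_value(field_type: str, value) -> str:
    """Turn a data value into the text to draw for a given field type."""
    if field_type == "checkbox":
        # A truthy value becomes a mark; anything else leaves the box empty.
        return "X" if value else ""
    if field_type == "signature":
        # Assumption: a typed name stands in for the signature.
        return str(value or "")
    # "text" is the default type; None is drawn as an empty string.
    return "" if value is None else str(value)


if __name__ == "__main__":
    for field_type, value in [("text", "Jane Doe"), ("checkbox", True), ("signature", "J. Doe")]:
        print(field_type, repr(render_value(field_type, value)))
```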
Adjust detection parameters in the analyze command:
./pdfform.py analyze form.pdf --dpi 400 --interactive
Higher DPI improves detection accuracy but slows processing.
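The DPI setting also determines how pixel positions in the rendered page image relate to positions on the PDF page. PDF user space defaults to 72 units per inch with the origin at the bottom-left corner, while images are usually indexed from the top-left. The Python sketch below shows that conversion; whether the tool stores template coordinates in pixels or in points is not stated here, so treat the function and its name as an assumption.

```python
POINTS_PER_INCH = 72.0  # default PDF user-space unit


def pixels_to_points(x_px, y_px, dpi, page_height_pt):
    """Map a top-left-origin pixel position to bottom-left-origin PDF points."""
    x_pt = x_px * POINTS_PER_INCH / dpi
    # Flip the vertical axis: image y grows downward, PDF y grows upward.
    y_pt = page_height_pt - y_px * POINTS_PER_INCH / dpi
    return x_pt, y_pt


if __name__ == "__main__":
    # US Letter is 792 points tall; a pixel one inch in from the top-left at 400 DPI.
    print(pixels_to_points(400, 400, dpi=400, page_height_pt=792))
```

At 400 DPI each pixel covers 0.18 points, compared with 0.36 points at 200 DPI, which is why a higher DPI can place field boundaries more precisely.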
- Fields not detected: Try higher DPI or use interactive mode
- Text misaligned: Adjust coordinates in template JSON
- Google Drive errors: Check credentials and permissions
- Missing dependencies: Run ./pdfform.py setup to check
- All processing is done locally on your machine
- Google Drive integration is optional
- No data is sent to external services unless you use the Google Drive commands