AI-powered REST API + beautiful frontend to extract invoice data (number, date, supplier, amount, and more) from PDFs, scans, or smartphone photos.
Includes: Dockerfile, full project structure, HTML5/CSS frontend, Postman-ready endpoints, and setup guides. All in one ZIP.
A modern Flask REST API that turns ugly business paperwork into structured, ready-to-use JSON. Perfect for SaaS, internal tools, freelancers, accountants, integrators, and anyone who’s tired of manual data entry.
- 🖼 Works with PDF, JPG, PNG (scanned, camera, digital, whatever)
- 🔎 Extracts: invoice number, date, supplier, total (auto-detects in most layouts)
- 🌍 Multilingual OCR (Tesseract)
- 🧠 Smart text parsing (handles most weird invoice templates, even messy scans)
- ⚡ Lightning-fast: avg 1–3 seconds per invoice
- 🖥 Built-in beautiful HTML/CSS UI (drag & drop, mobile ready)
- 🚦 API and web frontend served from a single server: no CORS setup, no extra configs
- 🐳 Docker-ready & classic Python scripts
- 🔒 No cloud/3rd party: runs 100% locally, your docs never leave your PC
- 🧑‍💻 Easy to customize/extend for your business logic
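The project's actual parsing rules aren't shown in this README. As a rough idea of what regex-based field extraction over OCR text can look like, here's a minimal sketch. The patterns and the `extract_fields` name are illustrative assumptions, not the shipped parser, and they only cover layouts similar to the sample invoice.

```python
import re

# Assumption: English month names, "Month D, YYYY" dates, "$" totals.
MONTHS = (
    "January|February|March|April|May|June|July|"
    "August|September|October|November|December"
)

# Hypothetical patterns for illustration only.
PATTERNS = {
    # e.g. INV-2024-117: short uppercase prefix, year, running number
    "invoice_number": re.compile(r"\b([A-Z]{2,5}-\d{4}-\d{1,6})\b"),
    # e.g. April 15, 2024
    "invoice_date": re.compile(rf"\b((?:{MONTHS})\s+\d{{1,2}},\s+\d{{4}})\b"),
    # e.g. "Total: $750.00" (the word boundary skips "Subtotal")
    "total_amount": re.compile(
        r"\btotal\b[^\n$]*(\$\s?[\d,]+\.\d{2})", re.IGNORECASE
    ),
}


def extract_fields(raw_text):
    """Return the first match for each pattern, or None when nothing matches."""
    fields = {}
    for name, pattern in PATTERNS.items():
        match = pattern.search(raw_text)
        fields[name] = match.group(1) if match else None
    return fields
```

Real invoices vary a lot, so a production parser would likely need several patterns per field plus fallbacks for OCR noise.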
POST /parse-invoice
Request: multipart/form-data with:
- a file (file=...): PDF/JPG/PNG
- (optional) lang: OCR language code (default: "eng")
Example using curl:
curl -X POST -F "file=@invoice.pdf" http://localhost:5000/parse-invoice
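If you'd rather call the endpoint from Python than from curl, here's a minimal standard-library sketch. The `file` and `lang` field names and the URL come from the request description above; `encode_multipart` and `parse_invoice` are hypothetical helper names, and the server is assumed to be running locally on port 5000.

```python
import json
import mimetypes
import os
import urllib.request
import uuid


def encode_multipart(filename, data, lang=None):
    """Build a multipart/form-data body with a `file` part and optional `lang` field."""
    boundary = uuid.uuid4().hex
    mime = mimetypes.guess_type(filename)[0] or "application/octet-stream"
    parts = []
    if lang is not None:
        parts.append(
            (
                f"--{boundary}\r\n"
                'Content-Disposition: form-data; name="lang"\r\n\r\n'
                f"{lang}\r\n"
            ).encode()
        )
    parts.append(
        (
            f"--{boundary}\r\n"
            f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'
            f"Content-Type: {mime}\r\n\r\n"
        ).encode()
        + data
        + b"\r\n"
    )
    parts.append(f"--{boundary}--\r\n".encode())
    return b"".join(parts), f"multipart/form-data; boundary={boundary}"


def parse_invoice(path, url="http://localhost:5000/parse-invoice", lang=None):
    """POST one invoice file and return the decoded JSON response."""
    with open(path, "rb") as fh:
        body, content_type = encode_multipart(os.path.basename(path), fh.read(), lang)
    request = urllib.request.Request(
        url, data=body, headers={"Content-Type": content_type}, method="POST"
    )
    # urlopen raises urllib.error.HTTPError for 4xx/5xx responses
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

Calling `parse_invoice("invoice.pdf", lang="eng")` should return the same JSON as the curl command, provided the server is up.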
Response (200):
{
  "parsed_fields": {
    "invoice_date": "April 15, 2024",
    "invoice_number": "INV-2024-117",
    "supplier_name": "Widget Solutions",
    "total_amount": "$750.00"
  },
  "raw_text": "INVOICE Invoice #\n\nINV-2024-117\nSupplier: Date: April 15, 2024\nWidget Solutions\n123 Industrial Park\nSpringfield, IL 62701\n..."
}
Error responses:
{"error": "No file part in the request"}
{"error": "Unsupported file type"}
{"error": "Text extraction failed: ..."}
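On the client side, the fields come back as display strings, so you'll probably want to convert them before storing anything. Here's a small sketch of one way to do that. `InvoiceApiError`, `read_fields`, and `normalize` are hypothetical names, and the date and amount formats are assumed to match the sample response above; other invoices may need different handling.

```python
import json
from datetime import datetime
from decimal import Decimal


class InvoiceApiError(Exception):
    """Raised when the API returns an {"error": ...} payload."""


def read_fields(payload):
    """Decode a response body and return parsed_fields, or raise on an error payload."""
    data = json.loads(payload)
    if "error" in data:
        raise InvoiceApiError(data["error"])
    return data["parsed_fields"]


def normalize(fields):
    """Convert date and amount strings into typed values."""
    out = dict(fields)
    # Assumes "Month D, YYYY" as in the sample ("April 15, 2024")
    out["invoice_date"] = datetime.strptime(fields["invoice_date"], "%B %d, %Y").date()
    # Assumes a leading "$" and optional comma thousands separators
    out["total_amount"] = Decimal(fields["total_amount"].lstrip("$").replace(",", ""))
    return out
```

Using `Decimal` rather than `float` keeps the amount exact, which matters once totals get summed or compared.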
Open http://localhost:5000/ — drag & drop your invoice, get structured results and raw OCR instantly.
- JSON response — formatted for devs
- Raw Text — human readable
- "Copy" and "Download" buttons for instant reuse
- Works on desktop/mobile, looks clean as hell 😎
pip install -r requirements.txt
- Flask
- pytesseract
- pdf2image
- Pillow
- flask-cors
- Flask-Limiter
- flasgger (optional, for API docs)
- python-magic-bin (Windows) / python-magic (Linux)
- tesseract-ocr (system dependency!)
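Since tesseract-ocr is a system package and not something pip installs, it's worth checking that the binary is actually on your PATH before starting the app. Here's a quick sketch of such a check. The `missing_tools` helper is hypothetical, and the `pdftoppm` entry is an assumption based on pdf2image relying on Poppler to render PDF pages.

```python
import shutil

# Executable names are assumptions: "tesseract" is the Tesseract CLI,
# "pdftoppm" ships with Poppler, which pdf2image calls to render PDFs.
REQUIRED_TOOLS = ["tesseract", "pdftoppm"]


def missing_tools(tools=REQUIRED_TOOLS):
    """Return the names of required executables that are not on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Missing system tools: " + ", ".join(missing))
    else:
        print("All system tools found")
```

If something is reported missing, install it with your OS package manager. The Docker image should not need this step if the Dockerfile already installs those packages.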
docker build -t invoice-ocr-api .
docker run -p 5000:5000 invoice-ocr-api
python app.py
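When you start the server from a script (Docker or `python app.py`), the API may take a moment to accept connections. Here's a minimal sketch of a readiness check that polls the port. `wait_for_port` is a hypothetical helper, and port 5000 is taken from the commands above.

```python
import socket
import time


def wait_for_port(host="127.0.0.1", port=5000, timeout=30.0, interval=0.5):
    """Poll until a TCP connection to host:port succeeds or the timeout expires."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # A successful connect means something is listening on the port
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            time.sleep(interval)
    return False
```

This only confirms that something is listening on the port. It does not verify that OCR itself works, so a test upload is still a good idea.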
- ✅ API result
- ✅ Web frontend demo
- ✅ Error handling
- ✅ Real OCR with tricky invoices
See /screens/ for live examples and raw data.
Get the full ZIP: project structure, Dockerfile, API + UI, and all the love:
- Email: talabov.ali72@gmail.com
- Telegram: @talabovali
Need this in Node.js, Go, or another stack? Custom integration? DM me — I'm ready for business.