Automated document processing pipeline using Google Document AI and Machine Learning
An end-to-end AI system that automates invoice processing for enterprises. It addresses a problem I have seen repeatedly in my ERP consulting career: teams spending hours processing documents by hand.
Business Impact (design targets):
- Reduces processing time from hours to seconds
- Targets 95%+ accuracy in document classification
- Reduces manual data entry errors
- Targets throughput of 1000+ documents per hour
┌─────────────────┐    ┌─────────────────┐    ┌─────────────────┐
│   File Upload   │───▶│   Document AI   │───▶│ Classification  │
│  (PDF/Images)   │    │ (OCR + Extract) │    │   (ML Model)    │
└─────────────────┘    └─────────────────┘    └─────────────────┘
         │                      │                      │
         ▼                      ▼                      ▼
┌─────────────────┐    ┌─────────────────┐    ┌─────────────────┐
│  Cloud Storage  │    │   PostgreSQL    │    │     FastAPI     │
│      (GCS)      │    │  (Results DB)   │    │   (REST API)    │
└─────────────────┘    └─────────────────┘    └─────────────────┘
         │                      │                      │
         └──────────────────────┼──────────────────────┘
                                ▼
                       ┌─────────────────┐
                       │  Web Interface  │
                       │   (Streamlit)   │
                       └─────────────────┘
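As a rough illustration of the Document AI stage in the diagram, the sketch below sends one file to a processor using the `google-cloud-documentai` client library. The function names `guess_mime_type` and `extract_text`, the extension map, and the `location` default of `"us"` are assumptions made for this example; the project's actual `document_processor.py` may be organized differently.

```python
from pathlib import Path

# Assumed mapping for the formats this README lists (PDF, PNG, JPEG).
SUPPORTED_MIME_TYPES = {
    ".pdf": "application/pdf",
    ".png": "image/png",
    ".jpg": "image/jpeg",
    ".jpeg": "image/jpeg",
}


def guess_mime_type(path: str) -> str:
    """Return the MIME type for a supported upload, or raise ValueError."""
    suffix = Path(path).suffix.lower()
    if suffix not in SUPPORTED_MIME_TYPES:
        raise ValueError(f"Unsupported file type: {suffix or path}")
    return SUPPORTED_MIME_TYPES[suffix]


def extract_text(path: str, project_id: str, processor_id: str,
                 location: str = "us") -> str:
    """Send one file to a Document AI processor and return the OCR text."""
    # Imported lazily so guess_mime_type works without the GCP client installed.
    from google.cloud import documentai

    client = documentai.DocumentProcessorServiceClient(
        client_options={"api_endpoint": f"{location}-documentai.googleapis.com"}
    )
    name = client.processor_path(project_id, location, processor_id)
    raw_document = documentai.RawDocument(
        content=Path(path).read_bytes(),
        mime_type=guess_mime_type(path),
    )
    request = documentai.ProcessRequest(name=name, raw_document=raw_document)
    result = client.process_document(request=request)
    return result.document.text
```

Rejecting unsupported file types before the API call keeps obviously bad uploads from consuming Document AI quota.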
- Multi-format Support: PDF, PNG, JPEG document processing
- Smart OCR: Google Document AI for text extraction
- ML Classification: Automated document type detection (Invoice/Receipt/PO)
- Data Extraction: Key fields (amounts, dates, vendor info)
- Validation: Business rule validation and error handling
- FastAPI Backend: RESTful API with automatic documentation
- Web Interface: Clean, intuitive document upload interface
- Responsive Design: Works on desktop and mobile
- Authentication: Secure file upload and processing
- Scalable Architecture: Handles high document volumes
- Monitoring: Processing metrics and error tracking
- Batch Processing: Handle multiple documents simultaneously
- Data Persistence: Secure storage of processing results
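To make the data extraction and validation features more concrete, here is a minimal sketch of business-rule checks on extracted fields. The `ExtractedInvoice` container and the specific rules are hypothetical, chosen only for illustration; the project's `validator.py` may apply different ones.

```python
from dataclasses import dataclass
from datetime import date
from decimal import Decimal
from typing import List, Optional


@dataclass
class ExtractedInvoice:
    """Hypothetical container for the key fields pulled from a document."""
    vendor: Optional[str]
    invoice_date: Optional[date]
    total: Optional[Decimal]


def validate(invoice: ExtractedInvoice, today: date) -> List[str]:
    """Return a list of rule violations; an empty list means the invoice passed."""
    errors = []
    if not invoice.vendor or not invoice.vendor.strip():
        errors.append("vendor is missing")
    if invoice.invoice_date is None:
        errors.append("invoice date is missing")
    elif invoice.invoice_date > today:
        errors.append("invoice date is in the future")
    if invoice.total is None:
        errors.append("total is missing")
    elif invoice.total <= 0:
        errors.append("total must be positive")
    return errors


ok = ExtractedInvoice("Acme Corp", date(2024, 1, 15), Decimal("1250.00"))
print(validate(ok, today=date(2024, 2, 1)))  # []
```

Returning every violation at once, rather than raising on the first, lets a reviewer see all the problems with a document in one pass.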
Backend & AI:
- Python 3.9+ - Core language
- FastAPI - High-performance web framework
- Google Document AI - OCR and document understanding
- Scikit-learn - Machine learning classification
- Pandas & NumPy - Data processing
Database & Storage:
- PostgreSQL - Structured data storage
- Google Cloud Storage - Document file storage
- SQLAlchemy - Database ORM
Deployment & DevOps:
- Docker - Containerization
- Google Cloud Run - Serverless deployment
- GitHub Actions - CI/CD pipeline
- Poetry - Dependency management
Frontend:
- Streamlit - Interactive web interface
- Bootstrap - Responsive UI components
invoice-processing-ai/
├── src/
│   ├── api/                       # FastAPI application
│   │   ├── main.py                # API entry point
│   │   ├── routes/                # API endpoints
│   │   └── middleware/            # Authentication, CORS
│   ├── core/                      # Core business logic
│   │   ├── document_processor.py  # Google Document AI
│   │   ├── classifier.py          # ML classification
│   │   └── validator.py           # Business rule validation
│   ├── database/                  # Database models and operations
│   │   ├── models.py              # SQLAlchemy models
│   │   └── crud.py                # Database operations
│   └── utils/                     # Utility functions
│       ├── config.py              # Configuration management
│       └── logging.py             # Logging setup
├── frontend/                      # Streamlit web interface
│   ├── app.py                     # Main Streamlit app
│   └── components/                # UI components
├── tests/                         # Test suite
│   ├── test_api.py                # API tests
│   ├── test_processing.py         # Processing logic tests
│   └── fixtures/                  # Test data
├── data/                          # Sample data and models
│   ├── sample_documents/          # Test documents
│   └── models/                    # Trained ML models
├── scripts/                       # Utility scripts
│   ├── train_model.py             # Model training
│   └── setup_db.py                # Database initialization
├── docs/                          # Documentation
│   ├── api.md                     # API documentation
│   └── deployment.md              # Deployment guide
├── docker/                        # Docker configurations
│   ├── Dockerfile.api             # API container
│   └── Dockerfile.frontend        # Frontend container
├── requirements.txt               # Python dependencies
├── pyproject.toml                 # Poetry configuration
├── docker-compose.yml             # Local development setup
└── .github/workflows/             # CI/CD pipelines
- Python 3.9+
- Google Cloud Platform account
- Docker (optional, for containerized deployment)
- Clone the repository
    git clone https://github.com/ypratap11/invoice-processing-ai.git
    cd invoice-processing-ai
- Install dependencies
    pip install -r requirements.txt
- Set up Google Cloud credentials
    # Set up Document AI processor
    export GOOGLE_APPLICATION_CREDENTIALS="path/to/service-account.json"
    export GCP_PROJECT_ID="your-project-id"
    export GCP_PROCESSOR_ID="your-processor-id"
- Initialize database
    python scripts/setup_db.py
- Run the application
    # Start API server
    uvicorn src.api.main:app --reload
    # Start frontend (in another terminal)
    streamlit run frontend/app.py
- Alternatively, run the full stack with Docker
    docker-compose up --build
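The three environment variables set during installation can be read once at startup so that a missing value fails fast instead of surfacing later as a Document AI error. This is a minimal sketch of such a loader; `Settings` and `load_settings` are hypothetical names, not necessarily what `src/utils/config.py` contains.

```python
import os
from dataclasses import dataclass
from typing import Mapping

REQUIRED_VARS = (
    "GOOGLE_APPLICATION_CREDENTIALS",
    "GCP_PROJECT_ID",
    "GCP_PROCESSOR_ID",
)


@dataclass(frozen=True)
class Settings:
    """Hypothetical settings object holding the values exported above."""
    credentials_path: str
    project_id: str
    processor_id: str


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Build Settings from the environment, failing fast on missing variables."""
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return Settings(
        credentials_path=env["GOOGLE_APPLICATION_CREDENTIALS"],
        project_id=env["GCP_PROJECT_ID"],
        processor_id=env["GCP_PROCESSOR_ID"],
    )
```

Accepting the environment as a parameter makes the loader easy to exercise in tests with a plain dictionary.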
| Metric | Target | Current |
|---|---|---|
| Document Classification Accuracy | >95% | In Development |
| Processing Time (per document) | <2 seconds | In Development |
| Throughput | 1000+ docs/hour | In Development |
| API Response Time | <500ms | In Development |
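The per-document latency and hourly throughput targets in the table can be related with simple arithmetic. The sketch below assumes each worker processes documents strictly one after another, which is an assumption for illustration and not a statement about how the pipeline actually schedules work.

```python
import math

SECONDS_PER_HOUR = 3600


def docs_per_hour(seconds_per_doc: float, workers: int = 1) -> float:
    """Throughput if each worker handles documents one at a time."""
    return workers * SECONDS_PER_HOUR / seconds_per_doc


def workers_needed(target_docs_per_hour: float, seconds_per_doc: float) -> int:
    """Smallest number of sequential workers that meets the throughput target."""
    return math.ceil(target_docs_per_hour * seconds_per_doc / SECONDS_PER_HOUR)


print(docs_per_hour(2.0))         # 1800.0
print(workers_needed(1000, 2.0))  # 1
print(workers_needed(1000, 5.0))  # 2
```

Under that assumption, a single worker that meets the 2-second target already clears 1000 documents per hour, since the throughput target alone only requires about 3.6 seconds per document.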
- Project setup and architecture
- Google Document AI integration
- Basic ML classification model
- FastAPI backend implementation
- Simple web interface
- Advanced ML model with feature engineering
- Batch processing capabilities
- Comprehensive error handling
- API authentication and rate limiting
- Performance monitoring
- Multi-tenant support
- Advanced document types (contracts, statements)
- Real-time processing dashboard
- Integration APIs for ERP systems
- A/B testing framework
This is a portfolio project, but feedback and suggestions are welcome!
- Fork the repository
- Create a feature branch
- Make your changes
- Submit a pull request
This project is licensed under the MIT License - see the LICENSE file for details.
Built by Yeragudipati Pratap, an Oracle ERP expert moving into AI/ML engineering.
- LinkedIn: Connect with me
- Email: ypratap114u@gmail.com
- Portfolio: View more projects
Star this repo if you find it helpful!