This Python-based AI Document Processing Pipeline combines Google Cloud Vision OCR with PortKey AI. It extracts text from your documents, analyzes it with LLMs, and turns unstructured files into structured JSON output.
- High-Accuracy OCR: Use the Google Cloud Vision API to extract text from PDFs, images, and other document formats.
- Intelligent Text Analysis: Use PortKey AI for prompt management, experimentation across multiple LLMs, and cost analysis to optimize AI-driven workflows.
- Scalable and Flexible: Designed to handle diverse document types and volumes, ensuring seamless integration into your workflow.
- Robust Data Validation: Pydantic ensures data integrity and type safety, minimizing errors and inconsistencies.
- Comprehensive Logging and Monitoring: LogFire provides real-time insights, error tracking, and performance monitoring.
- Upload Your Documents: Place your PDFs in Google Cloud Storage. The pipeline can also be extended to accept uploads via an API.
- Automated OCR Processing: The pipeline retrieves documents and performs OCR using Google Cloud Vision API.
- Advanced Text Analysis: PortKey AI processes the extracted text with LLMs and dynamic prompts to generate insights from the document.
- Structured JSON Output: Receive the processed data in a clean, structured JSON format, ready for further use.
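As a rough illustration of the step between OCR and analysis: when the Vision API runs asynchronous OCR on a PDF in Cloud Storage, it writes JSON result files whose `responses` array holds one entry per page, with the page text under `fullTextAnnotation.text`. The sketch below joins those page texts into one string. The function name `extract_text` is hypothetical, and whether `main.py` reads the results this way is an assumption.

```python
import json


def extract_text(ocr_output_json: str) -> str:
    """Join the page texts from one Vision OCR output file (hypothetical helper)."""
    payload = json.loads(ocr_output_json)
    pages = []
    for response in payload.get("responses", []):
        # Blank pages carry no fullTextAnnotation; treat them as empty text.
        annotation = response.get("fullTextAnnotation", {})
        pages.append(annotation.get("text", ""))
    return "\n".join(pages)
```

The returned string is what would then be passed on to the analysis step.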
- Install Dependencies
make install
- Configure Google Cloud
  - Enable Google Cloud Vision API.
  - Set up a service account and download credentials.
  - Create a storage bucket for your documents.
export GOOGLE_APPLICATION_CREDENTIALS="path/to/your/credentials.json"
- Set Environment Variables
Create a .env file with your API keys and paths:
PORTKEY_API_KEY=your_portkey_api_key
INPUT_DOCUMENT_PATH=gs://your-bucket/input.pdf
OUTPUT_DOCUMENT_PATH=gs://your-bucket/output/
LOGFIRE_TOKEN=your_logfire_token
PORTKEY_PROMPT_ID=your_prompt_id
- Upload Your Document
gsutil cp your_document.pdf gs://your-bucket/
- Execute the Processing
make run
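Before running the pipeline, it can help to confirm that the environment variables from the setup steps are present and well formed. The following is a minimal standard-library sketch, not the project's own code: the project uses Pydantic for validation, and the names `Settings`, `REQUIRED_VARS`, and `load_settings` are hypothetical. It assumes the variables have already been loaded into the process environment.

```python
import os
from dataclasses import dataclass

# Variable names taken from the .env example in the setup steps.
REQUIRED_VARS = (
    "PORTKEY_API_KEY",
    "PORTKEY_PROMPT_ID",
    "INPUT_DOCUMENT_PATH",
    "OUTPUT_DOCUMENT_PATH",
    "LOGFIRE_TOKEN",
)


@dataclass(frozen=True)
class Settings:
    portkey_api_key: str
    portkey_prompt_id: str
    input_document_path: str
    output_document_path: str
    logfire_token: str


def load_settings(env=None) -> Settings:
    """Build Settings from a mapping (defaults to os.environ), failing early."""
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise ValueError("Missing environment variables: " + ", ".join(missing))
    # Both document paths are expected to be Cloud Storage URIs.
    for name in ("INPUT_DOCUMENT_PATH", "OUTPUT_DOCUMENT_PATH"):
        if not env[name].startswith("gs://"):
            raise ValueError(f"{name} must be a gs:// URI, got {env[name]!r}")
    return Settings(
        portkey_api_key=env["PORTKEY_API_KEY"],
        portkey_prompt_id=env["PORTKEY_PROMPT_ID"],
        input_document_path=env["INPUT_DOCUMENT_PATH"],
        output_document_path=env["OUTPUT_DOCUMENT_PATH"],
        logfire_token=env["LOGFIRE_TOKEN"],
    )
```

Calling `load_settings()` at startup surfaces a missing key or a non-`gs://` path as a single clear error instead of a failure partway through processing.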
.
├── main.py # The heart of the application
├── requirements.txt # Manage Python dependencies effortlessly
├── Makefile # Simplify build and run operations
└── .env # Securely store environment variables
Document (PDF) → Google Cloud Storage → Vision API OCR
→ Text Extraction → PortKey AI Processing → Structured JSON Output
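The flow above can be sketched as two stages chained together, with the result serialized to JSON. This is only an illustration of the data flow: `run_pipeline`, `fake_ocr`, and `fake_analyze` are hypothetical names, the output fields are assumptions, and the stand-in stages replace the real Vision API and PortKey AI calls so the example runs without cloud credentials.

```python
import json
from typing import Callable


def run_pipeline(
    input_uri: str,
    ocr: Callable[[str], str],
    analyze: Callable[[str], dict],
) -> str:
    """Run OCR, then analysis, and return the result as a JSON string."""
    text = ocr(input_uri)      # Vision API OCR and text extraction stage
    analysis = analyze(text)   # PortKey AI processing stage
    result = {"source": input_uri, "text": text, "analysis": analysis}
    return json.dumps(result, indent=2)


# Stand-in stages (assumptions) used in place of the real services.
def fake_ocr(uri: str) -> str:
    return f"text extracted from {uri}"


def fake_analyze(text: str) -> dict:
    return {"word_count": len(text.split())}


output = run_pipeline("gs://your-bucket/input.pdf", fake_ocr, fake_analyze)
```

Passing the stages in as callables keeps the flow testable: the same function can run with the real clients in production and with stand-ins in `make test`.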
- Code Formatting
make format
- Run Tests
make test
- Code Linting
make lint
- Google Cloud Vision Documentation
- PortKey AI Documentation
- Pydantic Documentation
- LogFire Documentation
MIT
