AI Document Processing Pipeline: Transform Your Document Workflow with Intelligent OCR and AI.

Elevate Your Document Processing with Cutting-Edge Technology

Unlock the power of your documents with our Python-based AI Document Processing Pipeline. Combining the precision of Google Cloud Vision OCR with the intelligence of PortKey AI, this system delivers accurate text extraction and advanced analysis, transforming unstructured data into actionable insights.

Key Features

High-Accuracy OCR: Use the Google Cloud Vision API to extract text from PDFs, images, and other supported document formats.
Intelligent Text Analysis: Use PortKey AI for prompt management, experimentation with multiple LLMs, and cost analysis to optimize AI-driven workflows.
Scalable and Flexible: Designed to handle diverse document types and volumes, ensuring seamless integration into your workflow.
Robust Data Validation: Pydantic ensures data integrity and type safety, minimizing errors and inconsistencies.
Comprehensive Logging and Monitoring: LogFire provides real-time insights, error tracking, and performance monitoring.

How It Works

Upload Your Documents: Place your PDFs in Google Cloud Storage. The pipeline can also be extended to accept uploads via an API.
Automated OCR Processing: The pipeline retrieves documents and performs OCR using Google Cloud Vision API.
Advanced Text Analysis: PortKey AI processes the extracted text using LLMs and dynamic prompts to generate insights that support data-driven decisions.
Structured JSON Output: Receive the processed data in a clean, structured JSON format, ready for further use.
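
The four steps above can be sketched as a small function that chains the stages together. This is only an illustration of the flow, not the code in main.py: `ProcessedDocument`, `run_pipeline`, `fake_ocr`, and `fake_analyze` are hypothetical names, and the two `fake_*` functions stand in for the real Google Cloud Vision and PortKey AI calls so the example runs without credentials.

```python
import json
from dataclasses import asdict, dataclass
from typing import Callable


@dataclass
class ProcessedDocument:
    """Hypothetical output record; the real schema is defined by the project."""
    source_uri: str
    extracted_text: str
    analysis: dict


def run_pipeline(
    source_uri: str,
    ocr: Callable[[str], str],
    analyze: Callable[[str], dict],
) -> str:
    """Run OCR, then analysis, and return the result as a JSON string."""
    text = ocr(source_uri)        # step 2: OCR on the stored document
    analysis = analyze(text)      # step 3: LLM analysis of the extracted text
    record = ProcessedDocument(source_uri, text, analysis)
    return json.dumps(asdict(record), indent=2)  # step 4: structured JSON


# Placeholders for the real Vision API and PortKey AI calls (assumptions).
def fake_ocr(uri: str) -> str:
    return f"text extracted from {uri}"


def fake_analyze(text: str) -> dict:
    return {"word_count": len(text.split())}


if __name__ == "__main__":
    print(run_pipeline("gs://your-bucket/input.pdf", fake_ocr, fake_analyze))
```

Passing the OCR and analysis stages in as callables keeps each stage replaceable, which makes it easy to test the flow locally before wiring in the real services.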

Get Started

Setup and Configuration

Install Dependencies
```
make install
```
Configure Google Cloud
- Enable Google Cloud Vision API.
- Set up a service account and download credentials.
- Create a storage bucket for your documents.
```
export GOOGLE_APPLICATION_CREDENTIALS="path/to/your/credentials.json"
```
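
Before running the pipeline, it can help to confirm that the variable points at a readable service account key. The following is a minimal preflight sketch, not part of the project: `check_credentials` is a hypothetical helper, and it only inspects the JSON file locally without contacting Google Cloud.

```python
import json
import os
from pathlib import Path
from typing import Mapping


def check_credentials(env: Mapping[str, str] = os.environ) -> str:
    """Return the service account email if the key file looks valid."""
    path = env.get("GOOGLE_APPLICATION_CREDENTIALS")
    if not path:
        raise RuntimeError("GOOGLE_APPLICATION_CREDENTIALS is not set")

    key_file = Path(path)
    if not key_file.is_file():
        raise RuntimeError(f"credentials file not found: {path}")

    data = json.loads(key_file.read_text(encoding="utf-8"))
    # Service account key files carry a "type" field with this value.
    if data.get("type") != "service_account":
        raise RuntimeError("expected a service account key file")
    return data.get("client_email", "")


if __name__ == "__main__":
    print("Using service account:", check_credentials())
```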

Set Environment Variables

Create a .env file with your API keys and paths:

PORTKEY_API_KEY=your_portkey_api_key
INPUT_DOCUMENT_PATH=gs://your-bucket/input.pdf
OUTPUT_DOCUMENT_PATH=gs://your-bucket/output/
LOGFIRE_TOKEN=your_logfire_token
PORTKEY_PROMPT_ID=your_prompt_id
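
As an illustration of how these values might be read and checked at startup, here is a minimal sketch using only the standard library. It assumes the variables from .env have already been exported into the process environment (how the project loads the file is not shown here), and `Settings`, `load_settings`, and `split_gcs_uri` are hypothetical names rather than functions from main.py.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Tuple

REQUIRED_VARS = (
    "PORTKEY_API_KEY",
    "INPUT_DOCUMENT_PATH",
    "OUTPUT_DOCUMENT_PATH",
    "LOGFIRE_TOKEN",
    "PORTKEY_PROMPT_ID",
)


@dataclass(frozen=True)
class Settings:
    portkey_api_key: str
    input_document_path: str
    output_document_path: str
    logfire_token: str
    portkey_prompt_id: str


def split_gcs_uri(uri: str) -> Tuple[str, str]:
    """Split 'gs://bucket/path/to/object' into (bucket, object_path)."""
    if not uri.startswith("gs://"):
        raise ValueError(f"not a gs:// URI: {uri!r}")
    bucket, _, blob = uri[len("gs://"):].partition("/")
    if not bucket:
        raise ValueError(f"missing bucket name in {uri!r}")
    return bucket, blob


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Fail fast if any variable is missing or a path is not a gs:// URI."""
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError("missing environment variables: " + ", ".join(missing))
    for name in ("INPUT_DOCUMENT_PATH", "OUTPUT_DOCUMENT_PATH"):
        split_gcs_uri(env[name])  # raises ValueError on a malformed path
    return Settings(**{name.lower(): env[name] for name in REQUIRED_VARS})
```

Checking the configuration once at startup surfaces a missing key or malformed bucket path before any OCR or LLM calls are made.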

Run the Pipeline

Upload Your Document

gsutil cp your_document.pdf gs://your-bucket/

Execute the Processing
```
make run
```

Project Structure

.
├── main.py                 # The heart of the application
├── requirements.txt        # Manage Python dependencies effortlessly
├── Makefile                # Simplify build and run operations
└── .env                    # Securely store environment variables

Flow Diagram

Document (PDF) → Google Cloud Storage → Vision API OCR 
    → Text Extraction → PortKey AI Processing → Structured JSON Output
