A well-organized project that parses invoices from PDFs and images and lets you chat with them using an LLM.

Amanbig/chatInvoice

Complete Invoice Chatbot

Features

  • 📄 Parse invoices from PDF files and images (PNG, JPG, etc.)
  • 🔍 Extract data like vendor, invoice number, date, due date, total
  • 💬 Chat with invoices using natural language questions
  • 🤖 LLM integration with OpenAI, Anthropic, and other providers
  • 💾 Save/load invoice data to/from JSON files
  • 📁 Batch processing of multiple invoice files

Quick Start

  1. Install dependencies:

    pip install -r config/requirements.txt
  2. Set up API key:

    cp config/.env.example .env
    # Edit .env and add your OpenAI API key
  3. Run the chatbot:

    # Basic usage with sample data
    python main.py
    
    # Parse a single invoice file
    python main.py --parse-file invoice.pdf
    
    # Parse all invoices from a directory
    python main.py --parse-dir invoices/

Parsing Invoices

The chatbot can parse invoices from PDF files and images, either from the command line or from inside a chat session.

Interactive Commands

# In the chatbot, use these commands:
parse invoice.pdf              # Parse a PDF file
parse invoice.png              # Parse an image file
parse-dir invoices/            # Parse all files in directory
summary                        # Show all loaded invoices
save                           # Save invoices to JSON
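
As a rough illustration of how a chat loop might tell these commands apart from free-form questions, here is a minimal sketch. The function name `split_command` and the "ask" fallback are assumptions made for this example, not the project's actual code.

```python
from typing import Optional, Tuple

# Commands named in this README; anything else is treated as a question.
COMMANDS = {"parse", "parse-dir", "summary", "save"}


def split_command(line: str) -> Tuple[str, Optional[str]]:
    """Return (command, argument); free-form text maps to ("ask", text)."""
    text = line.strip()
    head, _, rest = text.partition(" ")
    if head.lower() in COMMANDS:
        # Commands without an argument (summary, save) yield None.
        return head.lower(), rest.strip() or None
    return "ask", text
```

With this sketch, `split_command("parse invoice.pdf")` yields the command name plus the file path, while a question is passed through unchanged for the LLM to answer.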

Supported File Types

  • PDF files: .pdf
  • Image files: .png, .jpg, .jpeg, .tiff, .bmp
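
The extension lists above could drive file routing and batch discovery roughly as follows. This is a sketch under assumptions: `classify_invoice_file` and `list_supported_files` are hypothetical helper names, and the real parser may detect file types differently.

```python
from pathlib import Path
from typing import List

# Extensions taken from the lists above.
PDF_EXTENSIONS = {".pdf"}
IMAGE_EXTENSIONS = {".png", ".jpg", ".jpeg", ".tiff", ".bmp"}


def classify_invoice_file(path: str) -> str:
    """Return "pdf", "image", or "unsupported" based on the file extension."""
    suffix = Path(path).suffix.lower()
    if suffix in PDF_EXTENSIONS:
        return "pdf"
    if suffix in IMAGE_EXTENSIONS:
        return "image"
    return "unsupported"


def list_supported_files(directory: str) -> List[Path]:
    """Collect supported files directly inside a directory, sorted by path."""
    return sorted(
        p for p in Path(directory).iterdir()
        if p.is_file() and classify_invoice_file(p.name) != "unsupported"
    )
```

Matching on the lowercased suffix means `INVOICE.PDF` and `invoice.pdf` are treated the same way.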

Extracted Fields

  • Vendor name
  • Invoice number
  • Invoice date
  • Due date
  • Total amount
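
To make the field list concrete, the sketch below pulls these five fields out of already-extracted invoice text with regular expressions. Everything here is an assumption for illustration: the `Invoice` dataclass, the label patterns, and the `extract_fields` name are hypothetical, and the project's own parser may use a different approach.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass
class Invoice:
    vendor: Optional[str] = None
    invoice_number: Optional[str] = None
    invoice_date: Optional[str] = None
    due_date: Optional[str] = None
    total: Optional[float] = None


# Hypothetical label patterns; real invoices vary widely in layout.
PATTERNS = {
    "vendor": r"Vendor:\s*(.+)",
    "invoice_number": r"Invoice\s*(?:#|No\.?|Number):?\s*([A-Z0-9-]+)",
    "invoice_date": r"Invoice Date:\s*(.+)",
    "due_date": r"Due Date:\s*(.+)",
    "total": r"Total(?: Amount)?:\s*\$?([\d,]+\.\d{2})",
}


def extract_fields(text: str) -> Invoice:
    """Fill an Invoice from labelled lines; missing fields stay None."""
    invoice = Invoice()
    for name, pattern in PATTERNS.items():
        match = re.search(pattern, text, flags=re.IGNORECASE)
        if not match:
            continue
        value = match.group(1).strip()
        if name == "total":
            # Drop thousands separators before converting to a number.
            setattr(invoice, name, float(value.replace(",", "")))
        else:
            setattr(invoice, name, value)
    return invoice
```

Dates are kept as strings in this sketch; converting them to date objects would be a natural next step before answering due-date questions.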

🤖 LLM Providers

The chatbot supports multiple LLM providers:

Cloud Providers:

  • OpenAI - GPT-4, GPT-3.5-turbo (recommended)
  • Anthropic - Claude-3 Sonnet, Haiku
  • Google Gemini - Gemini Pro, Gemini Pro Vision
  • Mistral AI - Mistral Medium, Small, Tiny
  • Cohere - Command, Command Light

Local Providers:

  • Ollama - Llama2, Mistral, CodeLlama (runs locally)

Provider Management:

# Check provider status
providers

# Switch providers during chat
switch gemini
switch anthropic
switch ollama

# Install additional providers
python install_providers.py

Starting with Different Providers:

python main.py                    # Auto-detect provider
python main.py --provider gemini  # Use Google Gemini
python main.py --provider ollama  # Use local Ollama

Example Usage

The chatbot comes with 3 sample invoices and can answer questions like:

Q: How many invoices are due in the next 7 days?

A: 2 invoices are due in the next 7 days:
• Amazon (due September 5, 2025, $2,450.00)
• Microsoft (due September 10, 2025, $3,100.00)

Q: What is the total value of the invoice from Amazon?

A: The total value of the invoice from Amazon is $2,450.00. This invoice 
(INV-0012) is for AWS Cloud Services and is due on September 5, 2025.

Q: List all vendors with invoices > $2000

A: Vendors with invoices above $2,000.00:
• Amazon ($2,450.00)
• Microsoft ($3,100.00)

Sample Data

The system includes three sample invoices:

Vendor      Invoice #   Amount      Due Date
Amazon      INV-0012    $2,450.00   Sept 5, 2025
Microsoft   INV-0043    $3,100.00   Sept 10, 2025
Google      INV-0087    $1,850.00   Sept 15, 2025
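
The example answers shown earlier can be checked against this sample data with plain Python. The sketch below is only illustrative: the dictionary keys and helper names are assumptions, and in the actual chatbot the LLM produces the answers. The `today` argument is explicit because "due in the next 7 days" depends on the current date.

```python
from datetime import date, timedelta

# Sample data from the table above; the key names are assumptions.
SAMPLE_INVOICES = [
    {"vendor": "Amazon", "invoice_number": "INV-0012",
     "total": 2450.00, "due_date": date(2025, 9, 5)},
    {"vendor": "Microsoft", "invoice_number": "INV-0043",
     "total": 3100.00, "due_date": date(2025, 9, 10)},
    {"vendor": "Google", "invoice_number": "INV-0087",
     "total": 1850.00, "due_date": date(2025, 9, 15)},
]


def due_within(invoices, days, today):
    """Invoices due between today and today + days, inclusive."""
    cutoff = today + timedelta(days=days)
    return [inv for inv in invoices if today <= inv["due_date"] <= cutoff]


def vendor_total(invoices, vendor):
    """Sum of totals for one vendor, matched case-insensitively."""
    return sum(inv["total"] for inv in invoices
               if inv["vendor"].lower() == vendor.lower())


def vendors_over(invoices, threshold):
    """Sorted vendor names with at least one invoice above the threshold."""
    return sorted({inv["vendor"] for inv in invoices if inv["total"] > threshold})


def pay_first(invoices):
    """The invoice with the earliest due date."""
    return min(invoices, key=lambda inv: inv["due_date"])
```

Calling `due_within(SAMPLE_INVOICES, 7, date(2025, 9, 3))` selects the Amazon and Microsoft invoices, which matches the first example answer if the question was asked on September 3, 2025 (an assumed date).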

Project Structure

├── main.py                     # 🚀 Main entry point - run this file
├── src/                        # Source code modules
│   ├── __init__.py
│   ├── invoice_chatbot.py      # Main chatbot logic
│   ├── invoice_parser.py       # PDF/image parsing
│   └── llm_provider.py         # LLM integration
├── config/                     # Configuration files
│   ├── requirements.txt        # Python dependencies
│   └── .env.example            # Environment variables template
├── docs/                       # Documentation
│   ├── DELIVERABLES.md         # Project deliverables
│   └── invoice_chatbot_demo.ipynb  # Jupyter demo
├── sample_invoices/            # Sample invoice files (created when needed)
├── invoices.json               # Parsed invoice data (created automatically)
└── README.md                   # This file

Adding Your Own Invoices

Edit invoices.json to add more invoices, or use the parsing commands:

# Parse files interactively
python main.py
> parse invoice.pdf
> parse-dir invoices/

# Or via command line
python main.py --parse-file invoice.pdf
python main.py --parse-dir invoices/
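
For the manual route, a small helper can append a record to invoices.json instead of editing the file by hand. This is a sketch that assumes the file holds a top-level JSON list of invoice objects with an `invoice_number` key; the real schema is not shown in this README, so check an existing invoices.json before relying on it. `add_invoice` is a hypothetical name.

```python
import json
from pathlib import Path


def add_invoice(path: str, invoice: dict) -> list:
    """Append one invoice record to a JSON file holding a list of invoices."""
    file = Path(path)
    invoices = json.loads(file.read_text(encoding="utf-8")) if file.exists() else []
    # Skip records whose invoice number is already present.
    if any(existing.get("invoice_number") == invoice.get("invoice_number")
           for existing in invoices):
        return invoices
    invoices.append(invoice)
    file.write_text(json.dumps(invoices, indent=2), encoding="utf-8")
    return invoices
```

Skipping records with a duplicate invoice number keeps repeated runs from inflating the file.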

Supported Query Types

  • Due dates: "How many invoices are due in X days?"
  • Vendor totals: "What's the total from [vendor]?"
  • Amount filtering: "Which vendors have invoices over $X?"
  • General info: "Tell me about the [vendor] invoice"
  • Prioritization: "Which invoice should I pay first?"

Requirements

  • Python 3.7+
  • An API key for a cloud provider such as OpenAI or Anthropic (not needed when using local Ollama)
  • Internet connection for cloud LLM API calls

Project Structure

├── invoice_chatbot.py          # Main chatbot script
├── invoice_parser.py           # Invoice parsing utilities
├── invoice_chatbot_demo.ipynb  # Jupyter notebook demo
├── sample_invoices.json        # Sample invoice data
├── .env.example                # Environment variables template
├── requirements.txt            # Python dependencies
└── README.md                   # This file

API Keys

Get your API key from one of the cloud providers listed under LLM Providers.

The chatbot will automatically detect which provider to use based on your API key.
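
Auto-detection of this kind is often done by checking which environment variables are set. The sketch below shows one possible approach and is not the project's actual logic: `OPENAI_API_KEY` and `ANTHROPIC_API_KEY` appear in this README, while the other variable names, the priority order, and the `detect_provider` name are assumptions.

```python
import os
from typing import Mapping, Optional

# OPENAI_API_KEY and ANTHROPIC_API_KEY appear in this README; the remaining
# variable names are assumptions made for illustration.
PROVIDER_ENV_VARS = [
    ("openai", "OPENAI_API_KEY"),
    ("anthropic", "ANTHROPIC_API_KEY"),
    ("gemini", "GOOGLE_API_KEY"),
    ("mistral", "MISTRAL_API_KEY"),
    ("cohere", "COHERE_API_KEY"),
]


def detect_provider(env: Optional[Mapping[str, str]] = None) -> Optional[str]:
    """Return the first provider whose API key variable is set, else None."""
    env = os.environ if env is None else env
    for provider, variable in PROVIDER_ENV_VARS:
        if env.get(variable, "").strip():
            return provider
    return None
```

Passing a plain dictionary as `env` makes the function easy to test without touching the real environment; a `None` result corresponds to the "No LLM provider available" case.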

Troubleshooting

No LLM provider available: Make sure you've set either OPENAI_API_KEY or ANTHROPIC_API_KEY in your .env file.

Import errors: Run pip install -r config/requirements.txt to install all dependencies.

API errors: Check that your API key is valid and you have sufficient credits.
