
Admonstrator/paperless-ai-patched


📄 Paperless-AI Patched


⚠️ Community Integration Fork | All credit goes to clusterzx for the original Paperless-AI project.


Paperless-AI is an AI-powered extension for Paperless-ngx created by clusterzx that brings automatic document classification, smart tagging, and semantic search using OpenAI-compatible APIs and Ollama.

🔧 About This Fork

This is a community-maintained integration fork that:

  • 🧪 Tests and merges pending upstream pull requests
  • 📦 Provides optimized Docker images (Lite & Full variants)
  • 🔒 Applies security updates and dependency maintenance
  • 🐛 Integrates community bug fixes
  • 📝 Offers additional documentation

Important: This fork exists purely for experimentation and integration testing. All development credit belongs to the original author. Think of this as a "tinkering workshop" where community fixes are tested before potentially flowing back upstream.

Want the official version? → clusterzx/paperless-ai

🆕 What's Different in This Fork?

Performance Enhancements:

  • ⚡ PERF-001: History table with SQL pagination (~25-50x faster with 1000+ documents)
  • 🎯 Tag caching with 5-minute TTL (95% reduction in API calls)
  • 🔄 Force reload button to bypass cache when needed
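
The tag caching and force reload items above can be pictured with a small time-to-live cache. This is a minimal sketch only: the class name `TtlCache`, the `loader` callback, and the injectable clock are hypothetical and are not taken from the fork's source code.

```javascript
// Minimal TTL cache sketch (hypothetical names, not the fork's actual code).
class TtlCache {
  constructor(ttlMs, now = () => Date.now()) {
    this.ttlMs = ttlMs;
    this.now = now; // injectable clock, which makes the cache easy to test
    this.value = undefined;
    this.expiresAt = 0; // 0 means "nothing cached yet"
  }

  // Returns the cached value while it is fresh. Calls loader() when the
  // entry is missing, expired, or a force reload is requested.
  get(loader, { force = false } = {}) {
    if (!force && this.now() < this.expiresAt) {
      return this.value;
    }
    this.value = loader();
    this.expiresAt = this.now() + this.ttlMs;
    return this.value;
  }
}

// Five minutes, matching the TTL mentioned in the list above.
const FIVE_MINUTES_MS = 5 * 60 * 1000;
const tagCache = new TtlCache(FIVE_MINUTES_MS);
```

If `loader` returns a promise (for example, a request to the Paperless-ngx tags API), the promise itself is cached, so concurrent callers share one in-flight request. Passing `{ force: true }` corresponds to the force reload button.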

Bug Fixes:

  • ✅ PR-772: Fixed infinite retry loop with exponential backoff
  • ✅ PR-747: History validation tool with real-time progress indicators
  • ✅ SSE buffering fix for instant progress feedback
  • ✅ Security: Authentication added to all history endpoints
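
To illustrate the idea behind the PR-772 item, here is a generic sketch of bounded retries with exponential backoff. It is an assumption about the general technique, not the code from the pull request; the function names, the default delays, and the retry limit are all hypothetical.

```javascript
// Promise-based sleep helper.
const defaultSleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Delay before retry number `attempt` (0-based): baseMs * 2^attempt, capped at maxMs.
function backoffDelayMs(attempt, baseMs = 1000, maxMs = 30000) {
  return Math.min(maxMs, baseMs * 2 ** attempt);
}

// Runs `task` until it succeeds or `maxRetries` retries are used up.
// The bounded loop is what prevents an infinite retry cycle.
async function retryWithBackoff(task, options = {}) {
  const { maxRetries = 3, baseMs = 1000, maxMs = 30000, sleep = defaultSleep } = options;
  let lastError;
  for (let attempt = 0; attempt <= maxRetries; attempt += 1) {
    try {
      return await task(attempt);
    } catch (err) {
      lastError = err;
      if (attempt < maxRetries) {
        await sleep(backoffDelayMs(attempt, baseMs, maxMs));
      }
    }
  }
  throw lastError; // give up and surface the last failure to the caller
}
```

With the defaults shown, the waits between attempts are 1 s, 2 s, and 4 s, after which the last error is rethrown instead of retrying forever.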

Docker Optimizations:

  • 📦 DOCKER-001: Optimized images (Lite ~400MB, Full ~1.2GB)
  • 🏗️ Separate build/push workflow for reliability
  • 🔧 Multi-stage builds with clean dependency installation

Documentation:

  • 📚 Comprehensive COPILOT.md for developers and AI assistants
  • 📋 Detailed fix documentation in Included_Fixes/
  • 🔗 Swagger API documentation improvements

UI/UX:

  • 🌓 Enhanced dark mode support
  • 📱 Responsive mobile menu
  • ⏱️ Real-time progress bars with step counts
  • 🎨 Improved loading indicators

📋 Integrated Fixes

All fixes are documented in Included_Fixes/ with detailed implementation notes:

| Category     | Fix ID     | Description                  | Status     |
|--------------|------------|------------------------------|------------|
| Upstream PRs | PR-772     | Fix infinite retry loop      | ✅ Merged  |
| Upstream PRs | PR-747     | History validation tool      | ✅ Merged  |
| Performance  | PERF-001   | SQL pagination & tag caching | ✅ Applied |
| Security     | SEC-001    | SSRF & code injection fixes  | ✅ Applied |
| Docker       | DOCKER-001 | Optimized Docker images      | ✅ Applied |
| Dependencies | DEP-001    | Remove unused sqlite3        | ✅ Applied |
| CI/CD        | CI-001     | Automatic version tagging    | ✅ Applied |

Full details: Included_Fixes/README.md
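
As a rough illustration of what SQL pagination (PERF-001) means in practice, the sketch below builds a parameterized `LIMIT`/`OFFSET` query so that the database returns one page of rows instead of the whole history table. The table name, column names, and page-size cap are hypothetical; the fork's real schema and query are documented in Included_Fixes/.

```javascript
// Builds a parameterized query for one page of history rows.
// The "history" table and its columns are assumptions for illustration only.
function buildHistoryPageQuery(page, pageSize) {
  // Clamp inputs so bad query-string values cannot request huge or negative pages.
  const size = Math.min(Math.max(1, Math.trunc(pageSize) || 1), 100);
  const pageNumber = Math.max(1, Math.trunc(page) || 1);
  return {
    sql:
      'SELECT id, title, created_at FROM history ' +
      'ORDER BY created_at DESC, id DESC LIMIT ? OFFSET ?',
    params: [size, (pageNumber - 1) * size],
  };
}

// Example: third page with 25 rows per page.
const { sql, params } = buildHistoryPageQuery(3, 25);
console.log(sql, params);
```

The `?` placeholders keep user-supplied values out of the SQL string, and the secondary sort on `id` keeps page boundaries stable when several rows share a timestamp.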


Paperless-AI enables fully automated document workflows, contextual chat, and powerful customization, all via an intuitive web interface.

💡 Just ask:
“When did I sign my rental agreement?”
“What was the amount of the last electricity bill?”
“Which documents mention my health insurance?”

Powered by Retrieval-Augmented Generation (RAG), you can now search semantically across your full archive and get precise, natural language answers.
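
This README does not describe how the RAG index works internally, so the following is only a generic sketch of the retrieval step in Retrieval-Augmented Generation: document chunks and the question are represented as embedding vectors, and the chunks most similar to the question are handed to the language model as context. The function names and the tiny two-dimensional vectors are hypothetical; a real setup would use an embedding model and a vector store.

```javascript
// Cosine similarity between two equal-length numeric vectors.
function cosineSimilarity(a, b) {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i += 1) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0; // avoid dividing by zero
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Returns the k chunks whose vectors are most similar to the query vector.
function topKChunks(queryVector, chunks, k = 3) {
  return chunks
    .map((chunk) => ({ ...chunk, score: cosineSimilarity(queryVector, chunk.vector) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k);
}

// Toy data: in practice the vectors come from an embedding model.
const chunks = [
  { id: 'rental-agreement', vector: [1, 0] },
  { id: 'electricity-bill', vector: [0, 1] },
  { id: 'insurance-letter', vector: [1, 1] },
];
const best = topKChunks([1, 0], chunks, 2).map((c) => c.id);
console.log(best);
```

The selected chunks would then be inserted into the prompt so that the model answers from your own documents rather than from general knowledge.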


✨ Features

🔄 Automated Document Processing

  • Detects new documents in Paperless-ngx automatically
  • Analyzes content using OpenAI API, Ollama, and other compatible backends
  • Assigns title, tags, document type, and correspondent
  • Built-in support for:
    • Ollama (Mistral, Llama, Phi-3, Gemma-2)
    • OpenAI
    • DeepSeek.ai
    • OpenRouter.ai
    • Perplexity.ai
    • Together.ai
    • LiteLLM
    • vLLM
    • FastChat
    • Gemini (Google)
    • ...and more!

🧠 RAG-Based AI Chat

  • Natural language document search and Q&A
  • Understands full document context (not just keywords)
  • Semantic memory powered by your own data
  • Fast, intelligent, privacy-friendly document queries

⚙️ Manual Processing

  • Web interface for manual AI tagging
  • Useful when reviewing sensitive documents
  • Accessible via /manual

🧩 Smart Tagging & Rules

  • Define rules to limit which documents are processed
  • Disable prompts and apply tags automatically
  • Set custom output tags for tracked classification
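
One way to picture the rules above is a simple tag filter plus an output tag. The rule shape (`requiredTags`), the function names, and the `ai-processed` tag are hypothetical examples; the actual rules are configured through the web interface.

```javascript
// Decides whether a document should be processed under a rule.
// Hypothetical rule shape: process only documents that carry at least one
// of rule.requiredTags. An empty or missing list means "process everything".
function shouldProcess(document, rule) {
  if (!rule.requiredTags || rule.requiredTags.length === 0) return true;
  const wanted = new Set(rule.requiredTags.map((tag) => tag.toLowerCase()));
  return document.tags.some((tag) => wanted.has(tag.toLowerCase()));
}

// Adds an output tag once, so processed documents can be tracked later.
function withOutputTag(tags, outputTag) {
  return tags.includes(outputTag) ? tags : [...tags, outputTag];
}

const rule = { requiredTags: ['inbox'] };
const doc = { id: 42, tags: ['Inbox', '2024'] };
if (shouldProcess(doc, rule)) {
  console.log(withOutputTag(doc.tags, 'ai-processed'));
}
```

Matching is case-insensitive in this sketch, which is a design choice for the example rather than documented behavior.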

🚀 Installation

⚠️ First-time install: Restart the container after completing setup (API keys, preferences) to build the RAG index.
🔁 Not required for updates.

📘 Installation Wiki

📦 Container Images

Images are available on both GitHub Container Registry (GHCR) and Docker Hub:

Recommended: GitHub Container Registry (GHCR)

# Lite version (~500-700 MB) - AI tagging only
docker pull ghcr.io/admonstrator/paperless-ai-patched:latest

# Full version (~1.5-2 GB) - AI tagging + RAG search
docker pull ghcr.io/admonstrator/paperless-ai-patched:latest-full

Benefits of GHCR:

  • ✅ No rate limits
  • ✅ Unlimited bandwidth
  • ✅ Free for public repositories
  • ✅ Integrated with GitHub Actions

Alternative: Docker Hub

# Lite version
docker pull admonstrator/paperless-ai-patched:latest

# Full version
docker pull admonstrator/paperless-ai-patched:latest-full

🐳 Docker Support

  • Multi-stage optimized builds for smaller image sizes
  • Health monitoring and auto-restart
  • Persistent volumes and graceful shutdown
  • Works out of the box with minimal setup
  • Multi-arch support (amd64, arm64)

πŸ”§ Local Development

# Install dependencies
npm install

# Start development/test mode
npm run test

🧭 Roadmap Highlights

  • βœ… Multi-AI model support
  • βœ… Multilingual document analysis
  • βœ… Tag rules and filters
  • βœ… Integrated document chat with RAG
  • βœ… Responsive web interface

🀝 Contributing

Note: This is an unofficial community fork. For core features and major changes, please contribute to the upstream project.

For this fork specifically:

  • 🐛 Bug reports for integration issues
  • 📦 Docker-related improvements
  • 📝 Documentation enhancements
  • 🧪 Testing feedback

Open an issue or PR if you have improvements to share!


🆘 Support & Community


📄 License

This project is licensed under the MIT License. See LICENSE for details.

Original work Copyright © clusterzx
Fork maintained by Admonstrator


🙏 Support the Original Developer

Patreon PayPal BuyMeACoffee Ko-Fi
