🤖 AI-powered document management system with RESTful API and web frontend for intelligent document upload, analysis, and processing. Features OpenAI integration, multi-format support, background jobs, and S3 storage.

michael-abdo/rapidocsai

RapidocsAI

Python 3.8+ FastAPI License: MIT CI

PostgreSQL Redis OpenAI Heroku

AI-powered document management system with RESTful API and web frontend for document upload, analysis, and management.

Features

  • 🔐 Secure Authentication - JWT-based user authentication
  • 📄 Multi-format Support - PDF, DOCX, CSV, Excel, JSON, TXT
  • 🤖 AI-Powered Processing - OpenAI integration for document analysis
  • 📦 Smart Fragmentation - Automatic chunking for large documents
  • ☁️ Cloud Storage - S3 integration for scalable file storage
  • 🔄 Background Processing - Asynchronous job queue with Redis
  • 📊 Admin Dashboard - Web interface for system management
  • 🏷️ Template System - Tag and organize documents by templates
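
This README does not describe how fragmentation works internally. As a rough illustration only, assuming fixed-size character chunks with a small overlap between neighbours, the idea might be sketched as follows. The function name `split_into_fragments` and its parameters are hypothetical and are not taken from the RapidocsAI codebase.

```python
from typing import List


def split_into_fragments(text: str, max_chars: int = 1000, overlap: int = 100) -> List[str]:
    """Split text into fragments of at most max_chars characters.

    Consecutive fragments share `overlap` characters so that content cut at a
    boundary still appears whole in one of the two neighbours.
    (Hypothetical helper: the real chunking strategy is not documented here.)
    """
    if max_chars <= 0:
        raise ValueError("max_chars must be positive")
    if not 0 <= overlap < max_chars:
        raise ValueError("overlap must satisfy 0 <= overlap < max_chars")

    fragments: List[str] = []
    step = max_chars - overlap  # distance between fragment start positions
    for start in range(0, len(text), step):
        fragments.append(text[start:start + max_chars])
        if start + max_chars >= len(text):
            break  # this fragment already reaches the end of the text
    return fragments
```

A token-based splitter would likely fit an OpenAI workflow better than a character count; characters are used here only to keep the sketch free of dependencies.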

Project Structure

📁 Where Files Go

IMPORTANT: Always create files in the correct directory:

  • /logs/ - ALL log files (*.log)
  • /data/ - Database files (*.db)
  • /docs/ - Documentation files (*.md, *.html)
  • /tests/ - Test files (test_*.py)
  • /uploads/ - User uploaded documents
  • /app/ - Application source code
  • /alembic/ - Database migrations
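
The placement rules above can be expressed as a small lookup, shown here as a sketch. `PLACEMENT_RULES` and `target_directory` are hypothetical names used for illustration; the project itself may not contain such a helper. Only the filename patterns listed above are covered, so uploads and source files return `None`.

```python
from fnmatch import fnmatch
from pathlib import Path
from typing import Optional

# (pattern, directory) pairs taken from the list above; the first match wins.
# Caveat: HTML templates for the web interface live under app/frontend/templates/,
# so the "*.html" rule only makes sense for documentation pages.
PLACEMENT_RULES = [
    ("test_*.py", "tests"),
    ("*.log", "logs"),
    ("*.db", "data"),
    ("*.md", "docs"),
    ("*.html", "docs"),
]


def target_directory(filename: str, project_root: Path = Path(".")) -> Optional[Path]:
    """Return the directory a new file belongs in, or None if no rule applies."""
    name = Path(filename).name  # ignore any directory part of the input
    for pattern, directory in PLACEMENT_RULES:
        if fnmatch(name, pattern):
            return project_root / directory
    return None
```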

📂 Directory Structure

rapidoc_021891240361152688586/
├── app/                    # Main application code
│   ├── core/              # Core configuration
│   ├── models/            # Database models
│   ├── routes/            # API endpoints
│   ├── schemas/           # Request/response schemas
│   ├── services/          # Business logic
│   ├── utils/             # Utility functions
│   ├── frontend/          # Web interface
│   │   ├── templates/     # HTML templates
│   │   └── static/        # CSS/JS files
│   ├── scripts/           # Utility scripts
│   └── workers/           # Background workers
├── alembic/               # Database migrations
│   └── versions/          # Migration files
├── data/                  # Database files
├── docs/                  # Documentation
├── logs/                  # Log files
├── tests/                 # Test files
├── uploads/               # User uploads
└── venv/                  # Virtual environment

Installation

Prerequisites

  • Python 3.8 or higher
  • PostgreSQL (for production)
  • Redis (for background jobs)
  • AWS S3 account (optional, for cloud storage)

🚀 Quick Start

For detailed setup instructions, see our Development Setup Guide.

  1. Setup Environment

    python3 -m venv venv
    source venv/bin/activate
    pip install -r requirements.txt
  2. Run Development Server

    ./run_dev.sh                    # API only
    ./run_dev_with_worker.sh        # API + Worker (recommended)
  3. Access Application

📋 Essential Files (Root Directory)

  • requirements.txt - Python dependencies
  • sample.env - Environment variables template
  • run_dev.sh - Development server script
  • run_dev_with_worker.sh - Server + worker script
  • run_worker.sh - Worker process script
  • Procfile - Heroku deployment
  • alembic.ini - Database migration config

🔧 Configuration

  1. Copy sample.env to .env
  2. Set the required environment variables (the S3 and AWS values apply only when S3 storage is enabled):
    • DATABASE_URL - Database connection
    • JWT_SECRET_KEY - Authentication secret
    • OPENAI_API_KEY - OpenAI API key
    • USE_S3_STORAGE - Enable S3 storage (true/false)
    • S3_BUCKET_NAME - S3 bucket name
    • AWS_ACCESS_KEY_ID - AWS access key
    • AWS_SECRET_ACCESS_KEY - AWS secret key
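
A startup check for these variables might look like the sketch below. It assumes the S3-related values are needed only when `USE_S3_STORAGE` is `true`, which the README implies but does not state outright. The function `missing_settings` is a hypothetical helper, not part of the application, and it reads the process environment rather than parsing `.env` itself.

```python
import os
from typing import List, Mapping

# Variable names come from the list above.
REQUIRED = ["DATABASE_URL", "JWT_SECRET_KEY", "OPENAI_API_KEY"]
S3_REQUIRED = ["S3_BUCKET_NAME", "AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY"]


def missing_settings(env: Mapping[str, str] = os.environ) -> List[str]:
    """Return the names of required variables that are unset or empty."""
    # Assumption: S3 settings matter only when USE_S3_STORAGE is "true".
    use_s3 = env.get("USE_S3_STORAGE", "false").strip().lower() == "true"
    names = REQUIRED + (S3_REQUIRED if use_s3 else [])
    return [name for name in names if not env.get(name)]


if __name__ == "__main__":
    missing = missing_settings()
    if missing:
        raise SystemExit("Missing environment variables: " + ", ".join(missing))
    print("Configuration looks complete.")
```

Failing fast with the full list of missing names tends to be easier to debug than discovering each gap at the first request that needs it.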

🚢 Deployment

Deploy to Heroku:

heroku create your-app-name
heroku addons:create heroku-postgresql:mini
heroku addons:create heroku-redis:hobby-dev
git push heroku main
heroku run python -m alembic upgrade head
heroku ps:scale web=1 worker=1

📝 Key Features

  • User authentication (JWT)
  • Document upload/download
  • Multiple file formats (PDF, DOCX, CSV, Excel, JSON, TXT)
  • Document fragmentation for large files
  • Template tagging system
  • Background job processing
  • S3 storage support
  • Admin dashboard

🧪 Testing

Run tests:

python -m pytest tests/

📚 Additional Information

For detailed documentation, see the /docs/ directory:

  • API_DOCUMENTATION.md - API reference
  • DEPLOYMENT_GUIDE.md - Deployment instructions
  • WORKER_GUIDE.md - Background worker details

🛠️ Development

Code Style

  • Follow PEP 8 guidelines
  • Use type hints where appropriate
  • Write docstrings for all functions and classes
  • Keep functions small and focused
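
As an illustration of these guidelines, a small utility written in this style could look like the following. `human_readable_size` is a made-up example and does not come from the project's `app/utils/` package.

```python
def human_readable_size(num_bytes: int) -> str:
    """Format a byte count using binary units (B, KiB, MiB, GiB).

    Args:
        num_bytes: Size in bytes; must not be negative.

    Returns:
        The size with one decimal place and a unit suffix.

    Raises:
        ValueError: If num_bytes is negative.
    """
    if num_bytes < 0:
        raise ValueError("num_bytes must not be negative")

    units = ["B", "KiB", "MiB", "GiB"]
    size = float(num_bytes)
    index = 0
    # Move up one unit at a time until the value fits or units run out.
    while size >= 1024 and index < len(units) - 1:
        size /= 1024
        index += 1
    return f"{size:.1f} {units[index]}"
```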

Project Conventions

  • Always create log files in /logs/
  • Database files go in /data/
  • Documentation goes in /docs/
  • Test files go in /tests/
  • User uploads go in /uploads/

Contributing

We welcome contributions! Please see our Contributing Guidelines for details.

License

This project is licensed under the MIT License - see the LICENSE file for details.

Support

For questions or issues:
