Multi-Tenant RAG Platform with Async Operations
Production-ready, multi-tenant RAG orchestration platform with background task processing
LangOrch is a multi-tenant SaaS platform for Retrieval-Augmented Generation (RAG) with enterprise-grade features:
- Production-Ready v0.3.0: Async operations, smart caching, timeout-free processing
- Multi-Tenant Architecture: Complete data isolation per tenant
- Multi-Provider LLM: OpenAI, Anthropic, Ollama support via LiteLLM
- Vector Search: Qdrant integration for semantic document search
- Enterprise Security: HashiCorp Vault, JWT auth, tenant isolation
- Background Processing: No timeouts on long-running operations (10+ minutes)
- Summarize: Generate concise document summaries with smart caching
- Ask: Question-answering with RAG (vector search + LLM)
- Transform: Document transformation (translate, format, extract, etc.)
- Async Background Tasks: All LLM operations run in background with polling
- Smart Summary Caching: Reuse existing summaries, optional force regeneration
- Multi-Provider Embedding: OpenAI, Google Gemini, Anthropic Claude, Ollama
- Dynamic Embedding Dimensions: Support for different embedding models
- Tenant Configuration: Per-tenant LLM and embedding provider settings
- Document Management: Upload, process, chunk, and embed PDF/DOCX files
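The async background-task and smart-caching features listed above can be pictured with a minimal sketch. It is not LangOrch's actual implementation: the in-memory `OPERATIONS` and `SUMMARY_CACHE` dictionaries, the `fake_llm_summarize` stand-in, and the function names are all hypothetical, and a real deployment would presumably persist operations in PostgreSQL and call the LLM through LiteLLM. The sketch only shows the flow: a request returns an operation ID immediately, the work runs in the background, an existing summary is reused unless `force` is set, and the client polls until the operation finishes.

```python
import asyncio
import uuid
from dataclasses import dataclass
from typing import Optional


@dataclass
class Operation:
    """Tracks one background operation (hypothetical shape)."""
    status: str = "pending"        # pending -> running -> completed / failed
    result: Optional[str] = None
    cached: bool = False           # True when the result came from the cache


# In-memory stand-ins for persistent storage (assumption for this sketch).
OPERATIONS: dict[str, Operation] = {}   # operation_id -> Operation
SUMMARY_CACHE: dict[str, str] = {}      # document_id -> latest summary
_TASKS: set = set()                     # keeps task references alive


async def fake_llm_summarize(document_id: str) -> str:
    """Stand-in for a slow LLM call."""
    await asyncio.sleep(0.05)
    return f"summary of {document_id}"


async def _run_summarize(operation_id: str, document_id: str, force: bool) -> None:
    """Background worker: reuse the cached summary unless force is set."""
    op = OPERATIONS[operation_id]
    op.status = "running"
    try:
        if not force and document_id in SUMMARY_CACHE:
            op.result, op.cached = SUMMARY_CACHE[document_id], True
        else:
            op.result = await fake_llm_summarize(document_id)
            SUMMARY_CACHE[document_id] = op.result
        op.status = "completed"
    except Exception as exc:
        op.status, op.result = "failed", str(exc)


def submit_summarize(document_id: str, force: bool = False) -> str:
    """Register the operation, start it in the background, return its id at once.

    Must be called from inside a running event loop.
    """
    operation_id = str(uuid.uuid4())
    OPERATIONS[operation_id] = Operation()
    task = asyncio.create_task(_run_summarize(operation_id, document_id, force))
    _TASKS.add(task)
    task.add_done_callback(_TASKS.discard)
    return operation_id


async def poll(operation_id: str, interval: float = 0.01, timeout: float = 5.0) -> Operation:
    """Check the operation status repeatedly until it finishes or the deadline passes."""
    loop = asyncio.get_running_loop()
    deadline = loop.time() + timeout
    while loop.time() < deadline:
        op = OPERATIONS[operation_id]
        if op.status in ("completed", "failed"):
            return op
        await asyncio.sleep(interval)
    raise TimeoutError(f"operation {operation_id} did not finish in {timeout}s")


async def demo():
    """Summarize the same document three times: fresh, cached, forced."""
    first = await poll(submit_summarize("doc-1"))
    second = await poll(submit_summarize("doc-1"))
    third = await poll(submit_summarize("doc-1", force=True))
    return first, second, third
```

Starting from an empty cache, `asyncio.run(demo())` returns three completed operations, and only the second is served from the cache. The third regenerates the summary because `force=True` bypasses the cache lookup.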
- FastAPI - High-performance async web framework
- LiteLLM - Unified LLM API (OpenAI, Anthropic, Ollama)
- PostgreSQL 16+ - Primary database
- Qdrant - Vector database for semantic search
- Redis 7+ - Caching and session management
- HashiCorp Vault - Secure secret management
- SQLAlchemy + Alembic - ORM and migrations
- Pydantic - Data validation
- structlog - Structured logging
- Next.js 14 (App Router)
- React with TypeScript
- shadcn/ui + TailwindCSS
- Axios - API client
- Sonner - Toast notifications
- Docker & Docker Compose
- Nginx (optional reverse proxy)
- Docker & Docker Compose
- Python 3.11+
- Node.js 18+
- Git
# 1. Clone the repository
git clone <repository-url>
cd langorch
# 2. Create environment file
cp .env.example .env
# Edit .env with your settings
# 3. Start infrastructure services
docker-compose up -d
# 4. Backend setup
cd backend
python -m venv venv
source venv/bin/activate # Windows: venv\Scripts\activate
pip install -r requirements.txt
# Run database migrations
alembic upgrade head
# Start backend
uvicorn app.main:app --reload
# 5. Frontend setup (new terminal)
cd frontend
npm install
npm run dev
- Frontend: http://localhost:3000
- Backend API: http://localhost:8000
- API Docs: http://localhost:8000/docs
- Vault UI: http://localhost:8200 (Token: dev-root-token)
- Qdrant Dashboard: http://localhost:6333/dashboard
| Version | Status | Description | Release Date |
|---|---|---|---|
| v0.3.0 | Released | Async RAG operations with smart caching | 2026-01-08 |
| v0.4.0 | In Development | LangGraph multi-agent workflows, streaming | Q1 2026 |
| v1.0.0 | Planned | Production-ready, full observability | Q2 2026 |
What's New:
- Background task processing for all LLM operations (Summarize, Ask, Transform)
- Smart summary caching with force regeneration option
- Extended timeout support (10 minutes) for long operations
- Multi-provider embedding support (OpenAI, Gemini, Claude, Ollama)
- Dynamic embedding dimensions
- Latest summary retrieval endpoint
- Improved error handling and logging
Bug Fixes:
- Fixed transform operation timeout issue
- Fixed duplicate LLM operation records
- Improved polling mechanism
Planned Features:
- LangGraph integration for multi-agent workflows
- LangSmith observability and monitoring
- Streaming responses via Server-Sent Events (SSE)
- Advanced RAG: reranking, hybrid search, multi-query
- Conversation history and memory
- Agent-based architecture
Target Features:
- Complete observability stack (Prometheus, Grafana, LangSmith)
- Kubernetes deployment manifests
- Production-grade monitoring and alerting
- Performance optimizations
- Comprehensive documentation
- Security audit and hardening
┌─────────────────────────────────────────────────────────┐
│                     FRONTEND LAYER                      │
│                 Next.js 14 + shadcn/ui                  │
│         (Document UI, RAG Operations, Settings)         │
└─────────────────────────────────────────────────────────┘
                             ▼
┌─────────────────────────────────────────────────────────┐
│                      BACKEND LAYER                      │
│               FastAPI + Background Tasks                │
│  ┌───────────────────────────────────────────────────┐  │
│  │ Tenant Isolation (JWT + Middleware)               │  │
│  │ ├── Auth Service (JWT, Password Hashing)          │  │
│  │ ├── Document Service (Upload, Processing)         │  │
│  │ ├── Embedding Service (Multi-provider)            │  │
│  │ ├── RAG Service (Summarize, Ask, Transform)       │  │
│  │ └── LLM Service (LiteLLM Integration)             │  │
│  └───────────────────────────────────────────────────┘  │
└─────────────────────────────────────────────────────────┘
         │                   │                   │
         ▼                   ▼                   ▼
  ┌──────────────┐    ┌──────────────┐    ┌──────────────┐
  │  PostgreSQL  │    │    Redis     │    │    Qdrant    │
  │  (Main DB)   │    │  (Sessions)  │    │  (Vectors)   │
  └──────────────┘    └──────────────┘    └──────────────┘
                             │
                             ▼
                      ┌──────────────┐
                      │    Vault     │
                      │  (Secrets)   │
                      └──────────────┘
- JWT-based authentication
- Tenant-scoped database queries
- API-level tenant filtering
- Session isolation via Redis
- HashiCorp Vault for API keys
- Tenant-specific secret storage
- No secrets in code or .env files
- Automatic secret rotation support
- Encrypted connections (TLS/SSL ready)
- Secure password hashing (pwdlib with Argon2)
- Audit logging for critical operations
- GDPR-compliant data handling
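The tenant-scoped queries mentioned in the list above can be illustrated with a small sketch. It is an assumption-laden example, not the project's code: it uses the standard-library `sqlite3` module instead of PostgreSQL and SQLAlchemy so that it runs on its own, and the `documents` table, the `TenantScopedDocuments` class, and `make_db` are hypothetical names. In the real service the tenant ID would presumably come from the verified JWT in the middleware rather than from a constructor argument.

```python
import sqlite3
from typing import Optional


def make_db() -> sqlite3.Connection:
    """Create an in-memory database with a hypothetical documents table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE documents ("
        "id INTEGER PRIMARY KEY, tenant_id TEXT NOT NULL, title TEXT NOT NULL)"
    )
    return conn


class TenantScopedDocuments:
    """Repository bound to one tenant, so every query carries the tenant filter."""

    def __init__(self, conn: sqlite3.Connection, tenant_id: str) -> None:
        self._conn = conn
        self._tenant_id = tenant_id

    def add(self, title: str) -> int:
        """Insert a document owned by this tenant and return its id."""
        cur = self._conn.execute(
            "INSERT INTO documents (tenant_id, title) VALUES (?, ?)",
            (self._tenant_id, title),
        )
        return cur.lastrowid

    def list_titles(self) -> list:
        """Return only this tenant's document titles, oldest first."""
        rows = self._conn.execute(
            "SELECT title FROM documents WHERE tenant_id = ? ORDER BY id",
            (self._tenant_id,),
        )
        return [row[0] for row in rows]

    def get_title(self, document_id: int) -> Optional[str]:
        """Look up a document by id; another tenant's id yields None."""
        row = self._conn.execute(
            "SELECT title FROM documents WHERE id = ? AND tenant_id = ?",
            (document_id, self._tenant_id),
        ).fetchone()
        return row[0] if row else None
```

Binding the tenant ID once, at construction time, means individual query methods cannot forget the filter. A document ID that belongs to another tenant behaves exactly like a missing one, so a lookup does not reveal whether the ID exists elsewhere.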
POST /api/v1/llm/documents/summarize
{
  "document_id": "uuid",
  "model": "llama3.2",      # optional
  "max_length": 500,        # optional
  "force": false            # optional
}

POST /api/v1/llm/documents/ask
{
  "document_id": "uuid",
  "question": "What is this document about?",
  "model": "llama3.2",      # optional
  "max_chunks": 5           # optional
}

POST /api/v1/llm/documents/transform
{
  "document_id": "uuid",
  "instruction": "Translate to Turkish",
  "model": "llama3.2",      # optional
  "output_format": "text"   # text, markdown, json
}

All operations return immediately with an operation_id. Use polling to check status:
GET /api/v1/llm/operations/{operation_id}
# Backend tests
cd backend
pytest tests/ -v --cov=app
# Frontend tests
cd frontend
npm run test
npm run type-check
# Linting
black backend/app
isort backend/app
flake8 backend/app
langorch/
├── backend/
│   ├── app/
│   │   ├── main.py              # FastAPI application
│   │   ├── api/
│   │   │   └── v1/endpoints/    # API endpoints
│   │   ├── core/                # Config, database, vault
│   │   ├── models/              # SQLAlchemy models
│   │   ├── schemas/             # Pydantic schemas
│   │   └── services/            # Business logic
│   ├── alembic/                 # Database migrations
│   └── requirements.txt
├── frontend/
│   ├── app/                     # Next.js app router
│   ├── components/              # React components
│   ├── lib/                     # API client, utilities
│   └── package.json
├── docs/                        # Documentation
├── .github/                     # GitHub workflows
├── docker-compose.yml
├── VERSION                      # Current version
├── CHANGELOG.md                 # Version history
└── README.md
We follow Conventional Commits:
feat: add new feature
fix: bug fix
docs: documentation changes
refactor: code refactoring
test: adding or updating tests
chore: maintenance tasks
See Branching Strategy for details.
# Create feature branch
git checkout develop/v0.4
git checkout -b feature/my-feature
# Commit changes
git add .
git commit -m "feat: add amazing feature"
# Push and create PR
git push origin feature/my-feature
[License information to be added]
[Contact information to be added]
Built with these amazing open-source projects:
Current Status: v0.3.0 - Production ready for basic RAG operations
Next Up: v0.4.0 - LangGraph integration and streaming responses
For detailed development information, see Development Phases