A powerful NestJS-based backend application that enables users to upload PDF documents and interact with them using AI-powered question-answering capabilities. The system leverages Retrieval-Augmented Generation (RAG) to provide accurate, context-aware responses based on document content.
- PDF Document Processing: Upload and parse PDF documents with automatic text extraction
- AI-Powered Q&A: Ask questions about uploaded documents and receive intelligent answers
- Chats and Messages: Threaded conversations with full CRUD endpoints, chat message persistence, search, and context retention
- Asynchronous Processing: Fast, scalable document and chat message embedding using BullMQ with Redis queue backend
- Vector Search: Advanced semantic search using ChromaDB for document and message retrieval
- Multiple AI Providers: Support for both Google Gemini and Anthropic Claude models
- User Authentication: Secure JWT-based authentication system
- GCP Secret Manager Integration: Secure management of secrets and credentials for cloud and production deployments
- Real-time Streaming: Server-sent events for streaming AI responses
- Docker Support: Complete containerized deployment with Docker Compose
- Observability: Built-in OpenTelemetry integration with Zipkin tracing
The API now includes comprehensive support for threaded conversations (Chats) and chat messages (Messages) with full CRUD endpoints. All endpoints are documented with Swagger/OpenAPI.
- Chats
  - Create, list, retrieve, update, and delete chat threads
  - Ask questions about documents in a chat context (with streamed responses)
- Messages
  - Create and manage chat messages, with semantic search and embedding
  - Full message pagination and context retention
- Swagger/OpenAPI: Full API documentation is available at `/api` when the server is running.
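For example, asking a question in a chat thread and consuming the streamed answer might look like the sketch below. The route, payload shape, and `JWT_TOKEN` variable are assumptions for illustration; the authoritative contract is the Swagger UI at `/api`:

```ts
// Hypothetical route, payload, and JWT_TOKEN variable; check the Swagger UI
// at /api for the actual paths and schemas exposed by the server.
const res = await fetch('http://localhost:3000/chats/<chat-id>/questions', {
  method: 'POST',
  headers: {
    'Content-Type': 'application/json',
    Accept: 'text/event-stream',
    Authorization: `Bearer ${process.env.JWT_TOKEN}`,
  },
  body: JSON.stringify({ question: 'What does this document conclude?' }),
});

// Print server-sent event chunks as they stream in.
const decoder = new TextDecoder();
for await (const chunk of res.body!) {
  process.stdout.write(decoder.decode(chunk, { stream: true }));
}
```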
The application implements a modern RAG (Retrieval-Augmented Generation) architecture:
1. Document Upload - PDF files are uploaded via REST API
2. Text Extraction - Content is extracted using pdf-parse
3. Text Chunking - Documents are split into semantic chunks using LangChain
4. Embedding Generation - Text chunks are converted to vectors using AI models
5. Vector Storage - Embeddings are stored in ChromaDB for fast retrieval
6. Question Processing - User questions are embedded using the same AI model
7. Semantic Search - ChromaDB finds the most relevant document chunks
8. Context Construction - Retrieved chunks are combined with the question
9. AI Generation - AI model generates answers based on the provided context
10. Streaming Response - Answers are streamed back to the client in real-time
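To make the retrieval half concrete, here is a minimal sketch of steps 6-8 against the ChromaDB JavaScript client. The collection name and the `embed` callback are assumptions for illustration, not the project's actual code (which lives in `src/embedding/`):

```ts
import { ChromaClient } from 'chromadb';

// Sketch of question embedding, semantic search, and context construction.
async function buildContext(
  question: string,
  embed: (text: string) => Promise<number[]>, // illustrative stand-in for the provider call
): Promise<string> {
  const chroma = new ChromaClient({ path: 'http://localhost:8000' });
  const collection = await chroma.getOrCreateCollection({ name: 'documents' });

  // Embed the question with the same model used for the document chunks.
  const queryEmbedding = await embed(question);

  // Semantic search: pull the most relevant chunks from the vector store.
  const results = await collection.query({
    queryEmbeddings: [queryEmbedding],
    nResults: 4,
  });

  // Combine the retrieved chunks with the question for the generation step.
  const chunks = (results.documents?.[0] ?? [])
    .filter((doc): doc is string => doc !== null)
    .join('\n---\n');
  return `Context:\n${chunks}\n\nQuestion: ${question}`;
}
```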
- Framework: NestJS (Node.js)
- Runtime: Bun
- Database: PostgreSQL with TypeORM
- Vector Database: ChromaDB
- Queue & Worker: Redis + BullMQ (asynchronous tasks, message embedding)
- AI Models: Google Gemini, Anthropic Claude
- Authentication: JWT
- Secret Management: GCP Secret Manager
- Text Processing: LangChain, pdf-parse
- Containerization: Docker & Docker Compose
- Observability: OpenTelemetry, Zipkin
- Language: TypeScript
Clone the repository:

```bash
git clone <repository-url>
cd ai-document-backend
```

Copy the example environment file and configure your settings:

```bash
cp .env.example .env.development
```

Edit `.env.development` with your configuration:

```env
# App
APP_ENV=development

# PostgreSQL
DB_HOST=db
DB_USER=aidocument
DB_PASS=password
DB_NAME=aidocument

# Redis
REDIS_HOST=redis
REDIS_PORT=6379
REDIS_PASS=password

# AI_SERVICE_TYPE=anthropic
AI_SERVICE_TYPE=gemini
ANTHROPIC_API_KEY=your_anthropic_api_key_here
GEMINI_API_KEY=your_gemini_api_key_here
```

Install dependencies:

```bash
bun install
```

Development Mode (with hot reload):

```bash
bun run compose:dev
```

Production Mode:

```bash
bun run compose:prod
```

The application will be available at:
- API: http://localhost:3000
- ChromaDB: http://localhost:8000
- Zipkin Tracing: http://localhost:9411
```bash
# Development
bun run compose:down:dev

# Production
bun run compose:down:prod
```

```bash
# Unit tests
bun test

# End-to-end tests
bun run test:e2e

# Test coverage
bun run test:cov
```

```bash
# Lint code
bun run lint

# Format code
bun run format
```

- Start PostgreSQL and ChromaDB separately
- Update environment variables to point to local services
- Run the application:
```bash
bun run start:dev
```

```
src/
├── ai/              # AI service implementations (Gemini, Anthropic, etc.)
├── auth/            # Authentication and user management
├── chats/           # Chat session and question handling (CRUD, SSE)
├── documents/       # Document processing and Q&A
├── embedding/       # Document/message embedding and vector search
├── messages/        # Chat message CRUD, processing, BullMQ queue
├── secret-manager/  # GCP Secret Manager integration
├── users/           # User management
├── shared/          # Shared utilities and interceptors
├── app.module.ts    # Main application module
└── main.ts          # Application entry point
config/              # Configuration files
doc/                 # Documentation
test/                # End-to-end tests
```
| Variable | Description | Default |
|---|---|---|
| `APP_ENV` | Application environment | `development` |
| `APP_PORT` | Server port | `3000` |
| `DB_HOST` | PostgreSQL host | `db` |
| `DB_PORT` | PostgreSQL port | `5432` |
| `DB_USER` | Database username | - |
| `DB_PASS` | Database password | - |
| `DB_NAME` | Database name | - |
| `REDIS_HOST` | Redis host | `redis` |
| `REDIS_PORT` | Redis port | `6379` |
| `REDIS_PASS` | Redis password | `password` |
| `AI_SERVICE_TYPE` | AI provider (`gemini` or `anthropic`) | `gemini` |
| `GEMINI_API_KEY` | Google Gemini API key | - |
| `ANTHROPIC_API_KEY` | Anthropic Claude API key | - |
The application supports multiple AI providers. Configure your preferred provider in the environment:
- Google Gemini: Set `AI_SERVICE_TYPE=gemini` and provide `GEMINI_API_KEY`
- Anthropic Claude: Set `AI_SERVICE_TYPE=anthropic` and provide `ANTHROPIC_API_KEY`
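For orientation, a provider switch driven by `AI_SERVICE_TYPE` could be wired roughly like the sketch below. The interface and stub classes are illustrative stand-ins, not the implementations in `src/ai/`:

```ts
// Minimal sketch of provider selection; names here are illustrative.
interface AiService {
  embed(text: string): Promise<number[]>;
}

class GeminiService implements AiService {
  constructor(private readonly apiKey: string) {}
  async embed(text: string): Promise<number[]> {
    // A real implementation would call the Gemini embeddings API here.
    throw new Error('illustrative stub');
  }
}

class AnthropicService implements AiService {
  constructor(private readonly apiKey: string) {}
  async embed(text: string): Promise<number[]> {
    // A real implementation would call the Anthropic API here.
    throw new Error('illustrative stub');
  }
}

export function createAiService(): AiService {
  return process.env.AI_SERVICE_TYPE === 'anthropic'
    ? new AnthropicService(process.env.ANTHROPIC_API_KEY ?? '')
    : new GeminiService(process.env.GEMINI_API_KEY ?? '');
}
```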
The application includes comprehensive observability features:
- Logging: Structured logging with request interceptors
- Tracing: OpenTelemetry integration with Zipkin
- Health Checks: Database and ChromaDB connectivity monitoring
Access Zipkin UI at http://localhost:9411 to view distributed traces.
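As a rough illustration of such a setup, a minimal OpenTelemetry bootstrap that exports spans straight to Zipkin could look like this. The package choices are assumptions, and this sketch bypasses the bundled otel-collector, so it is not necessarily the project's exact wiring:

```ts
import { NodeSDK } from '@opentelemetry/sdk-node';
import { ZipkinExporter } from '@opentelemetry/exporter-zipkin';
import { getNodeAutoInstrumentations } from '@opentelemetry/auto-instrumentations-node';

// Export spans to the Zipkin instance from the Docker Compose setup.
const sdk = new NodeSDK({
  serviceName: 'ai-document-backend',
  traceExporter: new ZipkinExporter({ url: 'http://localhost:9411/api/v2/spans' }),
  instrumentations: [getNodeAutoInstrumentations()],
});

sdk.start();
```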
Important: The latest updates add new modules (Chats, Messages), BullMQ for queue processing, and Redis as a required backend for async tasks. Existing deployments should:
- Run database migrations to add the Chats and Messages tables (see `/src/chats` and `/src/messages`)
- Ensure Redis is deployed and configured for BullMQ queues (see the sketch after this list)
- Review `.env` for new variables (see above)
- Populate GCP Secret Manager if using it for secret management
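For reference, a minimal BullMQ producer/worker pair over the Redis backend looks roughly like this; the queue and job names are illustrative, not the project's actual ones:

```ts
import { Queue, Worker } from 'bullmq';

// Connection settings mirror the REDIS_* variables documented above.
const connection = {
  host: process.env.REDIS_HOST ?? 'redis',
  port: Number(process.env.REDIS_PORT ?? 6379),
  password: process.env.REDIS_PASS,
};

// Producer side: enqueue a message for asynchronous embedding.
const embeddingQueue = new Queue('message-embedding', { connection });
await embeddingQueue.add('embed', { messageId: '123' });

// Consumer side: a worker picks up jobs and runs the embedding pipeline.
new Worker(
  'message-embedding',
  async (job) => {
    // Embed the message text and store the vector in ChromaDB here.
    console.log(`embedding message ${job.data.messageId}`);
  },
  { connection },
);
```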
- Configure production environment variables in `.env.production`
- Deploy using Docker Compose:

```bash
bun run compose:prod
```

The application runs in a multi-container setup:
- app: NestJS application
- db: PostgreSQL database
- vector_db: ChromaDB vector database
- otel-collector: OpenTelemetry collector
- zipkin: Distributed tracing UI
- Fork the repository
- Create a feature branch: `git checkout -b feature/amazing-feature`
- Commit your changes: `git commit -m 'Add amazing feature'`
- Push to the branch: `git push origin feature/amazing-feature`
- Open a Pull Request
This project is licensed under the MIT License - see the LICENSE file for details.
Created by yelaco
- Workflow Documentation - Detailed explanation of the RAG pipeline
- NestJS Documentation
- ChromaDB Documentation
- Google Gemini API
- Anthropic Claude API