A production-grade RAG application that indexes GitHub repositories and enables natural language Q&A with streaming LLM responses.
Paste any public GitHub URL → CodeMind clones and indexes the entire codebase using vector embeddings → Chat with it using natural language and get source-cited, streaming answers powered by Claude AI.
Example questions you can ask:
- "How does authentication work in this repo?"
- "Find all places where database transactions are used"
- "Are there any potential memory leaks in this code?"
- "Generate documentation for the `processPayment` function"
User Browser
↓
React (TypeScript + TailwindCSS + Shadcn/UI)
↓ REST API + SSE (streaming)
NestJS Backend (TypeScript)
├──→ PostgreSQL + pgvector (embeddings store)
├──→ Redis (rate limiting + embedding cache)
├──→ OpenAI API (text-embedding-3-small)
└──→ Anthropic Claude API (claude-3-5-haiku streaming)
- User types a question in the chat UI
- Frontend sends `POST /api/chat/conversations/:id/messages`
- NestJS embeds the question → 1536-dim vector via OpenAI
- pgvector cosine similarity search → top-5 relevant code chunks
- Claude receives: system prompt + code context + conversation history + question
- Tokens stream back via SSE → frontend renders in real-time
- Response + source citations saved to PostgreSQL
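The retrieval step above can be sketched in plain TypeScript. This is a minimal illustration, not CodeMind's actual implementation: the chunk store is an in-memory array standing in for pgvector, and the type and function names (`Chunk`, `topKChunks`) are hypothetical.

```typescript
// Sketch of the retrieval step: rank chunks by cosine similarity to the
// question vector, mirroring pgvector's `ORDER BY embedding <=> $1 LIMIT 5`.
type Chunk = { filePath: string; startLine: number; text: string; vector: number[] };

function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb));
}

// Return the top-k chunks most similar to the question vector.
function topKChunks(questionVector: number[], chunks: Chunk[], k = 5): Chunk[] {
  return [...chunks]
    .sort((x, y) =>
      cosineSimilarity(questionVector, y.vector) - cosineSimilarity(questionVector, x.vector))
    .slice(0, k);
}
```

The selected chunks (with their `filePath`/`startLine` metadata) are what Claude receives as context and what the citations point back to.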
| Category | Technology | Purpose |
|---|---|---|
| Backend | NestJS + TypeScript | REST API + SSE streaming |
| Database | PostgreSQL + pgvector | Stores code embeddings |
| Cache | Redis | Rate limiting + embedding cache |
| Embeddings | OpenAI text-embedding-3-small | Semantic code search |
| LLM | Anthropic Claude Haiku | Chat responses |
| Frontend | React + Vite + TailwindCSS | UI |
| State | Zustand | Client state management |
| ORM | Prisma | Database client |
| Auth | JWT (access + refresh tokens) | Multi-tenant auth |
| Containers | Docker + Docker Compose | Local dev |
| Deploy | AWS ECS Fargate | Production |
| CI/CD | GitHub Actions | Auto-deploy on push |
- RAG-powered Q&A — Vector similarity search returns the most relevant code chunks for each question
- Streaming responses — Token-by-token streaming via SSE, just like ChatGPT
- Source citations — Every answer links back to exact file:line in the codebase
- Auto Bug Finder — Agentic workflow that scans files for bugs, vulnerabilities, and code smells
- Ingestion progress — Real-time SSE stream shows indexing progress (clone → chunk → embed)
- Multi-tenant — JWT auth with per-user repositories and conversations
- Rate limiting — Redis-backed throttling (20 requests/minute per user)
- Embedding cache — Identical queries skip the OpenAI API call (Redis, 1hr TTL)
Screenshots:
- Landing Page — Features & Tech Stack
- Repository Ingestion — In Progress
- Repository Ready — 210 Chunks Indexed
- Docker Desktop — All Containers Running
codemind/
├── apps/
│ ├── backend/ NestJS API
│ │ ├── src/
│ │ │ ├── auth/ JWT auth (register, login, refresh)
│ │ │ ├── users/ User management
│ │ │ ├── repositories/ GitHub ingestion pipeline
│ │ │ ├── embeddings/ OpenAI embedding service
│ │ │ ├── chat/ RAG + Claude streaming
│ │ │ ├── agent/ Auto bug finder
│ │ │ └── prisma/ Database client
│ │ └── prisma/
│ │ └── schema.prisma
│ └── frontend/ React + Vite app
│ └── src/
│ ├── components/ UI components (chat, repo, auth)
│ ├── hooks/ useChat, useAuth, useRepositories
│ ├── pages/ Landing, Dashboard, Chat pages
│ └── store/ Zustand auth store
├── docker-compose.yml Local dev (postgres + redis)
├── docker-compose.prod.yml Production config
├── .github/workflows/ GitHub Actions CI/CD
└── setup.sh One-command setup script
- Docker Desktop
- Node.js 20+
- OpenAI API key — platform.openai.com
- Anthropic API key — console.anthropic.com
git clone https://github.com/yourusername/codemind
cd codemind
# Copy env template
cp apps/backend/.env.example apps/backend/.env

Edit `apps/backend/.env`:
DATABASE_URL="postgresql://postgres:postgres@localhost:5432/codemind"
REDIS_URL="redis://localhost:6379"
JWT_SECRET="your-secret-min-32-chars"
JWT_REFRESH_SECRET="your-refresh-secret-min-32-chars"
OPENAI_API_KEY="sk-..."
ANTHROPIC_API_KEY="sk-ant-..."
PORT=3001
FRONTEND_URL="http://localhost:5173"

# Start Postgres and Redis
docker-compose up -d postgres redis

# Backend
cd apps/backend
npm install
npx prisma migrate dev --name init
npx prisma generate

npm run start:dev
# Running at http://localhost:3001

# Frontend
cd apps/frontend
npm install
npm run dev
# Running at http://localhost:5173

Alternatively, run `./setup.sh` from the project root for automated setup.
POST /api/auth/register { email, password, name } → { accessToken, refreshToken, user }
POST /api/auth/login { email, password } → { accessToken, refreshToken, user }
POST /api/auth/refresh { refreshToken } → { accessToken, refreshToken }
GET /api/auth/me (JWT) → { user }
GET /api/repositories → list repos
POST /api/repositories { githubUrl } → create + start ingestion
GET /api/repositories/:id → repo details
DELETE /api/repositories/:id → delete repo + all chunks
GET /api/repositories/:id/status (SSE) → ingestion progress stream
GET /api/chat/conversations?repositoryId=... → list conversations
POST /api/chat/conversations { repositoryId } → create
GET /api/chat/conversations/:id → conversation + messages
POST /api/chat/conversations/:id/messages (SSE stream) → send message
POST /api/agent/analyze/:repoId (SSE stream) → run auto bug finder
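A client for the streaming chat endpoint might look like the sketch below. The endpoint path follows the API list above, but the request body field (`content`) and the helper names are assumptions; `parseSseData` is a minimal parser for SSE `data:` frames.

```typescript
// Minimal SSE frame parser: extracts the payload of each `data: ` line.
function parseSseData(raw: string): string[] {
  return raw
    .split("\n\n")                        // SSE events are separated by blank lines
    .flatMap(event => event.split("\n"))
    .filter(line => line.startsWith("data: "))
    .map(line => line.slice("data: ".length));
}

// Hypothetical client for POST /api/chat/conversations/:id/messages.
// For simplicity this assumes each read() delivers whole events; a robust
// client buffers partial frames across reads.
async function streamMessage(conversationId: string, content: string, accessToken: string) {
  const res = await fetch(`/api/chat/conversations/${conversationId}/messages`, {
    method: "POST",
    headers: { "Content-Type": "application/json", Authorization: `Bearer ${accessToken}` },
    body: JSON.stringify({ content }),
  });
  const reader = res.body!.getReader();
  const decoder = new TextDecoder();
  for (;;) {
    const { done, value } = await reader.read();
    if (done) break;
    for (const data of parseSseData(decoder.decode(value, { stream: true }))) {
      process.stdout.write(data);         // render token-by-token
    }
  }
}
```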
- ECS Fargate — Backend container (no EC2 servers to manage)
- RDS PostgreSQL — Managed DB with pgvector extension
- ElastiCache Redis — Managed Redis
- ECR — Docker image registry
- ALB — HTTPS + load balancing
- Route 53 — Custom domain
Push to main → GitHub Actions automatically:
- Runs tests
- Builds Docker images
- Pushes to ECR
- Runs `prisma migrate deploy`
- Updates ECS services (zero-downtime rolling deploy)
Required GitHub Secrets:
| Secret | Description |
|---|---|
| `AWS_ACCESS_KEY_ID` | IAM user access key |
| `AWS_SECRET_ACCESS_KEY` | IAM user secret key |
| `AWS_ACCOUNT_ID` | AWS account ID |
| `SUBNET_IDS` | VPC subnet IDs for ECS tasks |
| `SECURITY_GROUP_ID` | Security group for ECS tasks |
| Operation | Cost |
|---|---|
| Index medium repo (~500 chunks) | ~$0.02 (OpenAI embeddings) |
| Chat message (5 context chunks) | ~$0.005 (Claude Haiku) |
| Bug finder scan (20 files) | ~$0.10 (Claude Haiku) |
| Total for building + testing | ~$5–10 |
Chunking strategy: 50-line chunks with 10-line overlap. Chunks are small enough for focused embeddings but large enough to preserve context. File path metadata is included so the model can cite exact sources.
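The chunking strategy above can be sketched as a pure function. This is an illustration of the stated parameters (50-line chunks, 10-line overlap, file-path metadata), not CodeMind's exact code; the `CodeChunk` shape is hypothetical.

```typescript
// Split a source file into 50-line chunks with a 10-line overlap,
// keeping file path and line range so answers can cite file:line.
type CodeChunk = { filePath: string; startLine: number; endLine: number; text: string };

function chunkFile(filePath: string, source: string, size = 50, overlap = 10): CodeChunk[] {
  const lines = source.split("\n");
  const chunks: CodeChunk[] = [];
  const step = size - overlap; // advance 40 lines per chunk
  for (let start = 0; start < lines.length; start += step) {
    const end = Math.min(start + size, lines.length);
    chunks.push({
      filePath,
      startLine: start + 1, // 1-indexed for file:line citations
      endLine: end,
      text: lines.slice(start, end).join("\n"),
    });
    if (end === lines.length) break; // last chunk reached end of file
  }
  return chunks;
}
```

Each chunk's text is what gets embedded; the `filePath`/`startLine` metadata travels with the vector so the model can cite exact sources.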
pgvector over dedicated vector DB: Keeps the stack simple — one less service to manage. pgvector with HNSW indexing handles millions of vectors with millisecond similarity search.
SSE over WebSockets: SSE is unidirectional and HTTP-native — no upgrade handshake, works through load balancers without config changes, simpler to implement in NestJS.
Redis embedding cache: Identical questions (e.g., "what does this function do?" asked repeatedly) skip the OpenAI API call entirely. 1hr TTL balances freshness vs. cost.
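The cache lookup can be sketched as a deterministic key derivation: identical question text always maps to the same Redis key. The key layout and normalization below are assumptions for illustration, not CodeMind's exact scheme.

```typescript
import { createHash } from "node:crypto";

// Derive a stable Redis key for an embedding request. Hashing the normalized
// text keeps keys short and ensures repeated questions hit the cache
// (stored with a 1hr TTL, e.g. via SET key value EX 3600).
function embeddingCacheKey(model: string, text: string): string {
  const digest = createHash("sha256").update(text.trim().toLowerCase()).digest("hex");
  return `emb:${model}:${digest}`;
}
```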
- Fork the repo
- Create a branch: `git checkout -b feature/your-feature`
- Make your changes and add tests
- Run tests: `cd apps/backend && npm test`
- Open a pull request
MIT
Built as a portfolio project for 2026 AI engineering job applications.