A curated collection of production-oriented Generative AI projects spanning RAG, LLM fine-tuning, conversational AI, and agentic systems.
Each project lives in its own folder with a dedicated README covering architecture, implementation choices, and trade-offs. The main goals across this portfolio are production deployment, hybrid (deterministic + LLM) design, and honest documentation of what worked and what didn't.
A production-oriented customer support system that combines a deterministic ML intent router (DeBERTa) with LangGraph-orchestrated workflows and bounded LLM reasoning (Groq). Designed to solve a real production tension: balancing LLM flexibility with deterministic control.
Highlights
- Hybrid architecture: DeBERTa for intent routing, Groq LLM scoped to bounded reasoning only (Inquiry Agent), LangGraph for traceable workflow execution
- Three execution paths: Inquiry Agent (LLM + tools), Complaint Workflow (ticket creation), Retention Workflow (churn mitigation)
- 89% intent classification accuracy, sub-800ms p95 latency
- Documented design decision: explicitly rejected a pure-LLM orchestration approach after observing tool-selection loops and inconsistent workflow behavior
- Deployed via AWS Lambda + API Gateway + ECR (containerized)
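The deterministic-first routing idea above can be sketched in a few lines. This is a hypothetical simplification, not AURA's actual code: the classifier's intent label picks one of the three execution paths, low-confidence predictions fall back to the bounded LLM path, and the intent names and threshold are illustrative.

```python
from dataclasses import dataclass

# Illustrative intent -> execution-path table (names are assumptions,
# mirroring the three paths described above, not AURA's real labels).
INTENT_TO_PATH = {
    "inquiry": "inquiry_agent",         # bounded LLM + tools
    "complaint": "complaint_workflow",  # deterministic ticket creation
    "retention": "retention_workflow",  # deterministic churn mitigation
}

@dataclass
class RoutingDecision:
    intent: str
    confidence: float
    path: str

def route(intent: str, confidence: float, threshold: float = 0.7) -> RoutingDecision:
    """Route to a deterministic workflow only when the classifier is
    confident; otherwise fall back to the LLM path for clarification."""
    path = INTENT_TO_PATH.get(intent, "inquiry_agent")
    if confidence < threshold:
        path = "inquiry_agent"
    return RoutingDecision(intent, confidence, path)
```

Keeping the routing decision outside the LLM is what makes the workflow behavior reproducible and traceable, which is the design tension the project documents.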
📖 Full Documentation
A LangGraph-powered autonomous agent that performs real-time web research using Groq LLM and Tavily Search, with session-based memory. (Named after Atlas the Titan: holding up a world of real-time information.)
Highlights
- Dynamic tool orchestration: the agent decides when to respond directly vs invoke external tools
- Session memory via LangGraph `MemorySaver` for contextual multi-turn conversations
- Modular separation of agent definition from execution via a `run_agent` interface
- Deployed serverlessly on AWS Lambda + API Gateway, with CORS handled by routing requests through a Django backend proxy
- Containerized with Docker
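The session-memory and respond-vs-tool ideas can be sketched without the real stack. This is a toy standalone illustration, not the ATLAS code: the class mimics how a LangGraph checkpointer keeps per-thread conversation state, and `needs_tool` stands in for a decision the LLM actually makes via tool bindings.

```python
from collections import defaultdict

class SessionMemory:
    """Toy session-scoped memory keyed by a thread/session id,
    mirroring the role MemorySaver plays in the deployed agent."""

    def __init__(self):
        self._threads = defaultdict(list)

    def append(self, session_id: str, role: str, content: str) -> None:
        self._threads[session_id].append({"role": role, "content": content})

    def history(self, session_id: str) -> list:
        return list(self._threads[session_id])

def needs_tool(query: str) -> bool:
    """Keyword heuristic standing in for the agent's dynamic decision
    to invoke web search vs. answer directly (illustrative only)."""
    return any(k in query.lower() for k in ("latest", "today", "news", "current"))
```

In the real agent the model itself chooses when to call Tavily Search; the point of the sketch is that memory is isolated per session, so concurrent conversations never leak context.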
📖 Full Documentation
A two-phase fine-tuning project for multi-domain conversational AI, demonstrating end-to-end LLM customization from intent classification through natural response generation.
Phase I: Intent Classification Foundation
- Fine-tuned RoBERTa-large with LoRA to classify ~150 user intents across multiple domains
- Built scalable training and deployment pipelines on AWS SageMaker + Hugging Face Trainer
Phase II: Natural Language Generation for Core Intents
- Reduced 150 raw intents → 20 core intents via a custom label-mapping wrapper
- Fine-tuned FLAN-T5 to generate natural responses for 10 core intents
- Cost-efficient serverless inference via AWS Lambda + API Gateway
Honest limitation surfaced: FLAN-T5 handles single-turn queries fluently but does not reliably manage structured multi-turn dialogues requiring slot filling. This motivated the Rasa project below.
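The label-mapping wrapper from Phase II can be sketched as a simple lookup with a fallback bucket. The label names below are illustrative, not INTELLA's real taxonomy: the point is that many raw classifier labels collapse onto one core intent before generation.

```python
# Hypothetical raw-label -> core-intent mapping (example names only;
# the real project maps ~150 raw intents onto 20 core intents).
RAW_TO_CORE = {
    "card_arrival": "card_status",
    "card_delivery_estimate": "card_status",
    "lost_or_stolen_card": "card_issue",
    "compromised_card": "card_issue",
}

def to_core_intent(raw_intent: str, default: str = "other") -> str:
    """Map a raw classifier label to its core intent, routing anything
    outside the mapped set to a catch-all bucket."""
    return RAW_TO_CORE.get(raw_intent, default)
```

Collapsing the label space this way lets the generation model be fine-tuned on a tractable number of response styles instead of one per raw intent.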
📖 Full Documentation
A conversational Q&A system over uploaded documents (PDF, DOCX, TXT, and more) built on a containerized, production-deployed RAG pipeline.
Highlights
- Document parsing and embedding via LangChain + Apache Tika
- Multi-format support: PDF, DOCX, TXT, and additional file types
- Operational guardrails: duplicate detection on upload, 5-file session cap, daily cleanup cron for storage hygiene
- Modular Dockerized deployment package
- Deployed on AWS for production usage
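The upload guardrails can be sketched as content-hash duplicate detection plus a per-session cap. This is an illustrative simplification, assuming SHA-256 hashing and an in-memory store; the class name and structure are not the deployed code, though the 5-file cap matches the description above.

```python
import hashlib

MAX_FILES_PER_SESSION = 5  # session cap from the project description

class UploadGuard:
    """Toy sketch: reject duplicate uploads (by content hash) and
    enforce a per-session file limit."""

    def __init__(self):
        self._hashes: dict[str, set] = {}

    def try_add(self, session_id: str, data: bytes) -> bool:
        """Return True if the file is accepted, False if it is a
        duplicate or the session has hit its cap."""
        digest = hashlib.sha256(data).hexdigest()
        seen = self._hashes.setdefault(session_id, set())
        if digest in seen or len(seen) >= MAX_FILES_PER_SESSION:
            return False
        seen.add(digest)
        return True
```

Hashing the file content (rather than trusting filenames) catches re-uploads of the same document under a different name, which keeps the vector store free of duplicate embeddings.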
📁 Project Folder (detailed README in progress)
A deterministic, slot-filling chatbot for end-to-end movie booking, directly addressing the multi-turn dialogue limitation identified in INTELLA Phase II.
Highlights
- Multi-turn dialogue collecting ZIP code, movie, showtime, theater, and seat selection
- Form validation and business-rule enforcement (e.g. one seat per show, no bookings for past or already-started showtimes)
- Gracenote API integration for real-time movie listings, theaters, and showtimes
- Automated HTML email confirmations via Python's `smtplib`
- Docker-based training and deployment, handling macOS → Linux model compatibility issues
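The slot-filling loop above can be sketched in plain Python. This is a toy illustration in the spirit of a Rasa form, not the project's actual implementation (which uses Rasa forms and custom validation actions); the slot names follow the booking flow described above, and the ZIP validator is an assumed example rule.

```python
# Slots the booking flow collects, in order (from the bullet list above).
REQUIRED_SLOTS = ["zip_code", "movie", "showtime", "theater", "seat"]

def validate_zip_code(value: str) -> bool:
    """Example validator: US ZIP codes are five digits."""
    return value.isdigit() and len(value) == 5

def next_slot(filled: dict):
    """Return the next unfilled slot to ask the user for, or None
    when the form is complete."""
    for slot in REQUIRED_SLOTS:
        if slot not in filled:
            return slot
    return None

def fill(filled: dict, slot: str, value: str) -> bool:
    """Apply per-slot validation before committing a value; invalid
    input leaves the slot empty so the bot re-asks."""
    if slot == "zip_code" and not validate_zip_code(value):
        return False
    filled[slot] = value
    return True
```

Because the loop is fully deterministic, every conversation either completes the form or re-prompts on a specific invalid slot, which is exactly the multi-turn reliability that FLAN-T5 alone could not provide in INTELLA Phase II.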
📖 Full Documentation
NLP-based text summarization project. (Documentation in progress.)
📁 Project Folder
```
GenAI/
├── Agentic Tool-Enabled Web Assistant (ATLAS)/   # LangGraph + Groq + Tavily research agent
├── Agentic User Resolution Assistant (AURA)/     # Hybrid production support system
├── Intent Classification with LoRA-Fine-Tuned    # RoBERTa + LoRA + FLAN-T5 fine-tuning
│   Language Assistant (INTELLA)/
├── IntelliQA/                                    # RAG-based document Q&A
├── TextSummarization/                            # NLP text summarization (WIP)
├── rasa/                                         # Deterministic dialogue management
└── README.md                                     # You are here
```
| Area | Tools & Techniques |
|---|---|
| LLMs & Fine-Tuning | RoBERTa, DeBERTa, FLAN-T5, LoRA / PEFT, Hugging Face Transformers |
| Agentic Systems | LangGraph, LangChain, Groq LLM, Tavily Search, MemorySaver checkpointing |
| RAG | LangChain, Apache Tika, vector retrieval, multi-format document parsing |
| Dialogue Management | Rasa (slot filling, form validation, business-rule enforcement) |
| Cloud & Deployment | AWS SageMaker, Lambda, API Gateway, ECR, Docker, serverless inference |
| Backend & Data | Python, Django, Supabase |
| System Design | Hybrid ML + LLM architectures, deterministic-first routing, graph-based orchestration |
I build practical GenAI systems with a focus on production deployment, hybrid architectures (deterministic + LLM), and transparent trade-off documentation. This portfolio reflects a progression from foundational RAG and fine-tuning toward agentic and hybrid system design.