A production-ready RAG (Retrieval-Augmented Generation) web application that answers questions over user-uploaded PDFs or CSVs using Google Gemini 2.0 Flash. Built with FastAPI, LangChain, Pinecone, and LangSmith for robust LLMOps, and deployed with Docker, GitHub Actions, and Render.

## ✨ Features
- 🔮 Uses Gemini 2.0 Flash for conversational LLM responses
- 📚 Upload PDF or CSV files to dynamically extend knowledge
- 🧠 Context storage using Pinecone Vector Database
- 🧩 Built with LangChain for chaining, conversation memory, and the RAG architecture
- 🧪 Tracks sessions, queries, cost, latency via LangSmith
- 🛠️ Built with FastAPI + Jinja2 templates for the web UI
- 🔐 Prevents duplicate uploads using SHA256 content hashing
- 📦 Automatically tested and deployed via GitHub Actions
- 🐳 Packaged in a Docker container and deployed to Render
*(Screenshot: Home Page)*
## 🧱 Project Structure
```
.
├── app/
│   ├── main.py            # FastAPI backend logic
│   ├── templates/         # Jinja2 HTML templates
│   ├── static/            # Static CSS/JS/images
│   └── temp_uploads/      # Temporary storage during processing
├── requirements.txt
├── .env                   # API keys and secrets (not committed)
├── Dockerfile
├── .github/
│   └── workflows/
│       └── deploy.yaml    # GitHub Actions CI/CD pipeline
└── README.md
```

External stores (not part of the repo tree): MongoDB collections track uploaded filenames and SHA256 file hashes (to prevent duplicate uploads), and Pinecone stores the document embeddings.
## 🛠️ Tech Stack

| Component | Tech Used |
|---|---|
| Language Model | Gemini 2.0 Flash (`ChatGoogleGenerativeAI`) |
| Embeddings | `GoogleGenerativeAIEmbeddings` |
| Vector Store | Pinecone |
| Chunking | LangChain's `RecursiveCharacterTextSplitter` |
| RAG Chain | `ConversationalRetrievalChain` |
| Memory | `ConversationBufferMemory` |
| Tracing/Monitoring | LangSmith |
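Conceptually, the retrieval step of the chain works by embedding document chunks as vectors and returning the chunks closest to the query vector. The sketch below is a minimal, pure-Python illustration of that idea only; the toy 3-dimensional vectors stand in for real Gemini embeddings, the `index` dict stands in for Pinecone, and `retrieve_top_k` is a hypothetical helper, not code from this app.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def retrieve_top_k(query_vec, index, k=2):
    """Return the k chunk texts whose vectors are most similar to the query.

    `index` maps chunk text -> embedding vector (a stand-in for Pinecone).
    """
    ranked = sorted(index.items(),
                    key=lambda item: cosine_similarity(query_vec, item[1]),
                    reverse=True)
    return [text for text, _ in ranked[:k]]

# Toy index: in the real app these vectors come from GoogleGenerativeAIEmbeddings.
index = {
    "Invoices are due in 30 days.":   [0.9, 0.1, 0.0],
    "The office cat is named Mochi.": [0.0, 0.2, 0.9],
    "Late invoices incur a 2% fee.":  [0.8, 0.3, 0.1],
}
query = [0.85, 0.2, 0.05]  # pretend embedding of "When are invoices due?"
print(retrieve_top_k(query, index))  # the two invoice-related chunks rank first
```

The retrieved chunks are then passed to the LLM as context along with the chat history, which is what `ConversationalRetrievalChain` automates.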
## 🧪 LangSmith Monitoring

LangSmith is used to trace and monitor:
- ✅ Query and response pairs
- 🕒 Latency per interaction
- 💰 Token usage and cost
- 🧵 Session chat history
Make sure your `.env` file contains:

```env
LANGSMITH_API_KEY=your_langsmith_key
LANGSMITH_TRACING=true
```
## ⚙️ Setup

Install all dependencies via:

```bash
pip install -r requirements.txt
```

Then create a `.env` file in the root directory with:

```env
GOOGLE_API_KEY=your_google_api_key
PINECONE_API_KEY=your_pinecone_api_key
PINECONE_API_ENV=us-east-1
LANGSMITH_API_KEY=your_langsmith_api_key
LANGSMITH_TRACING=true
```
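Since every one of these keys is required at runtime, it can help to fail fast at startup instead of mid-request. The helper below is a hypothetical addition (`missing_keys` is not in the app's source), sketched with plain `os.environ`:

```python
import os

REQUIRED_KEYS = [
    "GOOGLE_API_KEY",
    "PINECONE_API_KEY",
    "PINECONE_API_ENV",
    "LANGSMITH_API_KEY",
]

def missing_keys(env=os.environ):
    """Return the required variables that are absent or empty."""
    return [key for key in REQUIRED_KEYS if not env.get(key)]

# Call this before creating the FastAPI app; here with a deliberately
# incomplete env dict for illustration:
print(missing_keys({"GOOGLE_API_KEY": "abc"}))
# → ['PINECONE_API_KEY', 'PINECONE_API_ENV', 'LANGSMITH_API_KEY']
```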
## 🔁 CI/CD with GitHub Actions

This project uses GitHub Actions to:
- Run three test cases covering the app health check and the upload/chat routes
- On successful tests, build a Docker image
- Push the image to DockerHub automatically
Example workflow file: `.github/workflows/deploy.yaml`
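The actual `deploy.yaml` is not reproduced here; a typical shape for the test-then-push pipeline described above might look like the following sketch (job names, action versions, secret names, and the image tag are illustrative assumptions, not the repo's real values):

```yaml
name: CI/CD

on:
  push:
    branches: [main]

jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: "3.11"
      - run: pip install -r requirements.txt
      - run: pytest   # health check + upload/chat route tests

  docker:
    needs: test       # build and push only if the tests pass
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: docker/login-action@v3
        with:
          username: ${{ secrets.DOCKERHUB_USERNAME }}
          password: ${{ secrets.DOCKERHUB_TOKEN }}
      - uses: docker/build-push-action@v5
        with:
          push: true
          tags: yourname/gemini-rag-assistant:latest   # illustrative tag
```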
## 🐳 Run with Docker

```bash
docker build -t gemini-rag-assistant .
docker run -p 8000:8000 gemini-rag-assistant
```
## 🚀 Deployment

After CI tests pass:

- The Docker image is built from `requirements.txt`
- The image is pushed to DockerHub automatically
- The app is deployed on Render.com, with the API keys injected via the Render Dashboard

Live demo: https://rag-assistant-diwc.onrender.com/
*(Screenshot: Live Conversation)*
## 🔐 Duplicate Upload Prevention

Files are hashed with SHA256 so that the same content is not re-uploaded and re-indexed under a different filename.
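The dedup check can be sketched as follows. This is a minimal illustration, not the app's actual code: an in-memory `set` stands in for the MongoDB hashes collection, and `file_sha256`/`is_duplicate` are hypothetical names.

```python
import hashlib

def file_sha256(data: bytes) -> str:
    """Hex SHA-256 digest of the file's raw bytes."""
    return hashlib.sha256(data).hexdigest()

seen_hashes = set()  # stands in for the MongoDB "hashes" collection

def is_duplicate(data: bytes) -> bool:
    """True if this exact content was uploaded before, regardless of filename."""
    digest = file_sha256(data)
    if digest in seen_hashes:
        return True
    seen_hashes.add(digest)
    return False

print(is_duplicate(b"%PDF-1.4 ..."))  # False: first upload, gets indexed
print(is_duplicate(b"%PDF-1.4 ..."))  # True: same bytes, any filename
```

Because the digest depends only on the file's bytes, renaming a file and re-uploading it still hits the same hash and is skipped.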
*(Screenshot: LangSmith logs)*
## 🚧 Future Improvements

- ✅ Add authentication
- 🔍 Support other formats like DOCX
- 📈 Usage analytics dashboard
- 💬 WebSocket support for live chat streaming
*(Demo: YouTube video)*
## 👤 Contact

- Name: Yash
- GitHub: yashh2417
- LinkedIn: yashh2417
- Email: yashh2417@gmail.com
## 📄 License

MIT License – feel free to use and modify.