🔍 Gemini RAG Assistant

A production-ready RAG (Retrieval-Augmented Generation) web application that leverages Google Gemini for question answering over user-uploaded PDFs and CSVs. Built with FastAPI, LangChain, Pinecone, and LangSmith for robust LLMOps, and deployed using Docker, GitHub Actions, and Render.


🚀 Features

  • 🔮 Uses Gemini 2.0 Flash for conversational LLM responses
  • 📚 Upload PDF or CSV files to dynamically extend knowledge
  • 🧠 Context storage using Pinecone Vector Database
  • 🧩 Built with LangChain for chaining, conversational memory, and the RAG architecture
  • 🧪 Tracks sessions, queries, cost, latency via LangSmith
  • 🛠️ Built with FastAPI + Jinja2 templates for the web UI
  • 🔐 Prevents duplicate uploads using SHA256 content hashing
  • 📦 Automatically tested and deployed via GitHub Actions
  • 🐳 Packaged in a Docker container and deployed to Render
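Before uploaded documents are indexed, they are split into overlapping chunks. A minimal sketch of fixed-size chunking with overlap, standing in for LangChain's RecursiveCharacterTextSplitter (which additionally splits on separators recursively; this simplified stdlib-only version splits purely by character count):

```python
def split_text(text: str, chunk_size: int = 100, chunk_overlap: int = 20) -> list[str]:
    """Split text into fixed-size chunks where consecutive chunks
    share `chunk_overlap` characters, so context at chunk boundaries
    is not lost during retrieval."""
    if chunk_overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk size")
    step = chunk_size - chunk_overlap
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]

document = "".join(str(i % 10) for i in range(250))
chunks = split_text(document)
print(len(chunks))          # 4 chunks for a 250-character document
print(chunks[0][-20:] == chunks[1][:20])  # True: consecutive chunks overlap
```

Overlap keeps a sentence that straddles a chunk boundary retrievable from either side; the real splitter also tries to break on paragraph and sentence boundaries first.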

🖼️ UI Screenshots

  • Home Page (screenshot)


🧱 Project Structure

.
├── app/
│   ├── main.py              # FastAPI backend logic
│   ├── templates/           # Jinja2 HTML templates
│   ├── static/              # Static CSS/JS/images
│   └── temp_uploads/        # Temporary storage during processing
├── requirements.txt
├── .env                     # API keys and secrets (not committed)
├── Dockerfile
├── .github/
│   └── workflows/
│       └── deploy.yaml      # GitHub Actions CI/CD pipeline
└── README.md

External stores (not part of the repository):

  • MongoDB: tracks uploaded filenames and SHA256 file hashes to prevent duplicate uploads
  • Pinecone Vector DB: stores document embeddings


🧠 LLM Stack Overview

| Component          | Tech Used                                  |
|--------------------|--------------------------------------------|
| Language Model     | Gemini 2.0 Flash (ChatGoogleGenerativeAI)  |
| Embeddings         | GoogleGenerativeAIEmbeddings               |
| Vector Store       | Pinecone                                   |
| Chunking           | LangChain's RecursiveCharacterTextSplitter |
| RAG Chain          | ConversationalRetrievalChain               |
| Memory             | ConversationBufferMemory                   |
| Tracing/Monitoring | LangSmith                                  |
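At query time, the vector store returns the stored chunks whose embeddings are nearest to the query embedding. A stdlib-only sketch of that top-k similarity search, with a plain dict standing in for the Pinecone index (the names `top_k` and `index` are illustrative, not the project's or Pinecone's API):

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def top_k(query_vec: list[float], index: dict[str, list[float]], k: int = 2) -> list[str]:
    """Return the ids of the k chunks most similar to the query vector.
    `index` maps chunk id -> embedding, standing in for the Pinecone index."""
    ranked = sorted(index.items(),
                    key=lambda item: cosine_similarity(query_vec, item[1]),
                    reverse=True)
    return [chunk_id for chunk_id, _ in ranked[:k]]

index = {
    "chunk-a": [1.0, 0.0, 0.0],
    "chunk-b": [0.0, 1.0, 0.0],
    "chunk-c": [0.9, 0.1, 0.0],
}
print(top_k([1.0, 0.0, 0.0], index, k=2))  # -> ['chunk-a', 'chunk-c']
```

The retrieved chunks are then passed to the LLM as context; Pinecone performs this search server-side over approximate nearest-neighbour indexes rather than a full scan.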

📊 LangSmith Integration

LangSmith is used to trace and monitor:

  • ✅ Query and response pairs
  • 🕒 Latency per interaction
  • 💰 Token usage and cost
  • 🧵 Session chat history

Make sure your .env file contains:

LANGSMITH_API_KEY=your_langsmith_key
LANGSMITH_TRACING=true

🔧 Requirements

Install all dependencies via:

pip install -r requirements.txt

📄 .env Configuration

Make sure you create a .env file in the root directory with:

GOOGLE_API_KEY=your_google_api_key
PINECONE_API_KEY=your_pinecone_api_key
PINECONE_API_ENV=us-east-1
LANGSMITH_API_KEY=your_langsmith_api_key
LANGSMITH_TRACING=true
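At startup the app needs these variables in its process environment; projects like this typically call python-dotenv's `load_dotenv()`. A rough stdlib-only sketch of what that loading does (`load_env` is illustrative, not the project's code):

```python
import os

def load_env(path: str = ".env") -> None:
    """Minimal .env loader: read KEY=VALUE lines into os.environ,
    skipping blanks and comments, without overriding variables
    that are already set (python-dotenv behaves similarly by default)."""
    with open(path) as fh:
        for line in fh:
            line = line.strip()
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            os.environ.setdefault(key.strip(), value.strip())
```

Because keys are read from the environment rather than hard-coded, the same image runs locally with a `.env` file and on Render with dashboard-injected variables.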

🔬 Testing & CI/CD

This project uses GitHub Actions to:

  1. Run three test cases covering the app health check and the upload/chat routes
  2. On successful tests, build a Docker image
  3. Push the image to DockerHub automatically

Example workflow file: .github/workflows/deploy.yaml


🐳 Docker Build & Deployment

Build Locally

docker build -t gemini-rag-assistant .
docker run -p 8000:8000 gemini-rag-assistant

Automatic Deployment

After CI tests pass:

  • A Docker image is built from requirements.txt
  • The image is pushed to DockerHub
  • Render.com deploys the image, with API keys injected via the Render dashboard

🌐 Live Deployment

Direct link: https://rag-assistant-diwc.onrender.com/


🧪 Example Chat Session

  • Live conversation (screenshot)


🛡️ File Deduplication

Files are hashed using SHA256 to avoid re-uploading and re-indexing the same file with different names.
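A sketch of how such content-based deduplication can work; here an in-memory set stands in for the MongoDB hashes collection the app uses, and the function names are illustrative:

```python
import hashlib

def file_fingerprint(data: bytes) -> str:
    """SHA-256 hex digest of the raw file bytes (filename-independent)."""
    return hashlib.sha256(data).hexdigest()

def is_duplicate(data: bytes, seen_hashes: set[str]) -> bool:
    """Check whether this content was uploaded before; record it if not.
    A set stands in for the MongoDB 'hashes' collection."""
    digest = file_fingerprint(data)
    if digest in seen_hashes:
        return True
    seen_hashes.add(digest)
    return False

seen: set[str] = set()
pdf_bytes = b"%PDF-1.4 example content"
print(is_duplicate(pdf_bytes, seen))  # False: first upload is indexed
print(is_duplicate(pdf_bytes, seen))  # True: same bytes, even under a new name
```

Hashing the content rather than the filename means renaming a file cannot bypass the check, and two byte-identical uploads never get embedded twice.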


🔂 LangSmith Tracing

  • LangSmith logs (screenshot)


📦 Future Enhancements

  • ✅ Add authentication
  • 🔍 Support other formats like DOCX
  • 📈 Usage analytics dashboard
  • 💬 WebSocket support for live chat streaming

📹 Reference Video

  • YouTube video: Rag-Assistant-Video


🧑‍💻 Maintainer

yashh2417 (GitHub repository: yashh2417/rag-assistant)


📝 License

MIT License – feel free to use and modify.

