This Streamlit application implements a Langchain-based retrieval system for processing PDF documents and querying them conversationally.
DocChat is a Langchain-based retrieval system that processes PDF documents and creates a conversational retrieval chain. It leverages multiple technologies to extract text, generate embeddings, and enable chat-based querying over processed content.
- FastAPI – Serves as the backend API for processing PDFs and handling chat requests.
- Streamlit – Provides the frontend user interface for uploading PDFs and interacting with the conversational system.
- Langchain – A core library for NLP tasks such as text splitting and conversational retrieval.
- Google PaLM & Google Generative Language – Used for generating embeddings.
- FAISS – Facebook AI Similarity Search used for efficient similarity search over embeddings.
- PyMuPDF (fitz) – Extracts text from PDFs.
- Docker & Docker Compose – Containerizes and orchestrates the backend and frontend applications.
- Python-dotenv – Loads environment variables (e.g. API keys) from a `.env` file.
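Before embedding, Langchain splits the extracted PDF text into overlapping chunks. A minimal pure-Python sketch in the spirit of Langchain's character-based text splitters (the chunk size and overlap values here are illustrative, not the app's actual settings):

```python
def split_text(text: str, chunk_size: int = 100, overlap: int = 20) -> list[str]:
    """Split text into overlapping chunks, in the spirit of Langchain's
    CharacterTextSplitter. Overlap preserves context across chunk boundaries."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        start += chunk_size - overlap
    return chunks

chunks = split_text("a" * 250, chunk_size=100, overlap=20)
print(len(chunks))  # 4 chunks, starting at offsets 0, 80, 160, 240
```

Each chunk is then embedded and stored in the FAISS index for similarity search.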
- Python Environment: Python 3.x is required.
- Environment Variables: Create a `.env` file in the project root with the following content:

GOOGLE_API_KEY=your_google_api_key_here

Replace `your_google_api_key_here` with your actual Google API key.
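At startup, python-dotenv reads these `KEY=VALUE` pairs into the process environment. A minimal stdlib stand-in for its `load_dotenv()` (illustrative only; the project uses the real library):

```python
import os
import tempfile

def load_env(path: str) -> None:
    """Minimal stand-in for python-dotenv's load_dotenv(): read KEY=VALUE
    lines, skipping blanks and comments, without overriding existing vars."""
    with open(path) as fh:
        for line in fh:
            line = line.strip()
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            os.environ.setdefault(key.strip(), value.strip())

# Demo with a throwaway file standing in for the project's .env:
with tempfile.NamedTemporaryFile("w", suffix=".env", delete=False) as fh:
    fh.write("GOOGLE_API_KEY=your_google_api_key_here\n")
    demo_path = fh.name
load_env(demo_path)
print("GOOGLE_API_KEY" in os.environ)
```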
- Clone the Repository:
git clone https://github.com/Varunv003/langchain-palm2-rag_application
- Set Up Virtual Environment:
python -m venv venv
# On Windows:
.\venv\Scripts\activate
# On macOS/Linux:
source venv/bin/activate
- Install Dependencies:
pip install -r requirements.txt
- Initialize Folder Structure (if needed):
python template.py
- Running the Streamlit App (Frontend):
The application will be available at http://localhost:8501.
streamlit run app.py
- Running the FastAPI App (Backend):
The backend API will be available at http://localhost:8000.
uvicorn main:app --reload --host 0.0.0.0 --port 8000
This project includes Dockerfiles for both the FastAPI backend and the Streamlit frontend. Docker Compose is used to orchestrate both services.
- Ensure Docker Desktop is Running.
- From the project root (where the `docker-compose.yml` is located), run:

docker-compose up --build
- Services:
- Backend (FastAPI) will be available at http://localhost:8000.
- Frontend (Streamlit) will be available at http://localhost:8501.
The docker-compose.yml defines two services:
- backend: Built from `Dockerfile.backend`, exposing port 8000.
- frontend: Built from `Dockerfile.frontend`, exposing port 8501, with a dependency on the backend service.
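Such a compose file might look like the following sketch. The service names, Dockerfile names, and ports come from the description above; everything else (build context, file layout) is illustrative:

```yaml
# Sketch of the docker-compose.yml described above; only the service names,
# Dockerfiles, and ports are from the project description.
services:
  backend:
    build:
      context: .
      dockerfile: Dockerfile.backend
    ports:
      - "8000:8000"
  frontend:
    build:
      context: .
      dockerfile: Dockerfile.frontend
    ports:
      - "8501:8501"
    depends_on:
      - backend
```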
- Upload PDFs: Use the sidebar in the Streamlit interface to upload PDF files.
- Process Documents: Click "Submit and Process" to extract text, generate embeddings, and initialize the conversational chain.
- Chat: Ask questions related to the processed PDFs through the chat interface. The backend retrieves and forms responses using the Langchain conversational chain.
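Conceptually, the retrieval step finds the stored chunks most similar to the question and passes them to the model as context. A toy keyword-overlap retriever, standing in for the FAISS similarity search over embeddings (the scoring here is illustrative, not what the app does):

```python
def retrieve(question: str, chunks: list[str], k: int = 1) -> list[str]:
    """Toy stand-in for FAISS similarity search: rank chunks by word
    overlap with the question and return the top k."""
    q_words = set(question.lower().split())
    scored = sorted(
        chunks,
        key=lambda c: len(q_words & set(c.lower().split())),
        reverse=True,
    )
    return scored[:k]

docs = [
    "Refunds are processed within 14 days of the request.",
    "The warranty covers manufacturing defects for one year.",
]
print(retrieve("How long do refunds take?", docs))
```

In the real chain, cosine similarity between embedding vectors replaces word overlap, and the retrieved chunks are injected into the prompt sent to the language model.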
- Enhance error handling and user feedback.
- Optimize scalability and performance for larger documents.
- Integrate additional AI models or refine existing conversational models for improved responses.

