RAG Pipeline API with Conversation Memory

A production-ready Retrieval-Augmented Generation (RAG) system with conversation memory, built with LangChain, FastAPI, and modern MLOps practices.

🌟 Features

Core Features

  • Multi-Provider LLM Support: OpenAI, Google Gemini, Groq, and Ollama
  • Advanced Retrieval: Hybrid search (vector + BM25) with reranking
  • Conversation Memory: SQLite-based persistent conversation history
  • Document Processing: Support for TXT and PDF files
  • Semantic & Recursive Chunking: Flexible text splitting strategies
  • RESTful API: FastAPI with automatic OpenAPI documentation
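
The hybrid-search idea above can be illustrated without any retrieval library: merge a keyword (BM25-style) ranking and a vector-similarity ranking with reciprocal rank fusion. This is a minimal sketch of the technique only; the document IDs, scores, and the `rrf_merge` helper are illustrative, not the project's actual implementation (which uses LangChain retrievers).

```python
# Minimal reciprocal rank fusion (RRF): fuse a keyword ranking and a
# vector-similarity ranking into a single hybrid ranking.
# Illustrative sketch only -- not the project's LangChain-based code.

def rrf_merge(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Fuse ranked lists; earlier positions contribute higher scores."""
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank + 1)
    return sorted(scores, key=scores.get, reverse=True)

bm25_ranking = ["doc_b", "doc_a", "doc_c"]    # keyword matches
vector_ranking = ["doc_a", "doc_c", "doc_b"]  # semantic matches
hybrid = rrf_merge([bm25_ranking, vector_ranking])
```

Documents that rank well in both lists float to the top, which is the point of combining vector and keyword retrieval before reranking.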

Production Features

  • βœ… Comprehensive Logging: Structured logging with rotation
  • βœ… Error Handling: Graceful error handling and recovery
  • βœ… Health Checks: Kubernetes-ready health endpoints
  • βœ… Rate Limiting: API rate limiting to prevent abuse
  • βœ… CORS Support: Configurable CORS for web applications
  • βœ… Docker Support: Multi-stage builds with health checks
  • βœ… CI/CD Pipeline: Automated testing, security scanning, and deployment
  • βœ… Configuration Management: Environment-based configuration

πŸ—οΈ Architecture

β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚   User Query    β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”˜
         β”‚
         v
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚  FastAPI Server β”‚
β”‚  - Rate Limit   β”‚
β”‚  - CORS         β”‚
β”‚  - Logging      β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”˜
         β”‚
         v
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚   Conversation Memory       β”‚
β”‚   (SQLite)                  β”‚
β”‚   - Session Management      β”‚
β”‚   - History Retrieval       β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
              β”‚
              v
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚   Document Retrieval        β”‚
β”‚   β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”  β”‚
β”‚   β”‚ Vector Search (Chroma)β”‚  β”‚
β”‚   β”‚ + BM25 (Keyword)      β”‚  β”‚
β”‚   β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜  β”‚
β”‚   β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”  β”‚
β”‚   β”‚ Ensemble + Reranking  β”‚  β”‚
β”‚   β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜  β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
              β”‚
              v
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚   LLM (Gemini/OpenAI/etc)   β”‚
β”‚   - Context + History       β”‚
β”‚   - Answer Generation       β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
              β”‚
              v
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚   Response + Memory Update  β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
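
The conversation-memory layer in the diagram can be sketched with the standard library's sqlite3 module. The table name, schema, and trimming policy below are assumptions for illustration; the project's actual memory.py may differ.

```python
import sqlite3

# Minimal SQLite-backed conversation memory mirroring the diagram:
# per-session message history, capped at `max_messages` per session.
# Illustrative schema only -- not the project's actual memory.py.

class ConversationMemory:
    def __init__(self, db_path: str = ":memory:", max_messages: int = 10):
        self.conn = sqlite3.connect(db_path)
        self.max_messages = max_messages
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS messages ("
            " id INTEGER PRIMARY KEY AUTOINCREMENT,"
            " session_id TEXT NOT NULL,"
            " role TEXT NOT NULL,"
            " content TEXT NOT NULL)"
        )

    def add(self, session_id: str, role: str, content: str) -> None:
        self.conn.execute(
            "INSERT INTO messages (session_id, role, content) VALUES (?, ?, ?)",
            (session_id, role, content),
        )
        # Trim the oldest messages beyond the per-session cap.
        self.conn.execute(
            "DELETE FROM messages WHERE session_id = ? AND id NOT IN ("
            " SELECT id FROM messages WHERE session_id = ?"
            " ORDER BY id DESC LIMIT ?)",
            (session_id, session_id, self.max_messages),
        )
        self.conn.commit()

    def history(self, session_id: str) -> list[tuple[str, str]]:
        rows = self.conn.execute(
            "SELECT role, content FROM messages WHERE session_id = ? ORDER BY id",
            (session_id,),
        )
        return list(rows)
```

On each query, the retrieved history is injected into the LLM prompt alongside the retrieved context, and the new question/answer pair is written back after generation.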

πŸ“¦ Installation

Prerequisites

  • Python 3.11+
  • Docker & Docker Compose (optional)
  • API keys for your chosen LLM provider

Local Setup

  1. Clone the repository
git clone https://github.com/gokhaneraslan/advanced_rag.git
cd advanced_rag
  2. Create a virtual environment
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate
  3. Install dependencies
pip install -r requirements.txt
  4. Configure environment variables
cp .env.example .env
# Edit .env with your API keys and configuration
  5. Create the necessary directories
mkdir -p data logs vector_store

Docker Setup

# Build and run with Docker Compose
docker-compose up -d

# Check logs
docker-compose logs -f

# Stop
docker-compose down

πŸš€ Usage

Starting the Server

Local:

python app.py

Docker:

docker-compose up

The API will be available at http://localhost:8000

API Endpoints

1. Health Check

curl http://localhost:8000/health

2. Create a Session

curl -X POST http://localhost:8000/session/create

Response:

{
  "session_id": "550e8400-e29b-41d4-a716-446655440000",
  "message": "Session created successfully"
}

3. Query the RAG System

curl -X POST http://localhost:8000/query \
  -H "Content-Type: application/json" \
  -d '{
    "query": "What is the main objective?",
    "session_id": "550e8400-e29b-41d4-a716-446655440000"
  }'

Response:

{
  "session_id": "550e8400-e29b-41d4-a716-446655440000",
  "input_query": "What is the main objective?",
  "answer": "The main objective is...",
  "context": [...],
  "message_count": 2
}
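
The same query can be issued from Python with only the standard library. The endpoint and payload fields match the curl example above; `build_query_payload` and `post_query` are helper names invented for this sketch.

```python
import json
import urllib.request

# Send a /query request to the RAG API using only the stdlib.
# Payload fields match the curl example above; helper names are illustrative.

def build_query_payload(query: str, session_id: str) -> bytes:
    return json.dumps({"query": query, "session_id": session_id}).encode()

def post_query(base_url: str, query: str, session_id: str) -> dict:
    req = urllib.request.Request(
        f"{base_url}/query",
        data=build_query_payload(query, session_id),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:  # requires the server running
        return json.load(resp)
```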

4. Add Documents

curl -X POST http://localhost:8000/add-documents \
  -F "files=@document1.pdf" \
  -F "files=@document2.txt"

5. Clear Session

curl -X DELETE http://localhost:8000/session/{session_id}

6. Get Session Info

curl http://localhost:8000/session/{session_id}/info

7. List All Sessions

curl http://localhost:8000/sessions

Interactive API Documentation

Visit http://localhost:8000/docs for Swagger UI documentation.

πŸ§ͺ Testing

Run All Tests

pytest -v

Run with Coverage

pytest --cov=src --cov-report=html

Run Specific Test

pytest tests/test_integration.py::test_memory_system -v

Docker Integration Test

# Automated script (recommended)
./scripts/test_docker.sh

# Or use Makefile
make docker-test

# Manual Docker test
make docker-build
make docker-up
curl http://localhost:8000/health
make docker-down

Run Integration Test Script

python tests/test.py

βš™οΈ Configuration

All configuration is managed through environment variables. See .env.example for all available options.

Key Configuration Options

| Variable | Default | Description |
| --- | --- | --- |
| LLM_PROVIDER | gemini | LLM provider (openai/gemini/groq/ollama) |
| LLM_MODEL | gemini-2.5-flash | Model name |
| MAX_MEMORY_MESSAGES | 10 | Max messages per session |
| RETRIEVAL_TOP_K | 5 | Documents to retrieve |
| RERANKER_TOP_N | 3 | Documents after reranking |
| SPLITTING_METHOD | semantic | Text splitting method (semantic/recursive) |
| LOG_LEVEL | INFO | Logging level |

πŸ“Š MLOps & CI/CD

CI/CD Pipeline

The project includes a comprehensive GitHub Actions pipeline:

  1. Code Quality: Black, isort, flake8, pylint
  2. Security Scanning: Bandit, Safety
  3. Testing: pytest with coverage
  4. Docker Build: Multi-stage builds
  5. Integration Tests: Docker-based E2E tests
  6. Deployment: Automated deployment on main branch

Monitoring

  • Health Checks: /health endpoint with detailed status
  • Logging: Structured logging with rotation
  • Metrics: Request count, latency, error rates (via logs)

Local Development Workflow

# 1. Create feature branch
git checkout -b feature/new-feature

# 2. Make changes and test
pytest -v

# 3. Check code quality
black .
isort .
flake8 .

# 4. Commit and push
git add .
git commit -m "Add new feature"
git push origin feature/new-feature

# 5. Create PR (CI will run automatically)

πŸ—‚οΈ Project Structure

advanced_rag/
β”œβ”€β”€ src/
β”‚   β”œβ”€β”€ chains.py              # LLM and RAG chain logic
β”‚   β”œβ”€β”€ data_processing.py     # Document loading and splitting
β”‚   β”œβ”€β”€ retrieval.py           # Retrieval and reranking
β”‚   └── memory.py              # Conversation memory system
β”œβ”€β”€ tests/
β”‚   β”œβ”€β”€ test_integration.py    # Integration tests
β”‚   └── test.py                # Manual test script
β”œβ”€β”€ .github/
β”‚   └── workflows/
β”‚       └── CI.yml             # CI/CD pipeline
β”œβ”€β”€ app.py                     # FastAPI application
β”œβ”€β”€ config.py                  # Configuration management
β”œβ”€β”€ logging_config.py          # Logging setup
β”œβ”€β”€ main.py                    # CLI entry point
β”œβ”€β”€ requirements.txt           # Python dependencies
β”œβ”€β”€ Dockerfile                 # Docker image
β”œβ”€β”€ docker-compose.yml         # Docker Compose config
β”œβ”€β”€ .env.example               # Environment template
β”œβ”€β”€ .gitignore                 # Git ignore rules
└── README.md                  # This file

πŸ”’ Security

  • API keys stored in environment variables
  • Rate limiting on all endpoints
  • Input validation with Pydantic
  • Security scanning in CI/CD pipeline
  • No sensitive data in logs

πŸ› Troubleshooting

Common Issues

1. Import errors

export PYTHONPATH=.

2. Memory database locked

rm conversation_memory.db

3. Vector store corruption

rm -rf vector_store/
# Restart server to rebuild

4. Docker health check failing

docker-compose logs rag-api
# Check for initialization errors

🀝 Contributing

  1. Fork the repository
  2. Create a feature branch
  3. Make your changes
  4. Run tests and code quality checks
  5. Submit a pull request

πŸ™ Acknowledgments

  • LangChain for the RAG framework
  • Hugging Face for embedding models
  • FastAPI for the web framework
  • The open-source community