🤖 Agentic RAG API

A powerful Retrieval-Augmented Generation (RAG) system built with CrewAI agents, featuring a REST API and a beautiful Gradio web interface. This project demonstrates how to create intelligent AI agents that can research queries, synthesize information, and provide comprehensive responses.

🌟 Features

Multi-Agent Architecture: Researcher and Writer agents working together
Web Search Integration: Powered by SerperDev tools for real-time information
REST API: Clean API endpoints using LitServe
Beautiful Web UI: Interactive Gradio interface
Command Line Client: Simple CLI for testing and automation
Production Ready: Proper error handling and logging

🔒 Privacy & Cost Benefits

This project leverages Qwen 3 4B via Ollama as a local language model, providing:

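🔒 Data Privacy - All LLM inference happens on your machine; only web search queries are sent to SerperDev
💰 Zero LLM API Costs - No per-token charges for model calls
⚡ Fast Responses - No network round trip to a hosted LLM
🌐 Offline Capability - The model runs without internet; web search still requires a connection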

🏗️ Architecture

┌─────────────────┐    ┌──────────────────┐    ┌─────────────────┐
│   Web Client    │    │   REST API       │    │   CrewAI Agents │
│   (Gradio UI)   │◄──►│   (LitServe)     │◄──►│   (Researcher   │
│                 │    │                  │    │    + Writer)    │
└─────────────────┘    └──────────────────┘    └─────────────────┘
                                │
                                ▼
                       ┌──────────────────┐
                       │   SerperDev      │
                       │   Web Search     │
                       └──────────────────┘

🤖 How the Agents Work

This project uses a multi-agent architecture powered by CrewAI, where specialized AI agents collaborate to provide intelligent responses:

Agent Roles and Workflow

Researcher Agent 🔍
- Role: Information gathering and analysis
- Tools: SerperDev web search integration
- Goal: Research the user's query and generate comprehensive insights
- Process:
  - Receives the user's query
  - Uses web search to find relevant, up-to-date information
  - Analyzes and synthesizes findings
  - Produces research insights and context

Writer Agent ✍️
- Role: Response synthesis and communication
- Goal: Transform research insights into clear, informative responses
- Process:
  - Takes the researcher's insights as input
  - Crafts a concise, well-structured response
  - Ensures the response is accurate and easy to understand
  - Formats the final answer for the user

The Collaboration Process

User Query → Researcher Agent → Web Search → Analysis → Writer Agent → Final Response
     ↓              ↓                    ↓         ↓           ↓            ↓
   "What is     Researches &         Searches   Synthesizes  Writes      "Machine
   machine      gathers info          current    research    clear       learning
   learning?"   from web              data       results     response    is..."

Key Features

Sequential Processing: Agents work in sequence, each building on the previous agent's work
Tool Integration: Real-time web search for current information
Context Preservation: Information flows from researcher to writer seamlessly
Quality Assurance: Each agent has specialized expertise for their role
Error Handling: Robust error handling ensures reliable responses
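
The roles and sequential handoff described above could be wired up in CrewAI roughly as follows. This is a minimal sketch, not the contents of server.py: the task prompts, backstories, the `ollama/qwen3:4b` model tag, and the Ollama address `http://localhost:11434` are all assumptions chosen for illustration.

```python
"""Sketch of a two-agent sequential crew.

The prompts, model tag, and Ollama URL below are illustrative assumptions,
not values taken from this project's server.py.
"""

RESEARCH_TASK = "Research the following query using web search: {query}"
WRITE_TASK = "Write a concise, well-structured answer to: {query}"


def build_crew():
    # Imported lazily so this module still loads where CrewAI is not installed.
    from crewai import LLM, Agent, Crew, Process, Task
    from crewai_tools import SerperDevTool  # reads SERPER_API_KEY from the environment

    # Assumed model tag and default local Ollama address; adjust to your setup.
    llm = LLM(model="ollama/qwen3:4b", base_url="http://localhost:11434")

    researcher = Agent(
        role="Researcher",
        goal="Research the user's query and generate comprehensive insights",
        backstory="An analyst who gathers and cross-checks information from the web.",
        tools=[SerperDevTool()],
        llm=llm,
    )
    writer = Agent(
        role="Writer",
        goal="Transform research insights into clear, informative responses",
        backstory="A technical writer who turns research notes into readable answers.",
        llm=llm,
    )

    research = Task(
        description=RESEARCH_TASK,
        expected_output="Key findings relevant to the query.",
        agent=researcher,
    )
    write = Task(
        description=WRITE_TASK,
        expected_output="A concise, accurate answer.",
        agent=writer,
        context=[research],  # the writer receives the researcher's output
    )

    # Process.sequential runs the tasks in list order: research, then write.
    return Crew(
        agents=[researcher, writer],
        tasks=[research, write],
        process=Process.sequential,
    )
```

Calling `build_crew().kickoff(inputs={"query": "What is machine learning?"})` fills the `{query}` placeholder in both task descriptions, and the returned object exposes the final answer as `.raw`.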

Technical Implementation

Framework: CrewAI for agent orchestration
Language Model: Qwen 3 4B, served locally through Ollama
Tools: SerperDev for web search capabilities
API: LitServe for production-ready serving
UI: Gradio for beautiful web interface
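
To show how these pieces might fit together on the serving side, here is a hedged sketch of a LitServe wrapper around the agents. The class name `AgenticRAGAPI`, the helper `decode_query`, and the `run_crew` callable are hypothetical names introduced for this example; the actual server.py may be organized differently.

```python
def decode_query(request: dict) -> str:
    """Pull the query string out of a request body such as {"query": "..."}."""
    query = request.get("query", "")
    if not isinstance(query, str) or not query.strip():
        raise ValueError("request body must contain a non-empty 'query' string")
    return query.strip()


def build_server(run_crew):
    """Wrap `run_crew(query) -> result` (a hypothetical callable) in a LitServe app."""
    import litserve as ls  # lazy import: decode_query works without LitServe installed

    class AgenticRAGAPI(ls.LitAPI):
        def setup(self, device):
            # Called once per worker; keep a handle to the agent pipeline.
            self.run_crew = run_crew

        def decode_request(self, request):
            return decode_query(request)

        def predict(self, query):
            return self.run_crew(query)

        def encode_response(self, output):
            # Mirrors the {"output": ...} envelope shown in the REST API section.
            return {"output": output}

    return ls.LitServer(AgenticRAGAPI())
```

With a stub in place of the real crew, `build_server(lambda q: {"raw": q}).run(port=8000)` would start a server on the port used in the Quick Start.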

🚀 Quick Start

Prerequisites

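Python 3.11 or higher
uv package manager (recommended)
Ollama installed and running locally, with the Qwen 3 4B model available
SerperDev API key (see "Set up environment variables" below)
Internet connection for web search functionality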

Installation

Clone the repository

git clone <your-repo-url>
cd deploy-agentic-rag

Install dependencies
```
uv sync
```

Set up environment variables

# Copy the example environment file
cp .env.example .env

# Edit .env and add your actual API key
# Get your SerperDev API key from: https://serper.dev
SERPER_API_KEY="your-actual-serper-api-key-here"
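
A quick way to catch a missing or placeholder key before starting the server is a small startup check. The sketch below is an assumption about how such a check could look; the function name `require_serper_key` is hypothetical, and it expects the variable to already be present in the process environment (it does not read `.env` itself).

```python
import os


def require_serper_key(env=None) -> str:
    """Return SERPER_API_KEY, or raise if it is missing or still the placeholder."""
    env = os.environ if env is None else env
    key = env.get("SERPER_API_KEY", "").strip()
    # The example value in .env.example starts with "your-", so treat it as unset.
    if not key or key.startswith("your-"):
        raise RuntimeError(
            "SERPER_API_KEY is not set; add it to .env (see https://serper.dev)"
        )
    return key
```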

Run the Application

Start the API server
```
uv run python server.py
```
The API will be available at http://localhost:8000

Start the web interface (in a new terminal)
```
uv run python gradio_ui.py
```
The web UI will be available at http://localhost:7860

Test with the CLI client

uv run python client.py --query "What is machine learning?"

📱 Screenshots

Agentic RAG Interface showcasing intelligent AI responses to complex queries about AI breakthroughs

Usage

Web Interface

Open http://localhost:7860 in your browser
Enter your query in the text box
Click "Submit Query" or press Enter
View the intelligent response generated by the AI agents

REST API

Endpoint: POST /predict

Request:

{
  "query": "Explain quantum computing in simple terms"
}

Response:

{
  "output": {
    "raw": "Quantum computing is a type of computing that uses quantum mechanics...",
    "tasks_output": [...],
    "token_usage": {...}
  }
}
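
The request and response above can be exercised from Python using only the standard library. This is a minimal sketch assuming the server runs at the Quick Start address and returns the shape shown above; the helper names are illustrative and are not taken from client.py.

```python
import json
import urllib.request

API_URL = "http://localhost:8000/predict"  # server address from the Quick Start


def build_request(query: str, url: str = API_URL) -> urllib.request.Request:
    """Build a POST request carrying {"query": ...} as a JSON body."""
    body = json.dumps({"query": query}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def extract_answer(payload: dict) -> str:
    """Return output.raw from a response shaped like the example above."""
    return payload["output"]["raw"]


def ask(query: str, timeout: float = 60.0) -> str:
    """Send the query and return the final answer text."""
    with urllib.request.urlopen(build_request(query), timeout=timeout) as resp:
        return extract_answer(json.load(resp))
```

With the server running, `print(ask("Explain quantum computing in simple terms"))` prints the writer agent's final answer. The 60-second default matches the timeout listed under Configuration.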

Command Line

# Simple query
uv run python client.py --query "What is the capital of France?"

# The client displays responses with a typewriter effect
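
For readers curious about the typewriter effect, one simple way to produce it is to write the response one character at a time with a short pause. This is an illustrative sketch, not the code in client.py; the function name and the default delay are assumptions.

```python
import sys
import time


def typewriter(text: str, delay: float = 0.02, stream=None) -> None:
    """Write text one character at a time, pausing `delay` seconds between characters."""
    stream = sys.stdout if stream is None else stream
    for ch in text:
        stream.write(ch)
        stream.flush()  # flush so each character appears immediately
        time.sleep(delay)
    stream.write("\n")
```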

🔧 Configuration

API Configuration

Server URL: http://localhost:8000
Timeout: 60 seconds
Model: Ollama Qwen 3 4B (configurable in server.py)

Agent Configuration

Researcher Agent:

Role: Research and gather information
Tools: SerperDev web search
Goal: Generate comprehensive insights

Writer Agent:

Role: Synthesize information into clear responses
Goal: Provide concise, informative answers

📁 Project Structure

deploy-agentic-rag/
├── server.py           # Main API server using LitServe
├── client.py           # Command-line client
├── gradio_ui.py        # Web interface
├── pyproject.toml      # Project configuration
└── README.md           # This file

🛠️ Development

Adding New Features

Custom Agents: Modify server.py to add new agent types
Additional Tools: Extend the Researcher agent with more tools
UI Enhancements: Customize gradio_ui.py for new features

Testing

# Run tests
uv run pytest

# Code formatting and linting
uv run black .
uv run flake8 .

🌐 API Reference

Health Check

GET /health - Check server status

Predictions

POST /predict - Submit queries for processing
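
When scripting against the API, it can help to wait for the health endpoint before sending queries. The sketch below polls `GET /health` and treats an HTTP 200 as ready; the function names, retry count, and delays are assumptions for illustration.

```python
import time
import urllib.request


def health_url(base_url: str) -> str:
    """Build the health-check URL for a server base URL."""
    return base_url.rstrip("/") + "/health"


def wait_until_healthy(
    base_url: str = "http://localhost:8000",
    attempts: int = 10,
    delay: float = 1.0,
) -> bool:
    """Poll GET /health until it answers with HTTP 200 or attempts run out."""
    for _ in range(attempts):
        try:
            with urllib.request.urlopen(health_url(base_url), timeout=5) as resp:
                if resp.status == 200:
                    return True
        except OSError:
            # URLError, HTTPError, and socket errors all derive from OSError.
            pass
        time.sleep(delay)
    return False
```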

🤝 Contributing

Fork the repository
Create a feature branch (git checkout -b feature/amazing-feature)
Commit your changes (git commit -m 'Add amazing feature')
Push to the branch (git push origin feature/amazing-feature)
Open a Pull Request

📝 License

This project is licensed under the MIT License - see the LICENSE file for details.

🙏 Acknowledgments

CrewAI for the amazing agent framework
LitServe for the lightweight API serving
Gradio for the beautiful web interface
SerperDev for web search capabilities

Made with ❤️ using CrewAI Agents
