newbieAR — Newbie Agentic RAG

An end-to-end pipeline for building, evaluating, and improving a Retrieval-Augmented Generation system.

Overview

newbieAR is a self-contained RAG research platform that covers every stage of the RAG lifecycle: document ingestion into both a vector store and a knowledge graph, hybrid retrieval, an agentic layer with tool calling, synthetic test-case generation, and automated metric-based evaluation. It is designed as a hands-on learning project and a starting point for experimenting with modern RAG architectures. It combines dense vector search (Qdrant or Milvus) with graph-based retrieval (Neo4j via Graphiti), orchestrated by a pydantic-ai agent and evaluated with deepeval.

Pipeline Architecture

flowchart TD
    A[Raw Documents] --> B[Ingestion]
    B --> C[(Qdrant / Milvus\nVector DB)]
    B --> D[(Neo4j\nGraph DB)]
    C --> E[BasicRAG\nvector search + reranking]
    D --> F[GraphRAG\nhybrid BM25 + cosine + BFS]
    E --> G[Agentic RAG\npydantic-ai]
    F --> G
    G --> H[Response + Citations]
    A --> I[Synthesis\ndeepeval Synthesizer]
    I --> J[Golden Test Cases\nJSON]
    J --> K[Evaluation\ndeepeval Metrics]
    H --> K
    K --> L[Confident AI Dashboard]

Project Structure

newbieAR/
├── src/
│   ├── api/             # FastAPI layer (HTTP + SSE streaming)
│   │   ├── app.py       # FastAPI factory, lifespan, CORS
│   │   ├── schemas.py   # Pydantic request/response models
│   │   ├── session_store.py  # In-memory session management
│   │   └── routers/
│   │       ├── sessions.py   # POST/DELETE /sessions
│   │       └── chat.py       # POST /chat → SSE stream
│   ├── agents/          # pydantic-ai agentic RAG (agent, tools, deps)
│   ├── deps/            # infra clients: Qdrant, Milvus, Graphiti, OpenAI, CrossEncoder, MinIO
│   ├── evaluation/      # deepeval metrics runner + Bedrock wrapper
│   ├── ingestion/       # vector DB and graph DB ingestion pipelines
│   ├── models/          # pydantic data models (ChunkInfo, RetrievalInfo, etc.)
│   ├── prompts/         # system prompts and generation templates
│   ├── retrieval/       # BasicRAG and GraphRAG implementations
│   ├── synthesis/       # deepeval Synthesizer + golden test case generation
│   └── settings.py      # ProjectSettings singleton (pydantic-settings)
├── infras/
│   ├── docker-compose.qdrant.yaml
│   ├── docker-compose.milvus.yaml
│   ├── docker-compose.neo4j.yaml
│   └── docker-compose.minio.yaml
├── tests/               # pytest test suite (asyncio_mode = auto)
├── scripts/             # convenience shell scripts
├── data/                # documents, goldens (git-ignored)
└── pyproject.toml

Prerequisites

Python >= 3.12
uv — the only supported package manager
Docker — for Qdrant (or Milvus), Neo4j, and MinIO
API keys — OpenAI-compatible LLM + embedding endpoint, AWS credentials (Bedrock), deepeval Confident AI key, Langfuse (optional)

Quick Start

# 1. Clone the repo
git clone https://github.com/your-username/newbieAR.git
cd newbieAR

# 2. Install dependencies
uv sync

# 3. Configure environment
cp .env.example .env
# Fill in all required values — see Configuration section below

# 4. Start infrastructure (pick one vector store)
docker compose -f infras/docker-compose.qdrant.yaml up -d   # Qdrant (default)
# docker compose -f infras/docker-compose.milvus.yaml up -d # or Milvus
docker compose -f infras/docker-compose.neo4j.yaml up -d

# 5. Ingest documents
uv run python -m src.ingestion.ingest_vectordb \
  --file_path data/papers/files/docling.pdf \
  --collection_name research_papers \
  --chunk_strategy hybrid

# 6. Ask a question
uv run python -m src.agents.agentic_rag \
  --collection_name research_papers --top_k 5

Infrastructure

Service	Compose file	Ports	Purpose
Qdrant	`docker-compose.qdrant.yaml`	6333, 6334	Vector store for dense retrieval (default)
Milvus	`docker-compose.milvus.yaml`	19530, 9091	Alternative vector store for dense retrieval
Neo4j	`docker-compose.neo4j.yaml`	7474, 7687	Graph DB for Graphiti knowledge graph
MinIO	`docker-compose.minio.yaml`	9000, 9001	Object storage (optional)

Pick one vector store backend (Qdrant or Milvus) and set VECTOR_STORE_PROVIDER accordingly in .env.

# Start with Qdrant (default)
docker compose -f infras/docker-compose.qdrant.yaml up -d

# — OR — Start with Milvus (includes etcd + internal MinIO)
docker compose -f infras/docker-compose.milvus.yaml up -d

# Graph DB and object storage
docker compose -f infras/docker-compose.neo4j.yaml up -d
docker compose -f infras/docker-compose.minio.yaml up -d

Usage

1. Ingest — Vector DB

Loads a document, chunks it, embeds, and upserts to the configured vector store (Qdrant or Milvus).

uv run python -m src.ingestion.ingest_vectordb \
  --file_path data/papers/files/docling.pdf \
  --collection_name research_papers \
  --chunk_strategy hybrid        # hybrid (default) or hierarchical

2. Ingest — Graph DB

Loads a document and adds episodes to Neo4j via Graphiti.

uv run python -m src.ingestion.ingest_graphdb \
  --file_path data/papers/files/docling.pdf

3. Retrieve — BasicRAG (interactive CLI)

Dense vector search with optional score-threshold filtering and cross-encoder reranking.

uv run python -m src.retrieval.basic_rag \
  --collection_name research_papers \
  --top_k 10
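
The two post-processing steps named above (score-threshold filtering, then reranking) can be sketched in plain Python. This is only an illustration of the idea, not the project's BasicRAG code: `Hit`, `filter_and_rerank`, and `word_overlap` are hypothetical names, and `word_overlap` is a toy stand-in for a real cross-encoder score.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Hit:
    text: str
    score: float  # similarity score returned by the vector store


def filter_and_rerank(
    query: str,
    hits: list[Hit],
    score_threshold: float,
    rerank_score: Callable[[str, str], float],
    top_k: int,
) -> list[Hit]:
    """Drop weak hits, then reorder the rest by a (query, text) relevance score."""
    kept = [h for h in hits if h.score >= score_threshold]
    ranked = sorted(kept, key=lambda h: rerank_score(query, h.text), reverse=True)
    return ranked[:top_k]


def word_overlap(query: str, text: str) -> float:
    """Toy stand-in for a cross-encoder: fraction of query words found in text."""
    q = set(query.lower().split())
    t = set(text.lower().split())
    return len(q & t) / len(q) if q else 0.0


# Made-up hits, for illustration only.
hits = [
    Hit("Qdrant stores dense vectors", 0.82),
    Hit("Docling converts PDF documents", 0.74),
    Hit("Unrelated boilerplate text", 0.31),
]
top = filter_and_rerank("docling pdf", hits, 0.5, word_overlap, top_k=2)
print([h.text for h in top])
```

The vector-store score decides which hits survive, while the reranker alone decides their final order, so a hit with a lower similarity score can still end up first.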

4. Retrieve — GraphRAG (interactive CLI)

Hybrid BM25 + cosine similarity + BFS over the knowledge graph, reranked with RRF.

uv run python -m src.retrieval.graph_rag
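
Reciprocal Rank Fusion (RRF) merges several ranked lists by summing `1 / (k + rank)` for each item across the lists. The sketch below shows the technique in isolation; the function name is hypothetical, `k = 60` is the commonly used constant and only an assumption here, and the three input lists are made-up stand-ins for BM25, cosine, and BFS results.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Fuse ranked lists of ids; items ranked high in many lists come first."""
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    return sorted(scores, key=lambda d: scores[d], reverse=True)


# Made-up result lists, one per retrieval method.
bm25 = ["a", "b", "c"]
cosine = ["b", "a", "d"]
bfs = ["b", "d", "a"]

fused = reciprocal_rank_fusion([bm25, cosine, bfs])
print(fused)
```

Because RRF uses only ranks, it can combine methods whose raw scores are not comparable, such as BM25 scores and cosine similarities.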

5. Agentic RAG (streaming, multi-turn)

A pydantic-ai agent with two tools: search_basic_rag and search_graphiti. Streams responses to the terminal with context and citation display.

uv run python -m src.agents.agentic_rag \
  --collection_name research_papers \
  --top_k 5

6. Synthesize Golden Test Cases

Generates (input, expected_output, context) test cases from documents using deepeval's Synthesizer backed by AWS Bedrock.

# --topic accepts: paper or article
uv run python -m src.synthesis.synthesize \
  --topic paper \
  --file_dir data/papers/files \
  --output_dir data/goldens
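
As a rough picture of what a golden test case holds, the sketch below models the three fields named above (input, expected_output, context) and round-trips them through JSON. The `GoldenCase` class and both helper functions are hypothetical; the files the synthesizer actually writes may carry more fields or a different layout.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class GoldenCase:
    input: str            # the question posed to the RAG system
    expected_output: str  # the reference answer
    context: list[str]    # source passages the answer is grounded in


def dump_goldens(cases: list[GoldenCase]) -> str:
    """Serialize golden cases to a JSON array."""
    return json.dumps([asdict(c) for c in cases], indent=2)


def load_goldens(raw: str) -> list[GoldenCase]:
    """Parse a JSON array back into golden cases."""
    return [GoldenCase(**item) for item in json.loads(raw)]


# Made-up example case.
cases = [
    GoldenCase(
        input="What is docling?",
        expected_output="A document conversion library.",
        context=["Docling is designed to..."],
    )
]
restored = load_goldens(dump_goldens(cases))
print(restored[0].input)
```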

7. Evaluate

Runs deepeval metrics against goldens and writes scores back to the JSON files. Results are also logged to Confident AI.

uv run python -m src.evaluation.evaluate \
  --file_dir data/goldens \
  --retrieval_window_size 5 \
  --collection_name research_papers \
  --threshold 0.5

Metrics evaluated:

AnswerRelevancy
Faithfulness
ContextualPrecision
ContextualRecall
ContextualRelevancy
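
The `--threshold` flag can be read as a pass/fail cut-off per metric. The sketch below assumes a metric passes when its score is at or above the threshold; the function name and the scores are made up for illustration and are not results from this project.

```python
def passes(scores: dict[str, float], threshold: float = 0.5) -> dict[str, bool]:
    """Map each metric name to True when its score meets the threshold."""
    return {name: score >= threshold for name, score in scores.items()}


# Hypothetical scores for a single test case.
example_scores = {
    "AnswerRelevancy": 0.91,
    "Faithfulness": 0.48,
    "ContextualPrecision": 0.50,
}
verdicts = passes(example_scores, threshold=0.5)
print(verdicts)
```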

8. FastAPI Server (HTTP + SSE)

A thin HTTP wrapper around the agentic RAG agent with SSE streaming and in-memory multi-turn sessions.

Start the server:

uv run uvicorn src.api.app:app --host 0.0.0.0 --port 8000 --reload

Base URL: http://localhost:8000
API prefix: /api/v1

Endpoints

Method	Path	Description
`GET`	`/health`	Health check
`POST`	`/api/v1/sessions`	Create a new chat session
`DELETE`	`/api/v1/sessions/{session_id}`	Delete a session
`POST`	`/api/v1/chat`	Stream a response (SSE)
`POST`	`/api/v1/completion`	Run agent and return full response (non-streaming)

Create a session

curl -X POST http://localhost:8000/api/v1/sessions \
  -H "Content-Type: application/json" \
  -d '{"collection_name": "research_papers", "top_k": 5}'

{
  "session_id": "3f2a1b...",
  "collection_name": "research_papers",
  "top_k": 5
}
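
In-memory session management of the kind described above might look like the following. This is a minimal sketch, not the contents of `session_store.py`: the `Session` and `InMemorySessionStore` names, the `history` field, and the use of `uuid4` for ids are all assumptions.

```python
import uuid
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Session:
    collection_name: str
    top_k: int
    # (role, text) pairs kept so later turns can see earlier ones.
    history: list[tuple[str, str]] = field(default_factory=list)


class InMemorySessionStore:
    """Keeps sessions in a dict; everything is lost when the process exits."""

    def __init__(self) -> None:
        self._sessions: dict[str, Session] = {}

    def create(self, collection_name: str, top_k: int) -> str:
        session_id = uuid.uuid4().hex
        self._sessions[session_id] = Session(collection_name, top_k)
        return session_id

    def get(self, session_id: str) -> Optional[Session]:
        return self._sessions.get(session_id)

    def delete(self, session_id: str) -> bool:
        """Return True if the session existed and was removed."""
        return self._sessions.pop(session_id, None) is not None


store = InMemorySessionStore()
sid = store.create("research_papers", top_k=5)
store.get(sid).history.append(("user", "What is docling?"))
print(len(store.get(sid).history))
```

A store like this is simple to reason about, but sessions do not survive a server restart and are not shared between worker processes.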

Stream a chat message

curl -X POST http://localhost:8000/api/v1/chat \
  -H "Content-Type: application/json" \
  -d '{"session_id": "3f2a1b...", "message": "What is docling?"}' \
  --no-buffer

The response is a stream of SSE events:

event: delta
data: {"text": "Docling is a "}

event: delta
data: {"text": "document conversion library..."}

event: done
data: {"contexts": ["..."], "citations": ["..."]}

Event	Payload	Description
`delta`	`{"text": "..."}`	Incremental text chunk
`done`	`{"contexts": [...], "citations": [...]}`	Stream complete; retrieved sources
`error`	`{"detail": "..."}`	Error (e.g. session not found)
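
A client has to reassemble the answer from the `delta` events and pick up sources from `done`. The sketch below parses SSE lines and handles the three event types from the table. It is a minimal parser (it ignores `id:` and `retry:` fields and comment lines), `parse_sse` is a hypothetical helper, and the sample lines mirror the example stream shown earlier instead of coming from a live server.

```python
import json
from typing import Iterable, Iterator


def parse_sse(lines: Iterable[str]) -> Iterator[tuple[str, dict]]:
    """Yield (event, payload) pairs; a blank line terminates each event."""
    event, data = "message", []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            if data:
                yield event, json.loads("\n".join(data))
            event, data = "message", []
        elif line.startswith("event:"):
            event = line[len("event:"):].strip()
        elif line.startswith("data:"):
            data.append(line[len("data:"):].strip())


sample = [
    "event: delta",
    'data: {"text": "Docling is a "}',
    "",
    "event: delta",
    'data: {"text": "document conversion library..."}',
    "",
    "event: done",
    'data: {"contexts": ["..."], "citations": ["..."]}',
    "",
]

answer = ""
sources: dict = {}
for event, payload in parse_sse(sample):
    if event == "delta":
        answer += payload["text"]      # append the incremental chunk
    elif event == "done":
        sources = payload              # contexts and citations
    elif event == "error":
        raise RuntimeError(payload["detail"])

print(answer)
```

With a real HTTP client, the same function can be fed the response body line by line as it arrives.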

Delete a session

curl -X DELETE http://localhost:8000/api/v1/sessions/3f2a1b...

Get a completion (non-streaming)

curl -X POST http://localhost:8000/api/v1/completion \
  -H "Content-Type: application/json" \
  -d '{"session_id": "3f2a1b...", "message": "What is docling?"}'

{
  "text": "Docling is a document conversion library...",
  "contexts": ["Docling is designed to..."],
  "citations": ["docling.pdf, page 3"]
}

Running Tests

uv sync --extra test

uv run pytest tests/                                          # all tests
uv run pytest tests/retrieval/test_basic_rag.py              # single file
uv run pytest tests/agents/test_agentic_rag_tools.py -v      # verbose

asyncio_mode = "auto" is set in pyproject.toml — async test functions work without decorators.
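
With that setting, a test can be written as a plain `async def` function. The example below is a sketch of such a test; `fake_search` is a hypothetical coroutine standing in for a real retrieval call, not part of this repository.

```python
import asyncio


async def fake_search(query: str, top_k: int) -> list[str]:
    """Hypothetical async retriever: substring match over a tiny corpus."""
    await asyncio.sleep(0)  # yield to the event loop, as real I/O would
    corpus = [
        "Docling converts documents",
        "Docling supports PDF",
        "Qdrant stores vectors",
    ]
    return [doc for doc in corpus if query.lower() in doc.lower()][:top_k]


# No @pytest.mark.asyncio decorator is needed when asyncio_mode = "auto".
async def test_fake_search_respects_top_k():
    results = await fake_search("docling", top_k=1)
    assert len(results) == 1
    assert "docling" in results[0].lower()
```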

Configuration

Copy .env.example to .env and fill in the values below.

Variable	Group	Description
`LLM_MODEL`	LLM	Model name for generation
`LLM_API_KEY`	LLM	API key for OpenAI-compatible endpoint
`LLM_BASE_URL`	LLM	Base URL for OpenAI-compatible endpoint
`EMBEDDING_MODEL`	Embedding	Embedding model name
`EMBEDDING_API_KEY`	Embedding	API key for embedding endpoint
`EMBEDDING_BASE_URL`	Embedding	Base URL for embedding endpoint
`EMBEDDING_DIMENSIONS`	Embedding	Vector dimensionality
`VECTOR_STORE_PROVIDER`	Vector Store	`qdrant` (default) or `milvus`
`QDRANT_URI`	Qdrant	Qdrant server URI (e.g. `http://localhost:6333`)
`QDRANT_API_KEY`	Qdrant	Qdrant API key (optional)
`QDRANT_COLLECTION_NAME`	Qdrant	Default collection name
`MILVUS_URI`	Milvus	Milvus server URI (e.g. `http://localhost:19530`)
`MILVUS_TOKEN`	Milvus	Milvus auth token (optional)
`MILVUS_COLLECTION_NAME`	Milvus	Default collection name
`GRAPH_DB_URI`	Neo4j	Neo4j Bolt URI (e.g. `bolt://localhost:7687`)
`GRAPH_DB_USERNAME`	Neo4j	Neo4j username
`GRAPH_DB_PASSWORD`	Neo4j	Neo4j password
`AWS_ACCESS_KEY_ID`	AWS Bedrock	AWS access key
`AWS_SECRET_ACCESS_KEY`	AWS Bedrock	AWS secret key
`CRITIQUE_MODEL_NAME`	AWS Bedrock	Bedrock model ID for synthesis/eval
`CRITIQUE_MODEL_REGION_NAME`	AWS Bedrock	AWS region (e.g. `us-east-1`)
`CONFIDENT_API_KEY`	deepeval	Confident AI API key for result logging
`LANGFUSE_PUBLIC_KEY`	Langfuse	Langfuse public key (observability)
`LANGFUSE_SECRET_KEY`	Langfuse	Langfuse secret key
`LANGFUSE_BASE_URL`	Langfuse	Langfuse server URL
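
Before starting any pipeline stage, it can help to check that the variables for the chosen vector store are present. The sketch below does this with the standard library only. It is not the project's `ProjectSettings` class, and which variables count as required is an assumption made here for illustration.

```python
import os
from typing import Mapping

# Assumed to be required regardless of the vector store (illustrative choice).
COMMON = (
    "LLM_MODEL", "LLM_API_KEY", "LLM_BASE_URL",
    "EMBEDDING_MODEL", "EMBEDDING_API_KEY", "EMBEDDING_BASE_URL",
    "EMBEDDING_DIMENSIONS",
    "GRAPH_DB_URI", "GRAPH_DB_USERNAME", "GRAPH_DB_PASSWORD",
)
# Only the URI is checked per provider; API key and token are optional.
BY_PROVIDER = {"qdrant": ("QDRANT_URI",), "milvus": ("MILVUS_URI",)}


def missing_settings(env: Mapping[str, str]) -> list[str]:
    """Return the names of required variables that are unset or empty."""
    provider = env.get("VECTOR_STORE_PROVIDER", "qdrant").lower()
    if provider not in BY_PROVIDER:
        raise ValueError(f"VECTOR_STORE_PROVIDER must be qdrant or milvus, got {provider!r}")
    required = COMMON + BY_PROVIDER[provider]
    return [name for name in required if not env.get(name)]


if __name__ == "__main__":
    print(missing_settings(os.environ))
```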

Tech Stack

Library	Role
FastAPI	HTTP API framework with SSE streaming
sse-starlette	Server-Sent Events support for Starlette/FastAPI
pydantic-ai	Agentic RAG orchestration and tool calling
deepeval	Synthetic data generation and RAG evaluation
Qdrant	Vector database for dense retrieval
Milvus	Alternative vector database for dense retrieval
Graphiti	Knowledge graph construction and retrieval over Neo4j
docling	Document loading and conversion
sentence-transformers	Cross-encoder reranking
loguru	Structured logging
pydantic-settings	Settings management from `.env`
uv	Package and environment management

Contributing

Fork the repository and create a feature branch: git checkout -b features/my-feature
Make your changes and add tests where appropriate
Open a pull request with a clear description of what changed and why

License

MIT License — see LICENSE for details.
