An end-to-end pipeline for building, evaluating, and improving a Retrieval-Augmented Generation system.
newbieAR is a self-contained RAG research platform that covers every stage of the RAG lifecycle: document ingestion into both a vector store and a knowledge graph, hybrid retrieval, an agentic layer with tool-calling, synthetic test-case generation, and automated metric-based evaluation. It is designed to be a hands-on learning project and a starting point for experimenting with modern RAG architectures — combining dense vector search (Qdrant or Milvus) with graph-based retrieval (Neo4j via Graphiti), orchestrated by a pydantic-ai agent and evaluated with deepeval.
flowchart TD
A[Raw Documents] --> B[Ingestion]
B --> C[(Qdrant / Milvus\nVector DB)]
B --> D[(Neo4j\nGraph DB)]
C --> E[BasicRAG\nvector search + reranking]
D --> F[GraphRAG\nhybrid BM25 + cosine + BFS]
E --> G[Agentic RAG\npydantic-ai]
F --> G
G --> H[Response + Citations]
A --> I[Synthesis\ndeepeval Synthesizer]
I --> J[Golden Test Cases\nJSON]
J --> K[Evaluation\ndeepeval Metrics]
H --> K
K --> L[Confident AI Dashboard]
newbieAR/
├── src/
│ ├── api/ # FastAPI layer (HTTP + SSE streaming)
│ │ ├── app.py # FastAPI factory, lifespan, CORS
│ │ ├── schemas.py # Pydantic request/response models
│ │ ├── session_store.py # In-memory session management
│ │ └── routers/
│ │     ├── sessions.py # POST/DELETE /sessions
│ │     └── chat.py # POST /chat → SSE stream
│ ├── agents/ # pydantic-ai agentic RAG (agent, tools, deps)
│ ├── deps/ # infra clients: Qdrant, Milvus, Graphiti, OpenAI, CrossEncoder, MinIO
│ ├── evaluation/ # deepeval metrics runner + Bedrock wrapper
│ ├── ingestion/ # vector DB and graph DB ingestion pipelines
│ ├── models/ # pydantic data models (ChunkInfo, RetrievalInfo, etc.)
│ ├── prompts/ # system prompts and generation templates
│ ├── retrieval/ # BasicRAG and GraphRAG implementations
│ ├── synthesis/ # deepeval Synthesizer + golden test case generation
│ └── settings.py # ProjectSettings singleton (pydantic-settings)
├── infras/
│ ├── docker-compose.qdrant.yaml
│ ├── docker-compose.milvus.yaml
│ ├── docker-compose.neo4j.yaml
│ └── docker-compose.minio.yaml
├── tests/ # pytest test suite (asyncio_mode = auto)
├── scripts/ # convenience shell scripts
├── data/ # documents, goldens (git-ignored)
└── pyproject.toml
- Python >= 3.12
- uv: the only supported package manager
- Docker: for Qdrant (or Milvus), Neo4j, and MinIO
- API keys: OpenAI-compatible LLM and embedding endpoint, AWS credentials (Bedrock), deepeval Confident AI key, Langfuse (optional)
# 1. Clone the repo
git clone https://github.com/your-username/newbieAR.git
cd newbieAR
# 2. Install dependencies
uv sync
# 3. Configure environment
cp .env.example .env
# Fill in all required values — see Configuration section below
# 4. Start infrastructure (pick one vector store)
docker compose -f infras/docker-compose.qdrant.yaml up -d # Qdrant (default)
# docker compose -f infras/docker-compose.milvus.yaml up -d # or Milvus
docker compose -f infras/docker-compose.neo4j.yaml up -d
# 5. Ingest documents
uv run python -m src.ingestion.ingest_vectordb \
--file_path data/papers/files/docling.pdf \
--collection_name research_papers \
--chunk_strategy hybrid
# 6. Ask a question
uv run python -m src.agents.agentic_rag \
--collection_name research_papers --top_k 5

| Service | Compose file | Ports | Purpose |
|---|---|---|---|
| Qdrant | docker-compose.qdrant.yaml | 6333, 6334 | Vector store for dense retrieval (default) |
| Milvus | docker-compose.milvus.yaml | 19530, 9091 | Alternative vector store for dense retrieval |
| Neo4j | docker-compose.neo4j.yaml | 7474, 7687 | Graph DB for Graphiti knowledge graph |
| MinIO | docker-compose.minio.yaml | 9000, 9001 | Object storage (optional) |

Pick one vector store backend (Qdrant or Milvus) and set VECTOR_STORE_PROVIDER accordingly in .env.
# Start with Qdrant (default)
docker compose -f infras/docker-compose.qdrant.yaml up -d
# — OR — Start with Milvus (includes etcd + internal MinIO)
docker compose -f infras/docker-compose.milvus.yaml up -d
# Graph DB and object storage
docker compose -f infras/docker-compose.neo4j.yaml up -d
docker compose -f infras/docker-compose.minio.yaml up -d

Loads a document, splits it into chunks, embeds each chunk, and upserts the results into the configured vector store (Qdrant or Milvus).
uv run python -m src.ingestion.ingest_vectordb \
--file_path data/papers/files/docling.pdf \
--collection_name research_papers \
--chunk_strategy hybrid # hybrid (default) or hierarchical

Loads a document and adds episodes to Neo4j via Graphiti.
uv run python -m src.ingestion.ingest_graphdb \
--file_path data/papers/files/docling.pdf

Dense vector search with optional score-threshold filtering and cross-encoder reranking.
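The flow described above (score every chunk, drop low scores, keep the top results, then optionally rerank) can be illustrated with a minimal, dependency-free sketch. It is not the repository's BasicRAG code: `dense_search`, its parameters, and the `rerank` callable (a stand-in for a cross-encoder already bound to the query) are hypothetical names chosen for illustration.

```python
import math
from typing import Callable, Optional, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def dense_search(
    query_vec: Sequence[float],
    chunks: Sequence[tuple[str, Sequence[float]]],
    top_k: int = 10,
    score_threshold: Optional[float] = None,
    rerank: Optional[Callable[[str], float]] = None,
) -> list[tuple[str, float]]:
    """Return up to top_k (text, score) pairs, best first.

    chunks is a list of (text, embedding) pairs. rerank is a hypothetical
    stand-in for a cross-encoder that scores a chunk against the query.
    """
    scored = [(text, cosine(query_vec, vec)) for text, vec in chunks]
    if score_threshold is not None:
        # Drop chunks whose similarity falls below the threshold.
        scored = [(t, s) for t, s in scored if s >= score_threshold]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    hits = scored[:top_k]
    if rerank is not None:
        # Replace similarity scores with reranker scores and re-sort.
        hits = sorted(((t, rerank(t)) for t, _ in hits),
                      key=lambda pair: pair[1], reverse=True)
    return hits


chunks = [
    ("docling converts PDFs", [1.0, 0.0]),
    ("graph traversal", [0.0, 1.0]),
    ("PDF layout parsing", [0.8, 0.6]),
]
hits = dense_search([1.0, 0.0], chunks, top_k=10, score_threshold=0.5)
reranked = dense_search([1.0, 0.0], chunks, score_threshold=0.5,
                        rerank=lambda text: float("layout" in text))
```

In a real deployment the vector store performs the similarity search, so only the thresholding and reranking steps would typically run in application code. The repository's own BasicRAG entry point is invoked as: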
uv run python -m src.retrieval.basic_rag \
--collection_name research_papers \
--top_k 10

Hybrid BM25 + cosine similarity + BFS over the knowledge graph, reranked with RRF.
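Reciprocal Rank Fusion (RRF) merges several ranked lists by summing 1 / (k + rank) for each item across the lists. The sketch below shows the general technique only; the function name, the input lists, and the constant k = 60 (a commonly used default) are assumptions, and the actual fusion in this project is performed inside Graphiti.

```python
from collections import defaultdict
from typing import Hashable, Iterable, Sequence


def reciprocal_rank_fusion(
    rankings: Iterable[Sequence[Hashable]], k: int = 60
) -> list[tuple[Hashable, float]]:
    """Fuse ranked lists: each item scores sum(1 / (k + rank)) over the lists."""
    scores: dict[Hashable, float] = defaultdict(float)
    for ranking in rankings:
        for rank, item in enumerate(ranking, start=1):  # ranks are 1-based
            scores[item] += 1.0 / (k + rank)
    # Highest fused score first.
    return sorted(scores.items(), key=lambda pair: pair[1], reverse=True)


# Hypothetical result lists from the three retrieval signals.
bm25_hits = ["a", "b", "c"]
cosine_hits = ["b", "a", "d"]
bfs_hits = ["b", "c"]

fused = reciprocal_rank_fusion([bm25_hits, cosine_hits, bfs_hits])
fused_order = [item for item, _ in fused]
```

Because RRF uses only ranks, it needs no score normalization across BM25, cosine similarity, and graph traversal. The repository's own GraphRAG entry point is invoked as: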
uv run python -m src.retrieval.graph_rag

A pydantic-ai agent with two tools: search_basic_rag and search_graphiti. It streams responses to the terminal and displays the retrieved contexts and citations.
uv run python -m src.agents.agentic_rag \
--collection_name research_papers \
--top_k 5

Generates (input, expected_output, context) test cases from documents using deepeval's Synthesizer backed by AWS Bedrock.
# --topic accepts: paper or article
uv run python -m src.synthesis.synthesize \
--topic paper \
--file_dir data/papers/files \
--output_dir data/goldens

Runs deepeval metrics against goldens and writes scores back to the JSON files. Results are also logged to Confident AI.
uv run python -m src.evaluation.evaluate \
--file_dir data/goldens \
--retrieval_window_size 5 \
--collection_name research_papers \
--threshold 0.5

Metrics evaluated:
- AnswerRelevancy
- Faithfulness
- ContextualPrecision
- ContextualRecall
- ContextualRelevancy

A thin HTTP wrapper around the agentic RAG agent with SSE streaming and in-memory multi-turn sessions.
Start the server:
uvicorn src.api.app:app --host 0.0.0.0 --port 8000 --reload

Base URL: http://localhost:8000
API prefix: /api/v1

| Method | Path | Description |
|---|---|---|
| GET | /health | Health check |
| POST | /api/v1/sessions | Create a new chat session |
| DELETE | /api/v1/sessions/{session_id} | Delete a session |
| POST | /api/v1/chat | Stream a response (SSE) |
| POST | /api/v1/completion | Run agent and return full response (non-streaming) |

curl -X POST http://localhost:8000/api/v1/sessions \
-H "Content-Type: application/json" \
-d '{"collection_name": "research_papers", "top_k": 5}'

{
  "session_id": "3f2a1b...",
  "collection_name": "research_papers",
  "top_k": 5
}

curl -X POST http://localhost:8000/api/v1/chat \
-H "Content-Type: application/json" \
-d '{"session_id": "3f2a1b...", "message": "What is docling?"}' \
--no-buffer

The response is a stream of SSE events:
event: delta
data: {"text": "Docling is a "}

event: delta
data: {"text": "document conversion library..."}

event: done
data: {"contexts": ["..."], "citations": ["..."]}

| Event | Payload | Description |
|---|---|---|
| delta | {"text": "..."} | Incremental text chunk |
| done | {"contexts": [...], "citations": [...]} | Stream complete; retrieved sources |
| error | {"detail": "..."} | Error (e.g. session not found) |
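A client has to reassemble the streamed answer from these events. The following is a minimal sketch of how that might look using only the standard library; `iter_sse_events` and `collect_response` are hypothetical helper names, and the input is assumed to be an iterable of text lines such as those produced by reading the HTTP response line by line.

```python
import json
from typing import Iterable, Iterator


def iter_sse_events(lines: Iterable[str]) -> Iterator[tuple[str, dict]]:
    """Yield (event_name, json_payload) for each blank-line-terminated event."""
    event, data = "message", []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line dispatches the event collected so far.
            if data:
                yield event, json.loads("\n".join(data))
            event, data = "message", []
        elif line.startswith(":"):
            continue  # SSE comment / keep-alive line
        elif line.startswith("event:"):
            event = line[len("event:"):].strip()
        elif line.startswith("data:"):
            # Surrounding whitespace is stripped; safe for JSON payloads.
            data.append(line[len("data:"):].strip())
    if data:  # stream ended without a trailing blank line
        yield event, json.loads("\n".join(data))


def collect_response(lines: Iterable[str]) -> dict:
    """Concatenate delta chunks and attach the sources from the done event."""
    parts, contexts, citations = [], [], []
    for event, payload in iter_sse_events(lines):
        if event == "delta":
            parts.append(payload["text"])
        elif event == "done":
            contexts = payload.get("contexts", [])
            citations = payload.get("citations", [])
            break
        elif event == "error":
            raise RuntimeError(payload.get("detail", "unknown error"))
    return {"text": "".join(parts), "contexts": contexts, "citations": citations}


sample_stream = [
    "event: delta", 'data: {"text": "Docling is a "}', "",
    "event: delta", 'data: {"text": "document conversion library..."}', "",
    "event: done", 'data: {"contexts": ["..."], "citations": ["..."]}', "",
]
result = collect_response(sample_stream)
```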
curl -X DELETE http://localhost:8000/api/v1/sessions/3f2a1b...

curl -X POST http://localhost:8000/api/v1/completion \
-H "Content-Type: application/json" \
-d '{"session_id": "3f2a1b...", "message": "What is docling?"}'

{
  "text": "Docling is a document conversion library...",
  "contexts": ["Docling is designed to..."],
  "citations": ["docling.pdf, page 3"]
}

uv sync --extra test
uv run pytest tests/ # all tests
uv run pytest tests/retrieval/test_basic_rag.py # single file
uv run pytest tests/agents/test_agentic_rag_tools.py -v # verbose

asyncio_mode = "auto" is set in pyproject.toml, so async test functions run without the @pytest.mark.asyncio decorator.
Copy .env.example to .env and fill in the values below.

| Variable | Group | Description |
|---|---|---|
| LLM_MODEL | LLM | Model name for generation |
| LLM_API_KEY | LLM | API key for OpenAI-compatible endpoint |
| LLM_BASE_URL | LLM | Base URL for OpenAI-compatible endpoint |
| EMBEDDING_MODEL | Embedding | Embedding model name |
| EMBEDDING_API_KEY | Embedding | API key for embedding endpoint |
| EMBEDDING_BASE_URL | Embedding | Base URL for embedding endpoint |
| EMBEDDING_DIMENSIONS | Embedding | Vector dimensionality |
| VECTOR_STORE_PROVIDER | Vector Store | qdrant (default) or milvus |
| QDRANT_URI | Qdrant | Qdrant server URI (e.g. http://localhost:6333) |
| QDRANT_API_KEY | Qdrant | Qdrant API key (optional) |
| QDRANT_COLLECTION_NAME | Qdrant | Default collection name |
| MILVUS_URI | Milvus | Milvus server URI (e.g. http://localhost:19530) |
| MILVUS_TOKEN | Milvus | Milvus auth token (optional) |
| MILVUS_COLLECTION_NAME | Milvus | Default collection name |
| GRAPH_DB_URI | Neo4j | Neo4j Bolt URI (e.g. bolt://localhost:7687) |
| GRAPH_DB_USERNAME | Neo4j | Neo4j username |
| GRAPH_DB_PASSWORD | Neo4j | Neo4j password |
| AWS_ACCESS_KEY_ID | AWS Bedrock | AWS access key |
| AWS_SECRET_ACCESS_KEY | AWS Bedrock | AWS secret key |
| CRITIQUE_MODEL_NAME | AWS Bedrock | Bedrock model ID for synthesis/eval |
| CRITIQUE_MODEL_REGION_NAME | AWS Bedrock | AWS region (e.g. us-east-1) |
| CONFIDENT_API_KEY | deepeval | Confident AI API key for result logging |
| LANGFUSE_PUBLIC_KEY | Langfuse | Langfuse public key (observability) |
| LANGFUSE_SECRET_KEY | Langfuse | Langfuse secret key |
| LANGFUSE_BASE_URL | Langfuse | Langfuse server URL |
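The table implies that only one group of vector store variables is read, depending on VECTOR_STORE_PROVIDER. The sketch below illustrates that selection logic with the standard library only. It is not the project's ProjectSettings class (which is built on pydantic-settings); `VectorStoreConfig` and `load_vector_store_config` are hypothetical names, and treating the URI and collection name as required is an assumption.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class VectorStoreConfig:
    """Hypothetical container for the active vector store's settings."""
    provider: str
    uri: str
    collection_name: str
    credential: Optional[str] = None  # API key or token; optional per the table


def load_vector_store_config(env: Mapping[str, str] = os.environ) -> VectorStoreConfig:
    """Pick the Qdrant or Milvus variable group based on VECTOR_STORE_PROVIDER."""
    provider = env.get("VECTOR_STORE_PROVIDER", "qdrant").lower()  # qdrant is the default
    if provider == "qdrant":
        return VectorStoreConfig(
            provider,
            env["QDRANT_URI"],
            env["QDRANT_COLLECTION_NAME"],
            env.get("QDRANT_API_KEY") or None,
        )
    if provider == "milvus":
        return VectorStoreConfig(
            provider,
            env["MILVUS_URI"],
            env["MILVUS_COLLECTION_NAME"],
            env.get("MILVUS_TOKEN") or None,
        )
    raise ValueError(f"Unsupported VECTOR_STORE_PROVIDER: {provider!r}")


# Example with an explicit mapping instead of the real environment.
qdrant_cfg = load_vector_store_config({
    "QDRANT_URI": "http://localhost:6333",
    "QDRANT_COLLECTION_NAME": "research_papers",
})
milvus_cfg = load_vector_store_config({
    "VECTOR_STORE_PROVIDER": "milvus",
    "MILVUS_URI": "http://localhost:19530",
    "MILVUS_COLLECTION_NAME": "research_papers",
})
```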
| Library | Role |
|---|---|
| FastAPI | HTTP API framework with SSE streaming |
| sse-starlette | Server-Sent Events support for Starlette/FastAPI |
| pydantic-ai | Agentic RAG orchestration and tool calling |
| deepeval | Synthetic data generation and RAG evaluation |
| Qdrant | Vector database for dense retrieval |
| Milvus | Alternative vector database for dense retrieval |
| Graphiti | Knowledge graph construction and retrieval over Neo4j |
| docling | Document loading and conversion |
| sentence-transformers | Cross-encoder reranking |
| loguru | Structured logging |
| pydantic-settings | Settings management from .env |
| uv | Package and environment management |
- Fork the repository and create a feature branch: git checkout -b features/my-feature
- Make your changes and add tests where appropriate
- Open a pull request with a clear description of what changed and why
MIT License — see LICENSE for details.