NyRAG (pronounced knee-RAG) is a simple tool for building RAG applications: it crawls websites or processes documents, then deploys to Vespa for hybrid search with an integrated chat UI.
When a user asks a question, NyRAG performs a multi-stage retrieval process:
- Query Enhancement: An LLM generates additional search queries based on the user's question and initial context to improve retrieval coverage
- Embedding Generation: Each query is converted to embeddings using the configured SentenceTransformer model
- Vespa Search: Queries are executed against Vespa using nearestNeighbor search with the `best_chunk_score` ranking profile to find the most relevant document chunks
- Chunk Fusion: Results from all queries are aggregated, deduplicated, and ranked by score to select the top-k most relevant chunks
- Answer Generation: The retrieved context is sent to an LLM (via OpenRouter) which generates a grounded answer based only on the provided chunks
This multi-query RAG approach with chunk-level retrieval ensures answers are comprehensive and grounded in your actual content, whether from crawled websites or processed documents.
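The fusion step above can be sketched in a few lines of Python. This is an illustrative sketch with hypothetical data shapes, not NyRAG's actual internals: each query's hits are merged, deduplicated by chunk id (keeping the best score), and the top-k survive.

```python
def fuse_chunks(results_per_query, top_k=5):
    """Merge scored chunks from several queries, dedupe by chunk id,
    and return the top_k highest-scoring (chunk_id, score, text) tuples."""
    best = {}
    for results in results_per_query:
        for chunk_id, score, text in results:
            # Keep only the best score seen for each chunk.
            if chunk_id not in best or score > best[chunk_id][0]:
                best[chunk_id] = (score, text)
    ranked = sorted(best.items(), key=lambda kv: kv[1][0], reverse=True)
    return [(cid, score, text) for cid, (score, text) in ranked[:top_k]]

fused = fuse_chunks([
    [("doc1#0", 0.91, "chunk A"), ("doc2#3", 0.55, "chunk B")],
    [("doc1#0", 0.70, "chunk A"), ("doc3#1", 0.80, "chunk C")],
], top_k=2)
# keeps doc1#0 (best score 0.91) and doc3#1 (0.80)
```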
```
pip install nyrag
```

We recommend uv:

```
uv init --python 3.10
uv venv
uv sync
source .venv/bin/activate
uv pip install -U nyrag
```

For development:

```
git clone https://github.com/abhishekkrthakur/nyrag.git
cd nyrag
pip install -e .
```

nyrag operates in two deployment modes (Local or Cloud) and two data modes (Web or Docs):
| Deployment | Data Mode | Description |
|---|---|---|
| Local | Web | Crawl websites → Local Vespa Docker |
| Local | Docs | Process documents → Local Vespa Docker |
| Cloud | Web | Crawl websites → Vespa Cloud |
| Cloud | Docs | Process documents → Vespa Cloud |
Runs Vespa in a local Docker container. Great for development and testing.
```
export NYRAG_LOCAL=1
nyrag --config configs/example.yml
```

Example config for web crawling:

```yaml
name: mywebsite
mode: web
start_loc: https://example.com/
exclude:
  - https://example.com/admin/*
  - https://example.com/private/*
crawl_params:
  respect_robots_txt: true
  follow_subdomains: true
  user_agent_type: chrome
rag_params:
  embedding_model: sentence-transformers/all-MiniLM-L6-v2
  chunk_size: 1024
  chunk_overlap: 50
```

To process documents instead:

```
export NYRAG_LOCAL=1
nyrag --config configs/doc_example.yml
```

Example config for document processing:
```yaml
name: mydocs
mode: docs
start_loc: /path/to/documents/
exclude:
  - "*.csv"
doc_params:
  recursive: true
  file_extensions:
    - .pdf
    - .docx
    - .txt
    - .md
rag_params:
  embedding_model: sentence-transformers/all-mpnet-base-v2
  chunk_size: 512
  chunk_overlap: 50
```

After crawling/processing is complete:

```
export NYRAG_CONFIG=configs/example.yml
export OPENROUTER_API_KEY=your-api-key
export OPENROUTER_MODEL=openai/gpt-5.1
uvicorn nyrag.api:app --host 0.0.0.0 --port 8000
```

Open http://localhost:8000/chat
Deploys to Vespa Cloud for production use.
```
export NYRAG_LOCAL=0
export VESPA_CLOUD_TENANT=your-tenant
nyrag --config configs/example.yml
```

To process documents instead:

```
export NYRAG_LOCAL=0
export VESPA_CLOUD_TENANT=your-tenant
nyrag --config configs/doc_example.yml
```

After crawling/processing is complete:

```
export NYRAG_CONFIG=configs/example.yml
export VESPA_URL="https://<your-endpoint>.z.vespa-app.cloud"
export OPENROUTER_API_KEY=your-api-key
export OPENROUTER_MODEL=openai/gpt-5.1
uvicorn nyrag.api:app --host 0.0.0.0 --port 8000
```

Open http://localhost:8000/chat
| Parameter | Type | Default | Description |
|---|---|---|---|
| `respect_robots_txt` | bool | `true` | Respect robots.txt rules |
| `aggressive_crawl` | bool | `false` | Faster crawling with more concurrent requests |
| `follow_subdomains` | bool | `true` | Follow links to subdomains |
| `strict_mode` | bool | `false` | Only crawl URLs matching start pattern |
| `user_agent_type` | str | `chrome` | One of `chrome`, `firefox`, `safari`, `mobile`, `bot` |
| `custom_user_agent` | str | `None` | Custom user agent string |
| `allowed_domains` | list | `None` | Explicitly allowed domains |
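The `respect_robots_txt` flag follows standard robots.txt semantics. Python's standard library illustrates the check a compliant crawler performs (a sketch of the convention, not NyRAG's internals):

```python
from urllib.robotparser import RobotFileParser

# Parse a robots.txt body directly (a crawler normally fetches it
# from the site root before requesting any page).
rp = RobotFileParser()
rp.parse("""User-agent: *
Disallow: /admin/
""".splitlines())

rp.can_fetch("*", "https://example.com/admin/users")  # disallowed -> False
rp.can_fetch("*", "https://example.com/blog/post")    # allowed -> True
```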
| Parameter | Type | Default | Description |
|---|---|---|---|
| `recursive` | bool | `true` | Process subdirectories |
| `include_hidden` | bool | `false` | Include hidden files |
| `follow_symlinks` | bool | `false` | Follow symbolic links |
| `max_file_size_mb` | float | `None` | Max file size in MB |
| `file_extensions` | list | `None` | Only process these extensions |
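The combined effect of these filters can be sketched with `pathlib`. The function name and signature below are illustrative, not NyRAG's API:

```python
from pathlib import Path

def iter_docs(root, file_extensions=None, recursive=True,
              include_hidden=False, max_file_size_mb=None):
    """Yield files under root, applying doc_params-style filters (sketch)."""
    pattern = "**/*" if recursive else "*"
    for p in sorted(Path(root).glob(pattern)):
        if not p.is_file():
            continue
        # Skip hidden files/directories unless explicitly included.
        if not include_hidden and any(
                part.startswith(".") for part in p.relative_to(root).parts):
            continue
        if file_extensions is not None and p.suffix.lower() not in file_extensions:
            continue
        if max_file_size_mb is not None and p.stat().st_size > max_file_size_mb * 1024 * 1024:
            continue
        yield p
```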
| Parameter | Type | Default | Description |
|---|---|---|---|
| `embedding_model` | str | `sentence-transformers/all-MiniLM-L6-v2` | Embedding model |
| `embedding_dim` | int | `384` | Embedding dimension |
| `chunk_size` | int | `1024` | Chunk size for text splitting |
| `chunk_overlap` | int | `50` | Overlap between chunks |
| `distance_metric` | str | `angular` | Distance metric |
| `max_tokens` | int | `8192` | Max tokens per document |
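How `chunk_size` and `chunk_overlap` interact can be sketched with a character-level splitter (the real splitter may work on tokens; this only shows the overlap mechanic):

```python
def chunk_text(text, chunk_size=1024, chunk_overlap=50):
    """Split text into windows of chunk_size characters where each chunk
    repeats the last chunk_overlap characters of the previous one."""
    step = chunk_size - chunk_overlap
    return [text[i:i + chunk_size]
            for i in range(0, max(len(text) - chunk_overlap, 1), step)]

chunks = chunk_text("abcdefghij" * 10, chunk_size=40, chunk_overlap=10)
# 100 chars -> windows starting at 0, 30, 60; adjacent chunks share 10 chars
```

The overlap means a sentence falling on a chunk boundary is still seen whole in at least one chunk, at the cost of some duplicated text in the index.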
| Variable | Description |
|---|---|
| `NYRAG_LOCAL` | `1` for local Docker, `0` for Vespa Cloud |

| Variable | Description |
|---|---|
| `NYRAG_VESPA_DOCKER_IMAGE` | Docker image (default: `vespaengine/vespa:latest`) |

| Variable | Description |
|---|---|
| `VESPA_CLOUD_TENANT` | Your Vespa Cloud tenant |
| `VESPA_CLOUD_APPLICATION` | Application name (optional) |
| `VESPA_CLOUD_INSTANCE` | Instance name (default: `default`) |
| `VESPA_CLOUD_API_KEY_PATH` | Path to API key file |
| `VESPA_CLIENT_CERT` | Path to mTLS certificate |
| `VESPA_CLIENT_KEY` | Path to mTLS private key |

| Variable | Description |
|---|---|
| `NYRAG_CONFIG` | Path to config file |
| `VESPA_URL` | Vespa endpoint URL (optional for local, required for cloud) |
| `OPENROUTER_API_KEY` | OpenRouter API key for LLM |
| `OPENROUTER_MODEL` | LLM model (e.g., `openai/gpt-4o`) |
