QuickRAG


A fast RAG tool that indexes documents using your choice of embedding provider and stores them in LanceDB for efficient similarity search.

Quick Example

# Create a config
$ quickrag init

# Index documents
$ quickrag index gutenberg/ --output gutenberg.rag
✔ Parsing documents from gutenberg/... (using recursive-token chunker)
✔ Detecting embedding dimensions
✔ Initializing database
✔ Finding files to index
✔ Removing deleted files from index
✔ Preparing for indexing
✔ Indexing files
✔ Finalizing
Indexing complete! Processed 622 chunks across 2 files. Removed 1 deleted file.
Added 619 new chunks (3 already existed). Total chunks in database: 619

# Search
$ quickrag query gutenberg.rag "Who is Sherlock Holmes?"

Features

  • Multiple embedding providers (VoyageAI, OpenAI, Ollama)
  • Token-based recursive chunking (default) or character-based chunking
  • LanceDB vector storage with persistent .rag files
  • Idempotent indexing (tracks indexed files, skips unchanged)
  • Automatic cleanup of deleted files from index
  • UTF-8 sanitization for PDF conversions
  • TypeScript & Bun

Installation

Homebrew (macOS)

brew install statico/quickrag/quickrag

Download Binary

# macOS (Apple Silicon)
curl -L https://github.com/statico/quickrag/releases/latest/download/quickrag-darwin-arm64 -o /usr/local/bin/quickrag
chmod +x /usr/local/bin/quickrag

# macOS (Intel)
curl -L https://github.com/statico/quickrag/releases/latest/download/quickrag-darwin-x64 -o /usr/local/bin/quickrag
chmod +x /usr/local/bin/quickrag

# Linux (ARM64)
curl -L https://github.com/statico/quickrag/releases/latest/download/quickrag-linux-arm64 -o /usr/local/bin/quickrag
chmod +x /usr/local/bin/quickrag

# Linux (x64)
curl -L https://github.com/statico/quickrag/releases/latest/download/quickrag-linux-x64 -o /usr/local/bin/quickrag
chmod +x /usr/local/bin/quickrag

Note: macOS binaries are not codesigned. You may need to run xattr -d com.apple.quarantine /usr/local/bin/quickrag to bypass Gatekeeper.

Build from Source

Requires Bun.

git clone https://github.com/statico/quickrag.git
cd quickrag
bun install
bun run dev --help

Quick Start

1. Initialize Configuration

quickrag init

This creates ~/.config/quickrag/config.yaml:

provider: ollama
model: nomic-embed-text
baseUrl: http://localhost:11434
chunking:
  strategy: recursive-token
  chunkSize: 500
  chunkOverlap: 50
  minChunkSize: 50

2. Configure Settings

Edit ~/.config/quickrag/config.yaml to set API keys and preferences:

provider: openai
apiKey: sk-your-key-here
model: text-embedding-3-small
chunking:
  strategy: recursive-token
  chunkSize: 500
  chunkOverlap: 50
  minChunkSize: 50

3. Index Documents

quickrag index ./documents --output my-docs.rag

4. Query

quickrag query my-docs.rag "What is the main topic?"

Configuration

Configuration Options:

  • provider: Embedding provider (openai, voyageai, or ollama)
  • apiKey: API key (can also use environment variables)
  • model: Model name for the embedding provider
  • baseUrl: Base URL for Ollama (default: http://localhost:11434)
  • chunking.strategy: recursive-token (default) or simple
  • chunking.chunkSize: Tokens (for recursive-token, default: 500) or characters (for simple, default: 1000)
  • chunking.chunkOverlap: Tokens (for recursive-token, default: 50) or characters (for simple, default: 200)
  • chunking.minChunkSize: Minimum chunk size in tokens (default: 50). Chunks smaller than this are filtered out to prevent tiny fragments.
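For instance, a config selecting the character-based simple strategy might look like this (values are the simple-strategy defaults listed above):

```yaml
provider: ollama
model: nomic-embed-text
baseUrl: http://localhost:11434
chunking:
  strategy: simple
  chunkSize: 1000    # characters for the simple strategy
  chunkOverlap: 200  # characters
```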

Chunking Strategies

Recursive Token Chunker (Default)

Token-based splitting that respects semantic boundaries: it splits at paragraph breaks, then line breaks, then sentence endings, and finally word boundaries. Chunks are sized by estimated tokens (default: 500), which aligns with embedding model expectations, and maintain a configurable overlap (default: 50 tokens, ~10%).
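The strategy can be sketched in TypeScript roughly as follows. This is an illustration only, not QuickRAG's actual implementation: overlap handling is omitted for brevity, and tokens are crudely estimated as characters / 4.

```typescript
// Illustrative sketch of recursive token chunking: split on progressively
// finer separators (paragraphs -> lines -> sentences -> words) until each
// piece fits the token budget. Tokens are approximated as chars / 4.
const SEPARATORS = ["\n\n", "\n", ". ", " "];

function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

function recursiveSplit(text: string, chunkSize: number, level = 0): string[] {
  if (estimateTokens(text) <= chunkSize) return [text];
  if (level >= SEPARATORS.length) {
    // No separators left: hard-split by the character budget.
    const step = chunkSize * 4;
    const out: string[] = [];
    for (let i = 0; i < text.length; i += step) out.push(text.slice(i, i + step));
    return out;
  }
  const parts = text.split(SEPARATORS[level]);
  const chunks: string[] = [];
  let current = "";
  for (const part of parts) {
    const candidate = current ? current + SEPARATORS[level] + part : part;
    if (estimateTokens(candidate) > chunkSize && current) {
      // Flush the accumulated piece, recursing in case it is still too big.
      chunks.push(...recursiveSplit(current, chunkSize, level + 1));
      current = part;
    } else {
      current = candidate;
    }
  }
  if (current) chunks.push(...recursiveSplit(current, chunkSize, level + 1));
  return chunks;
}
```

Because paragraph breaks are tried first, two short paragraphs become two chunks rather than one arbitrary character-boundary split.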

Simple Chunker

Character-based chunking kept for backward compatibility. Chunks are sized in characters (default: 1000) with sentence boundary detection, and overlap is also character-based (default: 200).
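The core of a character-based chunker with overlap can be sketched as below (a minimal illustration without the sentence-boundary detection QuickRAG applies):

```typescript
// Illustrative sketch: fixed-size character windows that step forward by
// (chunkSize - overlap), so consecutive chunks share `overlap` characters.
function simpleChunk(text: string, chunkSize = 1000, overlap = 200): string[] {
  const chunks: string[] = [];
  let start = 0;
  while (start < text.length) {
    chunks.push(text.slice(start, start + chunkSize));
    if (start + chunkSize >= text.length) break;
    start += chunkSize - overlap;
  }
  return chunks;
}
```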

Performance Comparison

Benchmarked on test corpus (2 files: sherlock-holmes.txt, frankenstein.txt):

Metric           Recursive Token                           Simple
Chunks Created   622                                       2,539 (4.1x more)
Indexing Time    ~19 seconds                               ~37 seconds
Query Quality    ✅ Better semantic matches, more context   ⚠️ More fragments, some irrelevant results

Recommendation: Use recursive-token for production. The indexing time difference is negligible compared to improved retrieval quality.

Tuning Recommendations

Most Use Cases:

  • strategy: recursive-token
  • chunkSize: 400-512 (tokens) - Research-backed sweet spot for 85-90% recall
  • chunkOverlap: 50-100 (tokens, ~10-20%)

Technical Documentation:

  • strategy: recursive-token
  • chunkSize: 500-600 (tokens)
  • chunkOverlap: 75-100 (tokens)

Narrative Text:

  • strategy: recursive-token
  • chunkSize: 400-500 (tokens)
  • chunkOverlap: 50-75 (tokens)

Academic Papers:

  • strategy: recursive-token
  • chunkSize: 600-800 (tokens)
  • chunkOverlap: 100-150 (tokens)
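As a concrete example, the technical-documentation profile above could be written in ~/.config/quickrag/config.yaml as (specific values chosen from the middle of the recommended ranges):

```yaml
chunking:
  strategy: recursive-token
  chunkSize: 550
  chunkOverlap: 100
  minChunkSize: 50
```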

Usage

Indexing

# Basic indexing
quickrag index ./documents --output my-docs.rag

# Override chunking parameters
quickrag index ./documents --chunker recursive-token --chunk-size 500 --chunk-overlap 50 --min-chunk-size 50 --output my-docs.rag

# Use different provider
quickrag index ./documents --provider openai --model text-embedding-3-small --output my-docs.rag

# Clear existing index
quickrag index ./documents --clear --output my-docs.rag

Note: QuickRAG automatically detects and removes deleted files from the index. If a file was previously indexed but no longer exists in the source directory, it will be removed from the database during the next indexing run.
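The sync step described above amounts to diffing the files on disk against the files recorded in the index. A sketch of that logic, using hypothetical names and a modification-time freshness check (not QuickRAG's actual code):

```typescript
// Illustrative sketch: decide which index entries to remove (source file
// gone) and which files to (re)index (new or changed), skipping the rest.
type FileRecord = { path: string; mtimeMs: number };

function planSync(onDisk: FileRecord[], inIndex: FileRecord[]) {
  const diskByPath = new Map(onDisk.map((f): [string, FileRecord] => [f.path, f]));
  const indexByPath = new Map(inIndex.map((f): [string, FileRecord] => [f.path, f]));
  // Previously indexed files that no longer exist on disk.
  const toRemove = inIndex.filter((f) => !diskByPath.has(f.path)).map((f) => f.path);
  // Files that are new, or whose modification time changed since indexing.
  const toIndex = onDisk
    .filter((f) => {
      const prev = indexByPath.get(f.path);
      return !prev || prev.mtimeMs !== f.mtimeMs;
    })
    .map((f) => f.path);
  return { toRemove, toIndex };
}
```

Unchanged files fall into neither list, which is what makes repeated indexing runs idempotent.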

Querying

quickrag query my-docs.rag "What is the main topic?"

Interactive Mode

quickrag interactive my-docs.rag

Embedding Providers

VoyageAI

provider: voyageai
apiKey: your-voyage-api-key
model: voyage-3

OpenAI

provider: openai
apiKey: sk-your-openai-key
model: text-embedding-3-small

Ollama

provider: ollama
model: nomic-embed-text
baseUrl: http://localhost:11434

Supported File Types

  • .txt - Plain text files
  • .md - Markdown files
  • .markdown - Markdown files

Development

bun install
bun run dev index ./documents --provider ollama --output test.rag
bun run build
bun run typecheck

Requirements

  • Bun >= 1.0.0
  • TypeScript >= 5.0.0
  • For Ollama: A running Ollama instance with an embedding model installed (e.g., ollama pull nomic-embed-text)

License

This is free and unencumbered software released into the public domain.

For more information, see UNLICENSE or visit https://unlicense.org