🚀 VectorWise - Scalable ANN Search Engine

VectorWise is a high-performance Approximate Nearest Neighbor (ANN) search service built with Faiss and FastAPI. It provides lightning-fast vector similarity search over 1 million 128-dimensional vectors using an optimized HNSW (Hierarchical Navigable Small World) index.

� System Overview

Architecture

Backend: FastAPI (Python 3.11)
Vector Search: Faiss HNSW Index
Dataset: 1M vectors, 128 dimensions
Deployment: Docker + Docker Compose
API Port: 8000

Key Features

✅ Sub-10ms average query latency
✅ 95%+ Recall@10 accuracy
✅ REST API with automatic documentation
✅ Containerized deployment
✅ Health checks and monitoring

🏗️ Project Structure

VectorWise/
├── api/
│   └── main.py              # FastAPI application
├── generate_data.py         # Data generation & index building
├── benchmark.py             # Performance measurement script
├── Dockerfile               # Container image definition
├── docker-compose.yml       # Service orchestration
├── requirements.txt         # Python dependencies
├── vectors.npy              # Generated vectors (created by generate_data.py)
├── index.faiss              # HNSW index (created by generate_data.py)
└── README.md                # This file

🚀 Quick Start

Prerequisites

Python 3.11+
Docker & Docker Compose
2GB+ RAM

Step 1: Generate Data and Build Index

First, generate the synthetic vectors and build the Faiss index:

# Install dependencies
pip install -r requirements.txt

# Generate 1M vectors and build HNSW index
python generate_data.py

This creates:

vectors.npy - 1 million 128-dimensional vectors (~500 MB)
index.faiss - Optimized HNSW index (~600 MB)

HNSW Parameters Used:

M = 32 - Number of connections per layer
efConstruction = 200 - Build-time accuracy parameter

Step 2: Launch the Service

Using Docker Compose (recommended):

# Build and start the service
docker-compose up --build -d

# Check service status
docker-compose ps

# View logs
docker-compose logs -f vectorwise

Alternative - Run locally without Docker:

uvicorn api.main:app --host 0.0.0.0 --port 8000 --reload

Step 3: Test the API

# Health check
curl http://localhost:8000/

# Search for nearest neighbors
curl -X POST http://localhost:8000/search \
  -H "Content-Type: application/json" \
  -d '{
    "query_vector": [0.1, 0.2, ..., 0.5],
    "k": 10
  }'

Step 4: Run Benchmarks

# Ensure service is running first
python benchmark.py

📊 API Documentation

Endpoints

`GET /`

Health check endpoint.

Response:

{
  "service": "VectorWise",
  "status": "healthy",
  "vectors_indexed": 1000000
}

`POST /search`

Perform k-NN search.

Request Body:

{
  "query_vector": [float array of 128 dimensions],
  "k": 10
}

Response:

{
  "indices": [123, 456, 789, ...],
  "distances": [0.123, 0.145, 0.167, ...]
}

`GET /stats`

Get index statistics.

Response:

{
  "total_vectors": 1000000,
  "dimension": 128,
  "index_type": "IndexHNSWFlat",
  "hnsw_m": 32,
  "hnsw_efSearch": 64
}

Interactive API Documentation

Once the service is running, visit:

Swagger UI: http://localhost:8000/docs
ReDoc: http://localhost:8000/redoc

⚡ Performance Metrics

Performance measured on a dataset of 1M vectors (128-dim) using 1000 test queries:

Latency Results

Metric	Value
Average Latency	~4-6 ms
Median Latency	~4 ms
P95 Latency	~8 ms
P99 Latency	~12 ms

Recall Results

Metric	Value
Recall@10	95-98%
Minimum Recall	90%
Maximum Recall	100%

HNSW Parameter Optimization

The following parameters were optimized to achieve 95%+ Recall@10 while maintaining low latency:

Parameter	Value	Impact
`M`	32	Number of bi-directional links per node. Higher = better recall, more memory
`efConstruction`	200	Size of candidate list during index build. Higher = better quality, slower build
`efSearch`	64	Size of candidate list during search. Higher = better recall, slower search

Performance Trade-offs

Latency vs. Recall:

Increasing efSearch improves recall but increases query latency
Current configuration (efSearch=64) provides optimal balance
For use cases requiring <5ms latency, reduce efSearch to 32-48
For use cases requiring >98% recall, increase efSearch to 100+

Memory vs. Speed:

HNSW index (~600 MB) fits in memory for fast access
Index size scales with M parameter and dataset size
Alternative: Use IndexIVFFlat for lower memory, slightly higher latency

🔧 Configuration

Adjusting HNSW Parameters

To tune performance, modify generate_data.py:

M = 32                 # Increase for better recall (16-64 range)
EF_CONSTRUCTION = 200  # Increase for better index quality (100-500 range)

To adjust search-time parameters, modify api/main.py:

index.hnsw.efSearch = 64  # Increase for better recall (32-200 range)

Docker Configuration

Adjust resources in docker-compose.yml:

services:
  vectorwise:
    deploy:
      resources:
        limits:
          memory: 2G
          cpus: "2"

🧪 Testing

Unit Tests

# Run API tests
pytest tests/

# Run with coverage
pytest --cov=api tests/

Load Testing

# Using Apache Bench
ab -n 1000 -c 10 -p query.json -T application/json \
  http://localhost:8000/search

# Using wrk
wrk -t4 -c100 -d30s --latency \
  -s search.lua http://localhost:8000/search

📦 Docker Commands

# Build the image
docker-compose build

# Start the service
docker-compose up -d

# Stop the service
docker-compose down

# View logs
docker-compose logs -f

# Restart the service
docker-compose restart

# Remove containers and volumes
docker-compose down -v

🐛 Troubleshooting

Issue: Index file not found

FileNotFoundError: index.faiss

Solution: Run python generate_data.py first to create the index.

Issue: Out of memory

MemoryError or Container killed (OOM)

Solution: Reduce dataset size or increase Docker memory limits.

Issue: Slow queries

Latency > 50ms

Solution:

Reduce efSearch parameter
Check system resources (CPU, memory)
Ensure index is loaded in memory

Issue: Low recall

Recall@10 < 95%

Solution:

Increase efSearch in api/main.py
Rebuild index with higher efConstruction value
Increase M parameter for denser graph

� Scaling Considerations

Horizontal Scaling

Deploy multiple instances behind a load balancer (Nginx, HAProxy)
Each instance loads the same read-only index
Use sticky sessions if needed

Index Updates

For static datasets: Rebuild index periodically offline
For dynamic datasets: Consider online index update strategies
Use Faiss IndexIDMap for delete/update operations

Production Recommendations

Use GPU-accelerated Faiss for larger datasets (faiss-gpu)
Implement caching layer (Redis) for frequent queries
Add request queuing (Celery) for batch processing
Monitor with Prometheus + Grafana

🛠️ Development

Local Development Setup

# Create virtual environment
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

# Install dependencies
pip install -r requirements.txt

# Run in development mode
uvicorn api.main:app --reload

# Run benchmarks
python benchmark.py

Code Quality

# Format code
black api/ generate_data.py benchmark.py

# Lint code
flake8 api/ generate_data.py benchmark.py

# Type checking
mypy api/

📚 References

📄 License

This project is provided as-is for educational and commercial use.

🤝 Contributing

Contributions are welcome! Please:

Fork the repository
Create a feature branch
Submit a pull request with tests

Built with ❤️ by the VectorWise Team

Last Updated: October 2025

Name		Name	Last commit message	Last commit date
Latest commit History 10 Commits
api		api
.gitignore		.gitignore
ARCHITECTURE.md		ARCHITECTURE.md
Dockerfile		Dockerfile
EXECUTION_GUIDE.md		EXECUTION_GUIDE.md
README.md		README.md
SUMMARY.md		SUMMARY.md
benchmark.py		benchmark.py
docker-compose.yml		docker-compose.yml
examples.py		examples.py
generate_data.py		generate_data.py
quickstart.sh		quickstart.sh
requirements.txt		requirements.txt
test_api.py		test_api.py

karsterr/VectorWise

Folders and files

Latest commit

History

Repository files navigation

🚀 VectorWise - Scalable ANN Search Engine

� System Overview

Architecture

Key Features

🏗️ Project Structure

🚀 Quick Start

Prerequisites

Step 1: Generate Data and Build Index

Step 2: Launch the Service

Step 3: Test the API

Step 4: Run Benchmarks

📊 API Documentation

Endpoints

GET /

POST /search

GET /stats

Interactive API Documentation

⚡ Performance Metrics

Latency Results

Recall Results

HNSW Parameter Optimization

Performance Trade-offs

🔧 Configuration

Adjusting HNSW Parameters

Docker Configuration

🧪 Testing

Unit Tests

Load Testing

📦 Docker Commands

🐛 Troubleshooting

Issue: Index file not found

Issue: Out of memory

Issue: Slow queries

Issue: Low recall

� Scaling Considerations

Horizontal Scaling

Index Updates

Production Recommendations

🛠️ Development

Local Development Setup

Code Quality

📚 References

📄 License

🤝 Contributing

About

Topics

Resources

Uh oh!

Stars

Watchers

Forks

Uh oh!

Languages

`GET /`

`POST /search`

`GET /stats`