A neural network-based chat system with memory that stores message embeddings in SQLite and uses HNSW for efficient vector similarity search.
- Memory-augmented chat with semantic search
- Support for multiple embedding and generation models through Ollama
- Fast similarity search using HNSW index
- REST API with OpenAI-compatible endpoints
- Rate limiting and authentication
- Comprehensive benchmarking tools
- Improved code quality, security, and maintainability
The application is organized in a modular structure:
python/
├── ghostwire/ # Main application modules
│ ├── config/ # Configuration and settings
│ ├── database/ # Database connection and repositories
│ ├── models/ # Pydantic models
│ ├── services/ # Business logic
│ ├── vector/ # Vector operations and HNSW management
│ ├── api/ # API endpoints and middleware
│ ├── clients/ # External API clients
│ ├── utils/ # Utilities and helpers
│ └── main.py # Application entry point
├── client/ # Client applications
├── benchmarks/ # Benchmarking tools
└── tests/ # Unit and integration tests
The authoritative Python implementation lives in python/ghostwire. During the ongoing refactor and any future migrations to another language, treat python/ghostwire as the canonical source of truth. Contributors should make changes in this directory, update tests under python/tests/, and use the PYTHONPATH=python invocations shown below when running the application or tests locally.
This keeps the migration path simple: once the feature set is stable in Python, it can be ported to another language with a clear, tested reference implementation.
- Clone the repository
- Install dependencies (requires Python 3.12+). Using pip directly (requires build tools for hnswlib):
  pip install -e .
  (Recommended) Or use the uv workspace helper to run commands in a consistent environment:
  uv run pip install -e .
- Create a .env file in the project root with your settings:
HOST=0.0.0.0
PORT=8000
DEBUG=false
DB_PATH=memory.db
EMBED_DIM=768
LOCAL_OLLAMA_URL=http://localhost:11434
REMOTE_OLLAMA_URL=http://100.103.237.60:11434
DEFAULT_OLLAMA_MODEL=gemma3:1b
SUMMARY_MODEL=gemma3:1b
DISABLE_SUMMARIZATION=false
SECRET_KEY=your-super-secret-key-here-replace-with-secure-key
ALLOWED_ORIGINS=["http://localhost:3000", "http://localhost:8000"]
EMBED_MODELS=["embeddinggemma","granite-embedding","nomic-embed-text","mxbai-embed-large","snowflake-arctic-embed","all-minilm"]
For a complete configuration template, see .env.sample.
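As a rough sketch of how these variables could be loaded (assuming pydantic v2 with the pydantic-settings package; the class and field names here are illustrative, not GhostWire's actual ghostwire/config module), note that list-valued variables such as ALLOWED_ORIGINS and EMBED_MODELS are written as JSON arrays and decoded into Python lists:

```python
# Hypothetical settings loader; GhostWire's real configuration lives in ghostwire/config/.
from pydantic_settings import BaseSettings, SettingsConfigDict

class AppSettings(BaseSettings):
    model_config = SettingsConfigDict(env_file=".env", extra="ignore")

    host: str = "0.0.0.0"
    port: int = 8000
    debug: bool = False
    db_path: str = "memory.db"
    embed_dim: int = 768
    # JSON arrays in .env (e.g. ALLOWED_ORIGINS=["http://localhost:3000"])
    # are decoded into Python lists automatically by pydantic-settings.
    allowed_origins: list[str] = []
    embed_models: list[str] = []

settings = AppSettings()
print(settings.embed_models)
```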
cd /path/to/ghostwire-refractory # Project root
pip install -e .
ghostwire # This uses the entry point defined in pyproject.toml
Or with uv:
uv run pip install -e .
uv run ghostwire
From the project root:
cd /path/to/ghostwire-refractory # Project root
PYTHONPATH=python uv run python -m ghostwire.main
The API will be available at http://localhost:8000
See APIDOC.md for detailed API documentation.
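Because the endpoints are described as OpenAI-compatible, a client can usually send the standard chat-completions payload shape. The route and field names below follow the OpenAI convention and are assumptions here, not confirmed GhostWire routes; check APIDOC.md for the exact paths.

```python
# Assumed OpenAI-style request shape; verify the real routes in APIDOC.md.
import httpx

payload = {
    "model": "gemma3:1b",  # DEFAULT_OLLAMA_MODEL from the .env example above
    "messages": [{"role": "user", "content": "Hello, GhostWire!"}],
}
response = httpx.post(
    "http://localhost:8000/v1/chat/completions",  # hypothetical OpenAI-compatible route
    json=payload,
    timeout=60.0,
)
response.raise_for_status()
print(response.json())
```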
To use the operator console client:
cd /path/to/ghostwire-refractory # Project root
PYTHONPATH=python uv run python -m python.client.operator_console
Run the benchmark suite:
cd /path/to/ghostwire-refractory # Project root
PYTHONPATH=python uv run python -m python.benchmarks.embedding_benchmarks
PYTHONPATH=python uv run python -m python.benchmarks.rag_benchmarks
PYTHONPATH=python uv run python -m python.benchmarks.summarization_benchmarks
Run the comprehensive model comparison benchmark with GHOSTWIRE scoring:
PYTHONPATH=python uv run python -m python.benchmarks.model_comparison_benchmark
Or run all benchmarks from the project root:
export PYTHONPATH=python
uv run python -m python.benchmarks.embedding_benchmarks
uv run python -m python.benchmarks.rag_benchmarks
uv run python -m python.benchmarks.summarization_benchmarks
uv run python -m python.benchmarks.model_comparison_benchmark
The benchmarking suite now includes:
- Embedding Benchmarks: Performance and stability of text embeddings with GHOSTWIRE scoring
- RAG Benchmarks: Retrieval-Augmented Generation quality and performance with GHOSTWIRE scoring
- Summarization Benchmarks: Text summarization quality and efficiency with GHOSTWIRE scoring
- Model Comparison Benchmark: Comprehensive comparison of multiple models using overall GHOSTWIRE scores
Note: Benchmarks require a running GhostWire Refractory server to connect to.
The GhostWire Refractory exposes Prometheus-compatible metrics for monitoring API performance and health.
Access the metrics endpoint:
# Query metrics directly
curl http://localhost:8000/api/v1/metrics
# Query specific metrics with PromQL (if using Prometheus server)
# Example PromQL queries:
# - Rate of API calls: rate(api_server_calls_total[5m])
# - 95th percentile latency: histogram_quantile(0.95, sum(rate(api_server_latency_seconds_bucket[5m])) by (le))
The metrics endpoint provides:
- API Latency Histograms: Per-route latency measurements
- API Call Counters: Total number of API calls per route
- Process Metrics: CPU and memory usage statistics
- api_server_latency_seconds: Histogram of API route latencies
- api_server_calls_total: Counter of total API calls per route
- process_cpu_usage_seconds: Simulated process CPU usage counter
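For a quick programmatic check without a Prometheus server, the exposition text can be fetched and filtered directly. This is a minimal sketch using httpx and the metric names listed above; it simply prints matching series lines.

```python
# Minimal metrics scrape: fetch the Prometheus exposition text and print selected series.
import httpx

METRICS_URL = "http://localhost:8000/api/v1/metrics"
WATCHED = ("api_server_latency_seconds", "api_server_calls_total", "process_cpu_usage_seconds")

text = httpx.get(METRICS_URL, timeout=10.0).text
for line in text.splitlines():
    if line.startswith(WATCHED):  # skip HELP/TYPE comments and unrelated series
        print(line)
```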
The GhostWire Refractory includes an advanced multi-LLM orchestration system that coordinates multiple language models to process complex user requests using a Master/Worker/Secondary Control pattern.
Run the orchestrator system:
# Process a request through the multi-LLM orchestrator
uv run python -m python.ghostwire.cli orchestrate --request "Analyze this code and suggest improvements"
# Or use the API endpoint directly
curl -X POST http://localhost:8000/api/v1/orchestrator/orchestrate \
-H "Content-Type: application/json" \
-d '{"user_request": "Perform a comprehensive analysis of the system"}'
- Task Decomposition: Breaks complex requests into subtasks for distribution
- Multi-LLM Coordination: Distributes tasks across multiple LLM endpoints
- Permission Management: Role-based access control for orchestrated tasks
- Safe Code Patching: Apply changes to code with validation and backup
- Benchmark Integration: Run benchmarks across multiple models simultaneously
- GHOSTWIRE Scoring: Calculate standardized scores across orchestrated tasks
- Formal Patch Proposals: Follow OpenSpec schema for structured changes (Ψp)
- Secondary Validation: Multi-layer verification with constraint checking (φS)
- Task Dependency Graphs: Manage complex task relationships (τg)
- Advanced Merging: Three-way merge with conflict detection
- Telemetry & Health: Real-time metrics and system health checks
- Spec-Driven Development: Formal specification management and validation
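The same /api/v1/orchestrator/orchestrate endpoint shown in the curl example above can be called from Python. This sketch assumes only the request and response shapes visible in that example; the response schema depends on how the orchestrator is configured.

```python
# Sketch of calling the orchestrator endpoint from Python (same payload as the curl example).
import httpx

request = {"user_request": "Perform a comprehensive analysis of the system"}
response = httpx.post(
    "http://localhost:8000/api/v1/orchestrator/orchestrate",
    json=request,
    timeout=120.0,  # orchestration fans out to several models, so allow a generous timeout
)
response.raise_for_status()
print(response.json())
```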
Run the test suite:
cd /path/to/ghostwire-refractory # Project root
PYTHONPATH=python uv run python -m pytest python/tests/
Or with pytest discovery:
cd /path/to/ghostwire-refractory # Project root
PYTHONPATH=python uv run pytest python/tests/
The test suite is organized into three main categories:
- Unit Tests - Fast, isolated tests for individual functions and classes:
  PYTHONPATH=python uv run pytest python/tests/unit/
- Integration Tests - Tests for component interactions and external service integrations:
  PYTHONPATH=python uv run pytest python/tests/integration/
- Benchmark Tests - Performance measurements using the GHOSTWIRE scoring system:
  PYTHONPATH=python uv run pytest python/tests/benchmark/
Unit Tests focus on testing individual functions and classes in isolation. They should:
- Execute quickly (< 1 second each)
- Not require external services or databases
- Cover both normal and edge cases
- Use mocking where appropriate
Integration Tests validate interactions between components and external services. They should:
- Test realistic component interactions
- Use temporary databases or test instances
- Complete within reasonable time (< 10 seconds each)
- Validate API endpoints and service integrations
Benchmark Tests measure performance metrics using the GHOSTWIRE scoring system. They should:
- Produce quantitative performance measurements
- Calculate GHOSTWIRE scores for latency, stability, memory usage, and quality
- Support comparative analysis between different configurations
- Generate structured performance reports
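To make the categories concrete, here is an illustrative pair of tests (not taken from GhostWire's suite): a plain unit test and one tagged with the benchmark marker used in the commands that follow. Custom markers should be registered in the pytest configuration to avoid warnings.

```python
# Illustrative tests only; names and thresholds are assumptions, not GhostWire's suite.
import time
import numpy as np
import pytest

def test_cosine_similarity_of_identical_vectors():
    """Unit test: fast, isolated, no external services."""
    v = np.array([1.0, 2.0, 3.0])
    cosine = float(v @ v / (np.linalg.norm(v) * np.linalg.norm(v)))
    assert cosine == pytest.approx(1.0)

@pytest.mark.benchmark
def test_latency_budget():
    """Benchmark-style test: records a quantitative measurement against a budget."""
    start = time.perf_counter()
    _ = [i * i for i in range(100_000)]  # stand-in for an embedding or generation call
    elapsed = time.perf_counter() - start
    assert elapsed < 1.0
```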
Run tests with specific markers:
# Run only embedding tests
PYTHONPATH=python uv run pytest -m embedding python/tests/
# Run only benchmark tests
PYTHONPATH=python uv run pytest -m benchmark python/tests/
# Run tests in parallel for faster execution
PYTHONPATH=python uv run pytest -n auto python/tests/
Check test coverage:
PYTHONPATH=python uv run pytest --cov=python/ghostwire python/tests/
For contributors who want to quickly populate the database with sample data for testing and exploration:
# Seed the database with sample data
python scripts/seed_sample_data.py
# Seed with verbose output
python scripts/seed_sample_data.py --verbose
# Force re-seeding even if data already exists
python scripts/seed_sample_data.py --force
This script will:
- Create tables if they don't exist
- Insert sample sessions, messages, and synthetic embeddings
- Generate realistic sample data that mimics actual usage
- Populate the database with 15 sample memory entries across 5 sessions
The seeder is idempotent by default and will not duplicate data unless the --force flag is used.
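The idempotency pattern the script relies on can be pictured roughly as follows. This is a sketch only, not the script's actual code: the schema, counts, and messages are illustrative.

```python
# Sketch of an idempotent seeding check; schema and counts are illustrative assumptions.
import sqlite3

def seed(db_path: str = "memory.db", force: bool = False) -> None:
    db = sqlite3.connect(db_path)
    db.execute("CREATE TABLE IF NOT EXISTS sessions (id INTEGER PRIMARY KEY, name TEXT)")
    already_seeded = db.execute("SELECT COUNT(*) FROM sessions").fetchone()[0] > 0
    if already_seeded and not force:
        print("Sample data already present; skipping (use --force to re-seed).")
        return
    db.executemany(
        "INSERT INTO sessions (name) VALUES (?)",
        [(f"sample-session-{i}",) for i in range(1, 6)],  # 5 sample sessions
    )
    db.commit()
    print("Seeded 5 sample sessions.")
```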
The application requires:
- Python 3.12+
- FastAPI
- SQLite with APSW
- hnswlib for vector indexing
- httpx for HTTP client operations
- pydantic for data validation
- numpy for numerical operations
See pyproject.toml for the complete list of dependencies.
- Rate limiting to prevent API abuse
- Input validation to prevent injection attacks
- Authentication for protected endpoints
- CORS configuration to restrict which origins may call the API from a browser (see the sketch after this list)
- Secure password hashing
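For example, the CORS restriction can be expressed with FastAPI's standard middleware, driven by the ALLOWED_ORIGINS setting shown earlier. This is a minimal sketch, not the project's actual middleware wiring, which lives under ghostwire/api/.

```python
# Minimal sketch of CORS setup with FastAPI; not GhostWire's actual wiring.
import json
import os

from fastapi import FastAPI
from fastapi.middleware.cors import CORSMiddleware

app = FastAPI()
allowed_origins = json.loads(os.getenv("ALLOWED_ORIGINS", '["http://localhost:3000"]'))

app.add_middleware(
    CORSMiddleware,
    allow_origins=allowed_origins,  # only these origins may call the API from a browser
    allow_credentials=True,
    allow_methods=["*"],
    allow_headers=["*"],
)
```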
This is a refactored version of the original GhostWire application with:
- Proper separation of concerns
- Modular architecture
- Comprehensive error handling
- Security measures
- Test coverage
- Documentation
- Code quality improvements (validated with ruff)
For contribution guidance (theme usage, opt-outs, and quick-run instructions), see CONTRIBUTING.md at the repository root.
GhostWire Refractory is designed as a laptop-first neural network chat system with persistent memory capabilities. The architecture uses SQLite as a local cache and storage layer, with remote LLMs for generation when needed. The intended flow moves from user input → local SQLite storage → remote LLM processing, optimizing for local-first usage while leveraging cloud capabilities when beneficial. For the full architectural vision and project goals, see GHOSTWIRE_GPT/GHOSTWIRE_GOALS.md.
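That input → SQLite → remote LLM flow can be sketched roughly as below. The /api/generate call is the standard Ollama REST endpoint; the storage step is a stand-in for GhostWire's repository layer, and the table and function names are illustrative assumptions.

```python
# Rough sketch of the laptop-first flow: store locally, then ask a (possibly remote) model.
import sqlite3
import httpx

OLLAMA_URL = "http://localhost:11434"  # LOCAL_OLLAMA_URL from the configuration above

def handle_user_input(prompt: str, db_path: str = "memory.db") -> str:
    # 1. Persist the user input locally first.
    db = sqlite3.connect(db_path)
    db.execute("CREATE TABLE IF NOT EXISTS messages (id INTEGER PRIMARY KEY, role TEXT, content TEXT)")
    db.execute("INSERT INTO messages (role, content) VALUES (?, ?)", ("user", prompt))
    db.commit()

    # 2. Ask the Ollama instance for a completion (standard /api/generate endpoint).
    reply = httpx.post(
        f"{OLLAMA_URL}/api/generate",
        json={"model": "gemma3:1b", "prompt": prompt, "stream": False},
        timeout=120.0,
    ).json()["response"]

    # 3. Store the answer alongside the question so it becomes part of memory.
    db.execute("INSERT INTO messages (role, content) VALUES (?, ?)", ("assistant", reply))
    db.commit()
    return reply
```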
We maintain lightweight consistency checks with pre-commit to keep formatting and linting fast for contributors. To install the hooks locally:
pip install pre-commit
pre-commit install
pre-commit run --all-files
Optional (node/husky): if you prefer Git hooks managed by Husky and lint-staged for JS/TS workflows, add Husky to your project and configure lint-staged in package.json. This repository currently uses pre-commit for Python-centric checks.
- ✅ Resolved all critical code quality issues identified by ruff check
- ✅ Fixed security vulnerabilities related to exception handling
- ✅ Improved import organization and code structure
- ✅ Enhanced documentation and comments
- ✅ Maintained full functionality while improving maintainability
- ✅ Verified all core unit tests pass (6/7 tests passing, 1 test mock issue)
- ✅ Confirmed all benchmarking tools work correctly
- ✅ Proper environment variable configuration with JSON array support
- Proper exception chaining using the from clause to preserve error context
- Corrected environment variable parsing for list fields (JSON format)
- Improved input validation and error handling
- Cleaner code organization with consistent import patterns
- Better separation of concerns in modules
- Improved error messages and logging
- Enhanced type hints for better code documentation
For contributors who prefer a minimal, explicit local workflow without the uv workspace helper:
# Run the local development helper script
sh scripts/run_local.sh
This script will:
- Create and activate a virtual environment
- Install the package in editable mode
- Copy .env.example to .env if no .env is present
- Run the application module
See scripts/run_local.sh and .env.example for details.
For developers who prefer explicit make commands, the project includes a Makefile with common development targets:
# Set up the development environment
make setup
# Run the GhostWire Refractory application
make run
# Run tests
make test
# Run code linting
make lint
# Seed the database with sample data
make seed
# See all available targets
make help
The project includes a document ingestion script for importing text documents into the vector database:
# Ingest documents from a directory
python scripts/import_documents.py docs/
# Ingest with summarization enabled
python scripts/import_documents.py docs/ --summarize
# Perform a dry run to preview what would be ingested
python scripts/import_documents.py docs/ --dry-run
See scripts/README.md for detailed usage instructions.
Populate the database with sample sessions, messages, and embeddings for local development:
# Seed the default database with sample data
python scripts/seed_sample_data.py
# Seed with a custom database path
python scripts/seed_sample_data.py --db-path /path/to/my.db
# Force re-seeding even if data already exists
python scripts/seed_sample_data.py --force
# See all options
python scripts/seed_sample_data.py --help
This creates 5 sample sessions with multiple conversations each, helping you explore the application without manually creating data. The seeder is idempotent by default to prevent duplicate entries.
This repository carries the GhostWire aesthetic: neon-cyberpunk, slightly haunted, and concise. When contributing, see CONTRIBUTING.md for guidance on where to use the project's thematic flavor (neon/cyberpunk, Neon Oracle voice) vs. neutral, public-facing documentation.
For automated agents: see AGENTS.md for theme usage instructions and opt-outs.
⚡️🌈⚡️ GhostWire Refractory - Neural network-based chat with memory, forged in neon and whispered through the data fog. ⚡️🌈⚡️