A high-performance, stateless, read-only graph query engine for ClickHouse with Neo4j ecosystem compatibility.
Note: ClickGraph is a fork of Brahmand with additional features including Neo4j Bolt protocol support and view-based graph analysis. This is a read-only analytical query engine - write operations are not supported.
Complete Multi-Schema Support
- Full schema isolation: Different schemas can map the same labels to different tables
- Per-request schema selection: USE clause, schema_name parameter, or default
- Clean architecture: Single source of truth for schema management (removed redundant GLOBAL_GRAPH_SCHEMA)
- Thread-safe: Schema flows through the entire query execution path
- End-to-end tested: All 4 multi-schema tests passing
Code Quality Improvements
- Removed technical debt: Eliminated duplicate schema storage system
- Cleaner codebase: Simplified render layer helper functions
- All tests passing: 325 unit tests + 32 integration tests (100% non-benchmark)
ClickGraph has been tested on a dataset of 5 million users and 50 million relationships:
| Benchmark | Dataset Size | Success Rate | Status |
|---|---|---|---|
| Large | 5M users, 50M follows | 9/10 (90%) | Stress Tested |
| Medium | 10K users, 50K follows | 10/10 (100%) | Well Validated |
| Small | 1K users, 5K follows | 10/10 (100%) | Fully Tested |
What We Learned:
- Direct relationships: Handling 50M edges successfully
- Multi-hop traversals: Working on 5M node graphs
- Variable-length paths: Scaling to large datasets
- Aggregations: Pattern matching across millions of rows
- Performance: ~2 seconds for most queries, even at large scale
- Shortest paths (limitation): Memory limits on the largest dataset (ClickHouse config dependent)
Recent Bug Fixes:
- ChainedJoin CTE wrapper for exact hop variable-length paths (`*2`, `*3`)
- Shortest path filter rewriting for WHERE clauses on end nodes
- Aggregation table name schema lookup for GROUP BY queries
Tooling:
- Comprehensive benchmarking suite with 3 scale levels
- ClickHouse-native data generation for efficient loading
- Performance metrics collection and analysis

Documentation:
- Detailed Benchmark Results - Complete analysis across all scales
- CHANGELOG.md - Technical details and bug fixes
- STATUS.md - Current project status
- Read-Only Graph Analytics: Translates Cypher graph queries into optimized ClickHouse SQL for analytical workloads
- ClickHouse-native: Extends ClickHouse with native graph modeling, merging OLAP speed with graph-analysis power
- Stateless Architecture: Offloads all storage and query execution to ClickHouse; no extra datastore required
- Cypher Query Language: Industry-standard Cypher read syntax for intuitive, expressive property-graph querying
- Variable-Length Paths: Recursive traversals with `*1..3` syntax using ClickHouse `WITH RECURSIVE` CTEs
- Path Variables & Functions: Capture and analyze path data with `length(p)`, `nodes(p)`, `relationships(p)` functions
- Analytical-scale Performance: Optimized for very large datasets and complex multi-hop traversals
- Query Performance Metrics: Phase-by-phase timing with HTTP headers and structured logging for monitoring and optimization
- Bolt Protocol v4.4: Full Neo4j driver compatibility for seamless integration
- Multi-Schema Support: Fully working. Complete schema isolation with per-request schema selection:
  - USE clause: Cypher `USE database_name` syntax (highest priority)
  - Session/request parameter: Bolt session database or HTTP `schema_name` parameter
  - Default schema: Fallback to the "default" schema
  - Schema isolation: Different schemas map the same labels to different ClickHouse tables
- Dual Server Architecture: HTTP REST API and Bolt protocol running simultaneously
- Authentication Support: Multiple authentication schemes including basic auth
- Tool Compatibility: Works with existing Neo4j drivers, browsers, and applications
- Zero Migration: Transform existing relational data into graph format through YAML configuration
- Native Performance: Leverages ClickHouse's columnar storage and query optimization
- Robust Implementation: Comprehensive validation, error handling, and optimization passes
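The schema selection order listed above (USE clause first, then the session or request parameter, then the default) can be sketched in a few lines of Python. This is only an illustration of the documented priority, not ClickGraph's implementation: the `resolve_schema` helper and the regular expression for the USE clause are assumptions made for the example.

```python
import re
from typing import Optional

# Matches a leading "USE <name>" clause. The exact grammar ClickGraph accepts
# is an assumption made for this sketch.
_USE_CLAUSE = re.compile(r"^\s*USE\s+([A-Za-z_][A-Za-z0-9_]*)", re.IGNORECASE)


def resolve_schema(query: str, request_schema: Optional[str] = None) -> str:
    """Pick the schema for one request using the documented priority order."""
    match = _USE_CLAUSE.match(query)
    if match:                 # 1. USE clause in the Cypher text wins
        return match.group(1)
    if request_schema:        # 2. Bolt session database or HTTP schema_name
        return request_schema
    return "default"          # 3. fall back to the "default" schema


print(resolve_schema("USE ecommerce MATCH (p:Product) RETURN p.name", "social_network"))
print(resolve_schema("MATCH (u:User) RETURN u.name", "social_network"))
print(resolve_schema("MATCH (u:User) RETURN u.name"))
```

The three calls print `ecommerce`, `social_network`, and `default`, one for each level of the priority order.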
ClickGraph runs as a lightweight graph wrapper alongside ClickHouse with dual protocol support.

HTTP API flow:

1. Client sends an HTTP POST request with a Cypher query to ClickGraph
2. ClickGraph parses and plans the query, then translates it to ClickHouse SQL
3. ClickHouse executes the SQL and returns results
4. ClickGraph sends JSON results back to the client

Bolt protocol flow:

1. A Neo4j driver or tool connects to ClickGraph via the Bolt protocol
2. ClickGraph handles the Bolt handshake, authentication, and message protocol
3. Cypher queries are processed through the same query engine as HTTP
4. Results are streamed back in Bolt protocol format
Both protocols share the same underlying query engine and ClickHouse backend.
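As a rough sketch of the HTTP side of this flow, the Python snippet below builds the POST request a client would send, using only the standard library. The `/query` path and port 8080 match the curl examples in this README; placing `schema_name` in the JSON body is an assumption, and `build_query_request` and `run_query` are hypothetical helper names.

```python
import json
import urllib.request
from typing import Optional


def build_query_request(cypher: str,
                        schema_name: Optional[str] = None,
                        base_url: str = "http://localhost:8080") -> urllib.request.Request:
    """Build (but do not send) a POST /query request carrying a Cypher query."""
    payload = {"query": cypher}
    if schema_name is not None:
        # Assumption: schema_name travels in the JSON body next to "query".
        payload["schema_name"] = schema_name
    return urllib.request.Request(
        url=f"{base_url}/query",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_query(request: urllib.request.Request, timeout: float = 10.0):
    """Send the request to a running server and decode the JSON response body."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


req = build_query_request("MATCH (u:User) RETURN u.full_name LIMIT 5")
print(req.get_method(), req.full_url)
```

Building the request separately from sending it keeps the example runnable without a server; call `run_query(req)` once ClickGraph is listening.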
New to ClickGraph? See the Getting Started Guide for a complete walkthrough.
Note for Windows users: The HTTP server has a known issue on Windows. Use Docker or WSL for development. See KNOWN_ISSUES.md for details.
1. Clone and start services:

   ```bash
   git clone https://github.com/genezhang/clickgraph
   cd clickgraph
   docker-compose up -d
   ```

   This starts both ClickHouse and ClickGraph with test data pre-loaded.

2. Test the setup:

   ```bash
   curl -X POST http://localhost:8080/query \
     -H "Content-Type: application/json" \
     -d '{"query": "MATCH (u:User) RETURN u.full_name LIMIT 5"}'
   ```
To run ClickGraph from source against a ClickHouse container instead:

1. Start ClickHouse:

   ```bash
   docker-compose up -d clickhouse
   ```

2. Configure and run:

   ```bash
   export CLICKHOUSE_URL="http://localhost:8123"
   export CLICKHOUSE_USER="test_user"
   export CLICKHOUSE_PASSWORD="test_pass"
   export CLICKHOUSE_DATABASE="brahmand"
   cargo run --bin clickgraph
   ```

3. Test with HTTP API:

   ```bash
   curl -X POST http://localhost:8080/query \
     -H "Content-Type: application/json" \
     -d '{"query": "RETURN 1 as test"}'
   ```

4. Test with Neo4j driver:

   ```python
   from neo4j import GraphDatabase

   driver = GraphDatabase.driver("bolt://localhost:7687")
   with driver.session() as session:
       result = session.run("RETURN 1 as test")
   ```

5. Use the USE clause for multi-database queries:

   ```cypher
   -- Query specific database using Neo4j-compatible USE clause
   USE social_network
   MATCH (u:User)-[:FOLLOWS]->(friend)
   RETURN u.name, collect(friend.name) AS friends

   -- USE overrides session/request parameters
   USE ecommerce
   MATCH (p:Product) WHERE p.price > 100
   RETURN p.name
   ```
Transform existing relational data into graph format through YAML configuration:
Example: Map your users and user_follows tables to a social network graph:
```yaml
views:
  - name: social_network
    nodes:
      user:                     # Node label in Cypher queries
        source_table: users
        id_column: user_id
        property_mappings:
          name: full_name
    relationships:
      follows:                  # Relationship type in Cypher queries
        source_table: user_follows
        from_node: user         # Source node label
        to_node: user           # Target node label
        from_id: follower_id
        to_id: followed_id
```

Then query with standard Cypher:
```cypher
MATCH (u:user)-[:follows]->(friend:user)
WHERE u.name = 'Alice'
RETURN friend.name
```

OPTIONAL MATCH for handling optional patterns:
```cypher
-- Find all users and their friends (if any)
MATCH (u:user)
OPTIONAL MATCH (u)-[:follows]->(friend:user)
RETURN u.name, friend.name

-- Mixed required and optional patterns
MATCH (u:user)-[:authored]->(p:post)
OPTIONAL MATCH (p)-[:liked_by]->(liker:user)
RETURN u.name, p.title, COUNT(liker) as likes
```

Generates efficient LEFT JOIN SQL with NULL handling for unmatched patterns.
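To make the LEFT JOIN behavior more concrete, the sketch below builds the general SQL shape for a one-hop pattern from a mapping like the YAML view definition above. It is an illustration only: the `VIEW` dictionary, the `one_hop_sql` helper, and the table aliases are hypothetical, and the SQL that ClickGraph actually emits may look quite different.

```python
# Hypothetical in-memory copy of the YAML view mapping shown earlier.
VIEW = {
    "nodes": {
        "user": {"source_table": "users", "id_column": "user_id"},
    },
    "relationships": {
        "follows": {
            "source_table": "user_follows",
            "from_node": "user", "to_node": "user",
            "from_id": "follower_id", "to_id": "followed_id",
        },
    },
}


def one_hop_sql(view: dict, rel_type: str, optional: bool = False) -> str:
    """Sketch the SQL shape for (a)-[:rel_type]->(b); uses LEFT JOIN when optional."""
    rel = view["relationships"][rel_type]
    src = view["nodes"][rel["from_node"]]
    dst = view["nodes"][rel["to_node"]]
    join = "LEFT JOIN" if optional else "INNER JOIN"
    return "\n".join([
        f"SELECT a.{src['id_column']} AS a_id, b.{dst['id_column']} AS b_id",
        f"FROM {src['source_table']} AS a",
        f"{join} {rel['source_table']} AS r ON r.{rel['from_id']} = a.{src['id_column']}",
        f"{join} {dst['source_table']} AS b ON b.{dst['id_column']} = r.{rel['to_id']}",
    ])


print(one_hop_sql(VIEW, "follows", optional=True))
```

With `optional=True` every row of the left-hand table is kept even when no relationship matches, which is the behavior OPTIONAL MATCH asks for. With the default `optional=False` the same pattern becomes a pair of inner joins.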
Quick Start - 5 Minutes to Graph Analytics
Perfect for first-time users! Simple social network demo with:
- 3 users, friendships - minimal setup with Memory tables
- Basic Cypher queries - find friends, mutual connections
- HTTP & Neo4j drivers - both integration methods
- 5-minute setup - zero to working graph analytics
E-commerce Analytics - Comprehensive Demo
Complete end-to-end demonstration with:
- Complete data setup with realistic e-commerce schema (customers, products, orders, reviews)
- Advanced graph queries for customer segmentation, product recommendations, and market basket analysis
- Real-world workflows with both HTTP REST API and Neo4j driver examples
- Performance optimization techniques and expected benchmarks
- Business insights from customer journeys, seasonal patterns, and cross-selling opportunities
Start with Quick Start, then explore E-commerce Analytics for advanced usage!
ClickGraph supports flexible configuration via command-line arguments and environment variables:
```bash
# View all options
cargo run --bin clickgraph -- --help

# Custom ports
cargo run --bin clickgraph -- --http-port 8081 --bolt-port 7688

# Disable Bolt protocol (HTTP only)
cargo run --bin clickgraph -- --disable-bolt

# Custom host binding
cargo run --bin clickgraph -- --http-host 127.0.0.1 --bolt-host 127.0.0.1

# Configure CTE depth limit for variable-length paths (default: 100)
cargo run --bin clickgraph -- --max-cte-depth 150
export CLICKGRAPH_MAX_CTE_DEPTH=150  # Or via environment variable
```

See docs/configuration.md for complete configuration documentation.
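The CTE depth limit above applies to variable-length paths. As a rough illustration of what a bounded pattern such as `*1..2` means, the sketch below expands paths hop by hop over an in-memory edge list, in the same way a recursive query grows one step per iteration. It is plain Python, not how ClickGraph or ClickHouse evaluates the pattern; the `variable_length_paths` helper and the sample data are hypothetical, and skipping already-visited nodes is a simplification chosen so the expansion always terminates.

```python
from collections import defaultdict
from typing import Iterable, List, Tuple

Edge = Tuple[str, str]


def variable_length_paths(edges: Iterable[Edge], start: str,
                          min_hops: int, max_hops: int) -> List[List[str]]:
    """Return every path from `start` with min_hops..max_hops edges, never revisiting a node."""
    adjacency = defaultdict(list)
    for src, dst in edges:
        adjacency[src].append(dst)

    results: List[List[str]] = []
    frontier = [[start]]                     # paths found by the previous step
    for hops in range(1, max_hops + 1):      # one iteration per additional hop
        next_frontier = []
        for path in frontier:
            for neighbor in adjacency[path[-1]]:
                if neighbor in path:         # cycle guard (simplification)
                    continue
                next_frontier.append(path + [neighbor])
        if hops >= min_hops:                 # only keep paths inside the hop range
            results.extend(next_frontier)
        frontier = next_frontier
    return results


follows = [("alice", "bob"), ("bob", "carol"), ("carol", "dave"), ("alice", "carol")]
for path in variable_length_paths(follows, "alice", 1, 2):
    print(" -> ".join(path))
```

Here `max_hops` plays the role of the depth bound: the loop stops after that many expansion steps no matter how large the graph is.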
For Windows users, ClickGraph supports running in the background using PowerShell jobs:

```powershell
# Start server in background
.\start_server_background.ps1

# Check if server is running
Invoke-WebRequest -Uri "http://localhost:8080/health"

# Stop the server (replace JOB_ID with actual job ID shown)
Stop-Job -Id JOB_ID; Remove-Job -Id JOB_ID
```

Use the batch file to start the server in a separate command window:

```batch
start_server_background.bat
```

The server also supports a `--daemon` flag for Unix-like daemon behavior:

```bash
cargo run --bin clickgraph -- --daemon --http-port 8080
```

- Getting Started - Complete setup walkthrough and first queries
- Features Overview - Comprehensive feature list and capabilities
- API Documentation - HTTP REST API and Bolt protocol usage
- Configuration Guide - Server configuration and CLI options
- GraphView Model - Complete view-based graph analysis
- Test Infrastructure - Testing framework and validation
- Development Guide - Development workflow and architecture
- Development Process - 5-phase feature development workflow (START HERE!)
- Quick Reference - Cheat sheet for common development tasks
- Environment Setup - Pre-session checklist for developers
- Testing Guide - Comprehensive testing strategies
- Current Status - What works now, what's in progress
- Known Issues - Active bugs and limitations
- Original Brahmand Docs - Original project documentation
- Neo4j Cypher Manual - Cypher query language reference
- ClickHouse Documentation - ClickHouse database documentation
Preliminary informal tests on a MacBook Pro (M3 Pro, 18 GB RAM) running ClickGraph in Docker against a ~12 million-node Stack Overflow dataset show multi-hop traversals running approximately 10× faster than Neo4j v2025.03. These early, unoptimized results are for reference only; a full benchmark report is coming soon.
Latest Update: November 1, 2025 - 100% Benchmark Success Rate
- All Graph Query Types Working: 10/10 benchmark queries passing (100% success rate)
  - Simple node lookups and filtered scans
  - Direct and multi-hop relationship traversals
  - Variable-length paths with exact (`*2`) and range (`*1..3`) specifications
  - Shortest path algorithms with WHERE clause filtering
  - Aggregations with GROUP BY and ORDER BY
  - Bidirectional patterns (mutual relationships)
- Query Performance Metrics: Phase-by-phase timing with HTTP headers and structured logging
- Neo4j Bolt Protocol v4.4: Full compatibility with Neo4j drivers and tools
- PageRank Algorithm: Graph centrality analysis with `CALL pagerank(iterations: 10, damping: 0.85)`
- OPTIONAL MATCH: LEFT JOIN semantics for optional graph patterns with NULL handling
- Variable-Length Paths: Recursive CTEs with chained JOIN optimization for exact hop counts
- Shortest Path Functions: `shortestPath()` and `allShortestPaths()` with early termination
- Path Variables & Functions: `MATCH p = (a)-[*]->(b) RETURN length(p), nodes(p), relationships(p)`
- Multiple Relationship Types: `[:FOLLOWS|FRIENDS_WITH]` with UNION ALL SQL generation
- View-Based Graph Model: Transform existing tables to graphs via YAML configuration
- Dual Server Architecture: HTTP REST API and Bolt protocol simultaneously
- Comprehensive Testing: 312/312 tests passing (100% success rate)
- Flexible Configuration: CLI options, environment variables, Docker deployment
- Fixed: ChainedJoin CTE wrapper for exact hop queries (`*2`, `*3`)
- Fixed: Shortest path filter rewriting for WHERE clauses
- Fixed: Table name schema lookup for aggregation queries
- Validated: All fixes confirmed with production benchmark suite
- Read-Only Engine: Write operations (CREATE, SET, DELETE, MERGE) are not supported
- Schema warnings: Cosmetic warnings about internal catalog system (functionality unaffected)
- Memory vs MergeTree: Use Memory engine for development, MergeTree for persistent storage
- Docker permissions: May require volume permission fixes on some systems
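For readers unfamiliar with the two parameters in `CALL pagerank(iterations: 10, damping: 0.85)`, the following is a minimal textbook PageRank in Python showing what an iteration count and a damping factor control. It is a generic power-iteration sketch run on made-up data, not ClickGraph's implementation, and the way it spreads the rank of nodes without outgoing edges is one common convention rather than something this README specifies.

```python
from typing import Dict, Iterable, Tuple


def pagerank(edges: Iterable[Tuple[str, str]],
             iterations: int = 10, damping: float = 0.85) -> Dict[str, float]:
    """Textbook PageRank by power iteration over a directed edge list."""
    edges = list(edges)
    nodes = sorted({node for edge in edges for node in edge})
    if not nodes:
        return {}
    out_links = {node: [] for node in nodes}
    for src, dst in edges:
        out_links[src].append(dst)

    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}          # start from a uniform distribution
    for _ in range(iterations):
        # Rank held by nodes with no outgoing edges is spread evenly over all nodes.
        dangling = sum(rank[node] for node in nodes if not out_links[node])
        new_rank = {node: (1.0 - damping) / n + damping * dangling / n for node in nodes}
        for node in nodes:
            for target in out_links[node]:
                # Each node passes a damped, equal share of its rank along every out-edge.
                new_rank[target] += damping * rank[node] / len(out_links[node])
        rank = new_rank
    return rank


follows = [("alice", "bob"), ("carol", "bob"), ("dave", "bob")]
for user, score in sorted(pagerank(follows).items(), key=lambda item: -item[1]):
    print(f"{user}: {score:.4f}")
```

In this small example everyone follows `bob`, so `bob` ends up with the highest score. More iterations move the scores closer to their steady state, and a lower damping factor pulls all scores toward the uniform value.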
Tested with 1,000 users, 4,997 relationships on social_benchmark.yaml:
- Success Rate: 10/10 queries (100%)
- Performance: All query types executing correctly
- Documentation: See `notes/benchmarking.md` for detailed results
ClickGraph welcomes contributions! Key areas for development:
- Additional Cypher language features
- Query optimization improvements
- Neo4j compatibility enhancements
- Performance benchmarking
- Documentation improvements
ClickGraph is licensed under the Apache License, Version 2.0. See the LICENSE file for details.
This project is a fork of Brahmand with significant enhancements for Neo4j ecosystem compatibility and enterprise deployment capabilities.

