Merged
57 commits
9e16d08
feat: Neo4j graph backend plugin
m1rl0k Jan 13, 2026
795406b
Add Neo4j support for graph queries and edge deletion
m1rl0k Jan 13, 2026
64be33e
Add enhanced graph backend integration for RAG
m1rl0k Jan 14, 2026
3b02d57
Add callee_path attribute to GraphEdge
m1rl0k Jan 14, 2026
4cc49a7
Update ingest_adapter.py
m1rl0k Jan 14, 2026
c73d163
Add cross-file symbol resolution to graph ingestion
m1rl0k Jan 14, 2026
69ac721
Add language-aware builtin and stdlib resolution to edge extraction
m1rl0k Jan 14, 2026
c0f3446
Fix multi-hop via attribution and skip pseudo-paths
m1rl0k Jan 14, 2026
24acb02
Refactor symbol resolution and enhance Qdrant collection setup
m1rl0k Jan 15, 2026
64ba83e
Add env vars for lexical and pattern vector configs
m1rl0k Jan 15, 2026
20e0eb2
Add env vars for lexical and pattern vector configs
m1rl0k Jan 15, 2026
08f6785
Increase collection name hash length to 16 chars
m1rl0k Jan 15, 2026
eba8d69
Add detailed logging to delta bundle processing
m1rl0k Jan 15, 2026
0e5b84a
Revert "Increase collection name hash length to 16 chars"
m1rl0k Jan 15, 2026
189827c
Add NEO4J_GRAPH env variable to docker-compose
m1rl0k Jan 15, 2026
7bc7720
Add Neo4j plugin support and environment variables
m1rl0k Jan 15, 2026
624f53a
Add neo4j dependency for symbol_graph queries
m1rl0k Jan 15, 2026
f73b29b
Add Neo4j graph backend plugin and CLI integration
m1rl0k Jan 15, 2026
55b33ad
Refactor graph edge utilities and backend initialization
m1rl0k Jan 15, 2026
a9026a2
Enhance graph backends with language-aware import resolution and caching
m1rl0k Jan 15, 2026
3bdc78f
Add cache management utilities and TTL to symbol graph
m1rl0k Jan 15, 2026
d8cfd0a
Refactor Neo4j enablement checks and improve logging
m1rl0k Jan 15, 2026
10a2180
Add circuit breaker and driver cleanup to Neo4j backend
m1rl0k Jan 15, 2026
21516c1
Update symbol exports and workspace imports
m1rl0k Jan 15, 2026
ad83776
Add cache clearing functions and refactor symbol extraction
m1rl0k Jan 15, 2026
f363f93
Sanitize graph traversal depth in Cypher queries
m1rl0k Jan 15, 2026
4888e7d
Add batch shortest path and optimize graph reranking
m1rl0k Jan 15, 2026
ae116c9
Add detailed licensing info for Neo4j plugin
m1rl0k Jan 15, 2026
14d11ff
Add automatic Neo4j backfill from Qdrant
m1rl0k Jan 15, 2026
c8185cc
Fix Cypher path length parameterization in Neo4j queries
m1rl0k Jan 15, 2026
3b618ed
Include symbol node in cycle query WITH clause
m1rl0k Jan 15, 2026
d3131cc
Include repo info in graph output formatting
m1rl0k Jan 15, 2026
d650e16
Enhance import extraction to include symbols and modules
m1rl0k Jan 15, 2026
3a4d95a
Enhance import extraction to include leaf symbols
m1rl0k Jan 15, 2026
85f3995
Add support for new languages in tree_sitter
m1rl0k Jan 15, 2026
ad17865
Improve Kotlin and PHP import extraction logic
m1rl0k Jan 15, 2026
78bb7d2
Refactor reranker warmup to use centralized factory
m1rl0k Jan 15, 2026
9ee2db0
Add tree-sitter availability checks to language coverage tests
m1rl0k Jan 15, 2026
790423f
Merge branch 'neo4j-graph-backend' of https://github.com/m1rl0k/Conte…
m1rl0k Jan 15, 2026
ed931c3
Update ci.yml
m1rl0k Jan 15, 2026
4e08060
Update test_language_coverage.py
m1rl0k Jan 15, 2026
897ce16
Make language coverage tests robust to missing tree-sitter features
m1rl0k Jan 15, 2026
8610e0d
Add plugins directory to indexer Docker image
m1rl0k Jan 15, 2026
f1c6389
Add thread safety and timeouts to Neo4j graph operations
m1rl0k Jan 15, 2026
e2a201d
Fix path and repo field handling in Neo4j and Qdrant backends
m1rl0k Jan 15, 2026
60b413b
Improve collection missing cache with TTL and regex escaping
m1rl0k Jan 15, 2026
cf8eecf
Refactor collection missing checks with TTL expiry
m1rl0k Jan 15, 2026
49b6825
Document neo4j_graph_query for advanced code graph queries
m1rl0k Jan 15, 2026
d3aa4e8
Add parallel file indexing to index_repo function
m1rl0k Jan 16, 2026
1b785d4
Add async support to Neo4jGraphBackend
m1rl0k Jan 16, 2026
6fe3073
Refactor Neo4j queries to use async execution
m1rl0k Jan 16, 2026
dea3937
Make Neo4j circuit breaker thread-safe and fix delete count
m1rl0k Jan 16, 2026
93f19e8
Add per-intent ML confidence thresholds
m1rl0k Jan 16, 2026
4ccc4cf
Update intent_classifier.py
m1rl0k Jan 16, 2026
686ec0b
Add unit tests for embedding, ranking, Qdrant, Neo4j, rerank, and sem…
m1rl0k Jan 16, 2026
fe47eac
Optimize smart reindexing and intent threshold logic
m1rl0k Jan 16, 2026
ad48759
Explicitly commit transaction in async Neo4j session
m1rl0k Jan 16, 2026
1 change: 1 addition & 0 deletions .github/workflows/ci.yml
@@ -55,6 +55,7 @@ jobs:
echo "QDRANT_URL=http://localhost:6333" >> $GITHUB_ENV
echo "PYTHONPATH=${{ github.workspace }}/scripts:$PYTHONPATH" >> $GITHUB_ENV
echo "CI=true" >> $GITHUB_ENV
echo "USE_TREE_SITTER=1" >> $GITHUB_ENV
# Note: COLLECTION_NAME is intentionally NOT set globally.
# Integration tests set their own unique collection names.
# Unit tests mock Qdrant and don't need a real collection.
3 changes: 3 additions & 0 deletions .skills/mcp-tool-selection/SKILL.md
@@ -43,6 +43,9 @@ grep -rn "REDIS_HOST" . # Exact environment variable
|--------------|------|
| "Where is X implemented?" | MCP `repo_search` |
| "Who calls this and show code?" | MCP `symbol_graph` (hydrated w/ snippets) |
| "Callers of callers? Multi-hop?" | MCP `neo4j_graph_query` (transitive_callers, depth=2) |
| "What breaks if I change X?" | MCP `neo4j_graph_query` (impact, depth=2) |
| "Circular dependencies?" | MCP `neo4j_graph_query` (cycles) |
| "How does authentication work?" | MCP `context_answer` |
| "High-level module overview?" | MCP `info_request` (with explanations) |
| "Does REDIS_HOST exist?" | Literal grep |
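
To make the three `neo4j_graph_query` rows concrete, here is a minimal in-memory sketch of what "transitive callers at depth=2" and "cycles" mean on a call graph. It is not the plugin's implementation (which, per the commits above, runs Cypher against Neo4j); the toy graph, `transitive_callers`, and `find_cycle` are hypothetical names chosen for illustration.

```python
from collections import deque

# Toy call graph: caller -> set of callees. All symbol names are hypothetical.
CALLS = {
    "cli_main": {"handle_request"},
    "handle_request": {"load_config", "normalize_path"},
    "load_config": {"normalize_path"},
    "normalize_path": {"split_parts"},
    "split_parts": set(),
    "a": {"b"},
    "b": {"a"},
}


def transitive_callers(graph, symbol, depth):
    """Return {caller: hops} for every caller within `depth` hops of `symbol`."""
    # Invert the edges once so we can walk from callee back to caller.
    callers_of = {}
    for caller, callees in graph.items():
        for callee in callees:
            callers_of.setdefault(callee, set()).add(caller)

    found = {}
    queue = deque([(symbol, 0)])
    while queue:
        node, hops = queue.popleft()
        if hops == depth:
            continue  # do not expand past the requested depth
        for caller in callers_of.get(node, ()):
            if caller != symbol and caller not in found:
                found[caller] = hops + 1
                queue.append((caller, hops + 1))
    return found


def find_cycle(graph):
    """Return one cycle as a list of symbols, or None if the graph is acyclic."""
    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on the current path, finished
    colour = {node: WHITE for node in graph}
    path = []

    def visit(node):
        colour[node] = GREY
        path.append(node)
        for nxt in sorted(graph.get(node, ())):
            state = colour.get(nxt, WHITE)
            if state == GREY:  # back edge: nxt is already on the current path
                return path[path.index(nxt):] + [nxt]
            if state == WHITE:
                cycle = visit(nxt)
                if cycle:
                    return cycle
        path.pop()
        colour[node] = BLACK
        return None

    for node in sorted(graph):
        if colour[node] == WHITE:
            cycle = visit(node)
            if cycle:
                return cycle
    return None
```

On this toy graph, `transitive_callers(CALLS, "normalize_path", 2)` reports `handle_request` and `load_config` at one hop and `cli_main` at two hops, which is the "callers of callers" case; an "impact" query is the same walk read as "everything that may break". `find_cycle(CALLS)` returns `["a", "b", "a"]`.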
2 changes: 2 additions & 0 deletions Dockerfile.indexer
@@ -30,6 +30,8 @@ ENV RERANKER_ONNX_PATH=/app/models/reranker.onnx \
# Bake scripts into the image so we can mount arbitrary code at /work
COPY scripts /app/scripts

# Copy plugins for Neo4j graph backend and other extensions
COPY plugins /app/plugins

WORKDIR /work

3 changes: 3 additions & 0 deletions Dockerfile.mcp-indexer
@@ -35,6 +35,9 @@ COPY scripts /app/scripts

COPY bench /app/bench

# Copy plugins for Neo4j graph backend and other extensions
COPY plugins /app/plugins

# Expose SSE port for this companion server
EXPOSE 8001

37 changes: 37 additions & 0 deletions LICENSE
@@ -26,6 +26,43 @@ Change License: GNU General Public License v2.0 or later (GPL-2.0-or-later

-----------------------------------------------------------------------------

Plugin Licensing

NEO4J KNOWLEDGE GRAPH PLUGIN (plugins/neo4j_graph):

PERMITTED WITHOUT LICENSE:
- Local development and testing on developer machines
- Evaluation and proof-of-concept work
- Non-commercial personal projects

COMMERCIAL LICENSE REQUIRED FOR:
- Production deployments
- Commercial use (revenue-generating applications)
- Redistribution or inclusion in commercial products
- Use in hosted/managed services

To obtain a commercial license, contact:
Email: mirlok89@gmail.com

THIRD-PARTY DEPENDENCY - NEO4J DATABASE:
The plugin requires Neo4j database software, which is licensed separately
by Neo4j, Inc. You must independently comply with Neo4j's licensing terms:

- Neo4j Community Edition: GPL v3 License
- Neo4j Enterprise Edition: Commercial License (contact Neo4j, Inc.)
- Neo4j AuraDB (Cloud): Subscription agreement with Neo4j, Inc.

See https://neo4j.com/licensing/ for Neo4j licensing details.

This License grants no rights to Neo4j software or any Neo4j trademarks.
Neo4j® is a registered trademark of Neo4j, Inc.

OPEN CORE COMPONENTS:
The default Qdrant-based graph backend (scripts/graph_backends/) is included
in the base Licensed Work and requires no additional plugin license.

-----------------------------------------------------------------------------

Terms

The Licensor hereby grants you the right to copy, modify, create derivative
89 changes: 89 additions & 0 deletions docker-compose.neo4j.yml
@@ -0,0 +1,89 @@
# Neo4j Graph Database - Optional SaaS-Ready Graph Backend
# Usage: docker compose -f docker-compose.yml -f docker-compose.neo4j.yml up -d
#
# This adds Neo4j as an alternative graph backend for symbol relationships.
# Provides advanced graph traversals, path finding, and relationship analytics.
#
# Enable via: NEO4J_GRAPH=1 in your .env file
#
# Browser: http://localhost:7474 (login: neo4j / contextengine)
# Bolt: bolt://localhost:7687

services:
  # Neo4j Graph Database
  neo4j:
    image: neo4j:5.15-community
    container_name: neo4j-graph
    ports:
      - "7474:7474" # HTTP browser interface
      - "7687:7687" # Bolt protocol
    volumes:
      - neo4j_data:/data
      - neo4j_logs:/logs
    environment:
      - NEO4J_AUTH=neo4j/${NEO4J_PASSWORD:-contextengine}
      - NEO4J_PLUGINS=["apoc"]
      - NEO4J_dbms_security_procedures_unrestricted=apoc.*
      - NEO4J_dbms_memory_pagecache_size=512M
      - NEO4J_dbms_memory_heap_max__size=512M
    healthcheck:
      test: ["CMD-SHELL", "wget --no-verbose --tries=1 --spider http://localhost:7474 || exit 1"]
      interval: 10s
      timeout: 5s
      retries: 5
      start_period: 30s
    networks:
      - dev-remote-network

  # Enable Neo4j graph backend for indexer services
  indexer:
    environment:
      - NEO4J_GRAPH=1
      - NEO4J_URI=bolt://neo4j:7687
      - NEO4J_USER=neo4j
      - NEO4J_PASSWORD=${NEO4J_PASSWORD:-contextengine}
      - NEO4J_DATABASE=neo4j
    depends_on:
      neo4j:
        condition: service_healthy

  watcher:
    environment:
      - NEO4J_GRAPH=1
      - NEO4J_URI=bolt://neo4j:7687
      - NEO4J_USER=neo4j
      - NEO4J_PASSWORD=${NEO4J_PASSWORD:-contextengine}
      - NEO4J_DATABASE=neo4j
    depends_on:
      neo4j:
        condition: service_healthy

  # Enable Neo4j for MCP indexer services (symbol_graph queries)
  mcp_indexer:
    environment:
      - NEO4J_GRAPH=1
      - NEO4J_URI=bolt://neo4j:7687
      - NEO4J_USER=neo4j
      - NEO4J_PASSWORD=${NEO4J_PASSWORD:-contextengine}
      - NEO4J_DATABASE=neo4j
    depends_on:
      neo4j:
        condition: service_healthy

  mcp_indexer_http:
    environment:
      - NEO4J_GRAPH=1
      - NEO4J_URI=bolt://neo4j:7687
      - NEO4J_USER=neo4j
      - NEO4J_PASSWORD=${NEO4J_PASSWORD:-contextengine}
      - NEO4J_DATABASE=neo4j
    depends_on:
      neo4j:
        condition: service_healthy

volumes:
  neo4j_data:
    driver: local
  neo4j_logs:
    driver: local
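
The overlay above hands the same five `NEO4J_*` variables to each service. As a rough sketch of how a consumer might read them (not the plugin's actual loader), the snippet below treats `NEO4J_GRAPH` as the on/off switch and falls back to the defaults shown in this file. `Neo4jSettings`, `load_neo4j_settings`, and the set of accepted truthy spellings are assumptions made for illustration.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Neo4jSettings:
    uri: str
    user: str
    password: str
    database: str


# Assumed truthy spellings; the compose files only ever set NEO4J_GRAPH=1.
_TRUTHY = {"1", "true", "yes", "on"}


def load_neo4j_settings(env: Optional[Mapping[str, str]] = None) -> Optional[Neo4jSettings]:
    """Return connection settings, or None when the Neo4j backend is disabled."""
    env = os.environ if env is None else env
    if env.get("NEO4J_GRAPH", "").strip().lower() not in _TRUTHY:
        return None
    # `or` (rather than a .get default) also covers empty strings, which is
    # what `${NEO4J_URI:-}` in docker-compose.yml yields when unset.
    return Neo4jSettings(
        uri=env.get("NEO4J_URI") or "bolt://localhost:7687",
        user=env.get("NEO4J_USER") or "neo4j",
        password=env.get("NEO4J_PASSWORD") or "contextengine",
        database=env.get("NEO4J_DATABASE") or "neo4j",
    )
```

Returning `None` when the flag is unset lets the caller keep using the default Qdrant-based graph backend without any Neo4j import or connection attempt.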

24 changes: 23 additions & 1 deletion docker-compose.yml
@@ -71,6 +71,13 @@ services:
- RERANKER_WEIGHTS_DIR=/tmp/rerank_weights
- RERANK_EVENTS_DIR=/tmp/rerank_events
- RERANK_EVENTS_ENABLED=${RERANK_EVENTS_ENABLED:-1}
# Lexical sparse vectors for lossless term matching
- LEX_SPARSE_MODE=${LEX_SPARSE_MODE:-}
- LEX_SPARSE_NAME=${LEX_SPARSE_NAME:-}
- LEX_SPARSE_IDF=${LEX_SPARSE_IDF:-1}
# Pattern vectors for structural code similarity
- PATTERN_VECTORS=${PATTERN_VECTORS:-}
- PATTERN_VECTOR_DIM=${PATTERN_VECTOR_DIM:-64}
ports:
- "18000:18000"
- "8000:8000"
@@ -150,6 +157,12 @@ services:
- LEX_SPARSE_NAME=${LEX_SPARSE_NAME:-}
# Pattern vectors for structural code similarity
- PATTERN_VECTORS=${PATTERN_VECTORS:-}
# Neo4j graph backend for symbol_graph queries
- NEO4J_GRAPH=${NEO4J_GRAPH:-}
- NEO4J_URI=${NEO4J_URI:-}
- NEO4J_USER=${NEO4J_USER:-}
- NEO4J_PASSWORD=${NEO4J_PASSWORD:-}
- NEO4J_DATABASE=${NEO4J_DATABASE:-}
# Cross-encoder reranker configuration
- RERANKER_MODEL=${RERANKER_MODEL:-}
- RERANKER_ONNX_PATH=${RERANKER_ONNX_PATH:-}
@@ -272,6 +285,13 @@ services:
- RERANKER_WEIGHTS_DIR=/tmp/rerank_weights
- RERANK_EVENTS_DIR=/tmp/rerank_events
- RERANK_EVENTS_ENABLED=${RERANK_EVENTS_ENABLED:-1}
# Lexical sparse vectors for lossless term matching
- LEX_SPARSE_MODE=${LEX_SPARSE_MODE:-}
- LEX_SPARSE_NAME=${LEX_SPARSE_NAME:-}
- LEX_SPARSE_IDF=${LEX_SPARSE_IDF:-1}
# Pattern vectors for structural code similarity
- PATTERN_VECTORS=${PATTERN_VECTORS:-}
- PATTERN_VECTOR_DIM=${PATTERN_VECTOR_DIM:-64}
ports:
- "${FASTMCP_HTTP_HEALTH_PORT:-18002}:18000"
- "${FASTMCP_HTTP_PORT:-8002}:8000"
@@ -472,7 +492,7 @@ services:
- QWEN3_QUERY_INSTRUCTION=${QWEN3_QUERY_INSTRUCTION:-1}
- QWEN3_INSTRUCTION_TEXT=${QWEN3_INSTRUCTION_TEXT}
- WATCH_ROOT=${WATCH_ROOT:-/work}
# - WATCH_USE_POLLING=${WATCH_USE_POLLING:-1} SET on MAC OSx
- WATCH_USE_POLLING=${WATCH_USE_POLLING:-1} # Set on macOS
- HOST_INDEX_PATH=/work
- QDRANT_TIMEOUT=${QDRANT_TIMEOUT:-60}
# Chunking config - use ${VAR:-} to properly inherit from .env (not host shell)
@@ -498,6 +518,8 @@ services:
- INDEX_GRAPH_EDGES=${INDEX_GRAPH_EDGES:-1}
- INDEX_GRAPH_EDGES_MODE=${INDEX_GRAPH_EDGES_MODE:-symbol}
- GRAPH_BACKFILL_ENABLED=${GRAPH_BACKFILL_ENABLED:-1}
# Neo4j graph backend (when set, edges go to Neo4j instead of Qdrant _graph collection)
- NEO4J_GRAPH=${NEO4J_GRAPH:-}
volumes:
- workspace_pvc:/work:rw
- codebase_pvc:/work/.codebase:rw
6 changes: 6 additions & 0 deletions docs/CLAUDE.example.md
@@ -100,6 +100,12 @@ These rules are NOT optional - favor qdrant-indexer tooling at all costs over ex
- Use for: structural navigation (callers, definitions, importers).
- Think: "who calls this function?", "where is this class defined?".
- **Note**: Results are "hydrated" with ~500-char source snippets for immediate context.
- Supports `depth` for multi-hop traversals (depth=2 = callers of callers).
- neo4j_graph_query:
- Use for: advanced graph traversals that grep CANNOT do.
- Query types: `callers`, `callees`, `transitive_callers`, `transitive_callees`, `impact`, `dependencies`, `cycles`.
- Think: "what would break if I change X?" (impact), "callers of callers" (transitive_callers), "circular deps?" (cycles).
- Example: `neo4j_graph_query(symbol="normalize_path", query_type="impact", depth=2)` → finds all code that would break.
- info_request:
- Use for: rapid broad discovery and architectural overviews.
- Good for: "how does the reranker work?", "overview of database modules".
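
The `depth` argument mentioned above ends up as the upper bound of a variable-length Cypher pattern, and commits f363f93 and c8185cc in this PR deal with sanitizing and parameterizing it. Variable-length bounds are normally written as integer literals rather than query parameters, so one plausible approach is to clamp the value to a small integer before formatting it into the query text while still passing the symbol name as a real parameter. The sketch below illustrates that idea only; `sanitize_depth`, `transitive_callers_query`, the `:Symbol` label, the `:CALLS` relationship type, and the cap of 5 are all assumptions, not the plugin's code.

```python
def sanitize_depth(value, default=2, maximum=5):
    """Coerce `value` to an int in [1, maximum]; fall back to `default`."""
    try:
        depth = int(value)
    except (TypeError, ValueError, OverflowError):
        return default
    return max(1, min(depth, maximum))


def transitive_callers_query(depth):
    """Build a Cypher string for a bounded caller traversal.

    Only the sanitized integer is interpolated into the text; the symbol
    name stays a $symbol parameter supplied at execution time.
    """
    hops = sanitize_depth(depth)
    # :Symbol and :CALLS are hypothetical schema names for this sketch.
    return (
        f"MATCH (caller:Symbol)-[:CALLS*1..{hops}]->(target:Symbol {{name: $symbol}}) "
        "RETURN DISTINCT caller.name AS caller"
    )
```

Because anything that does not parse as an integer falls back to the default, a value such as `"2; DROP"` never reaches the query text, and an oversized depth is capped rather than triggering an unbounded traversal.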