Multi granular #184

Merged
m1rl0k merged 12 commits into test from multi-granular on Jan 18, 2026

Conversation

m1rl0k (Collaborator) commented Jan 18, 2026

No description provided.

Implements multi-granular embeddings (entity_dense and relation_dense) for improved code retrieval, including two-stage prefetch search in Qdrant, fusion logic in ranking, and ingestion pipeline changes to generate and store new vectors. Updates configuration and collection management to support the new vector types, enabling more precise and graph-aware retrieval as outlined in the project plan.
Introduces adaptive weighting and fusion for entity and relation dense vectors in hybrid search, based on query characteristics. Updates ranking logic to detect graph-oriented queries and adjust weights accordingly. Enables multi-granular vector search and fusion in hybrid_search.py, and aligns relation dense vector dimension with embedding model output in config.py.
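
As a rough illustration of the adaptive weighting described in these commit messages, the sketch below flags graph-oriented queries with a keyword heuristic and shifts weight toward the relation vector. The cue words, default weights, boost factor, and function names are all assumptions made for this example; the actual detection logic in hybrid_search.py is not shown in this conversation.

```python
import re

# Hypothetical cue words that suggest a dependency- or graph-style query.
GRAPH_CUES = {"calls", "callers", "imports", "depends", "dependencies", "references"}


def is_graph_query(query: str) -> bool:
    """Return True if the query contains any graph-oriented cue word."""
    tokens = set(re.findall(r"[a-z_]+", query.lower()))
    return bool(tokens & GRAPH_CUES)


def adaptive_weights(query, entity_weight=0.3, relation_weight=0.2, boost=2.0):
    """Boost the relation weight for graph-oriented queries (assumed policy)."""
    if is_graph_query(query):
        relation_weight *= boost
    return {"entity_dense": entity_weight, "relation_dense": relation_weight}
```

In a setup like this, the defaults would plausibly come from settings such as HYBRID_ENTITY_DENSE_WEIGHT and HYBRID_RELATION_DENSE_WEIGHT rather than being hard-coded.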

augmentcode bot commented Jan 18, 2026

🤖 Augment PR Summary

Summary: Adds optional multi-granular retrieval (entity + relation embeddings) and improves graph/CLI ergonomics to better handle symbol- and dependency-oriented queries.

Changes:

  • Adds multi-granular vector configuration (MULTI_GRANULAR_VECTORS, ENTITY_DENSE_*, RELATION_DENSE_*) and extends Qdrant collection setup to create/update these vector slots.
  • Extends ingest pipeline to generate “entity” (signature/doc) and “relation” (calls/imports) texts, embed them, and upsert as additional vectors when supported by the collection.
  • Adds a two-stage Qdrant query path using query_points prefetch and fuses entity/relation results into hybrid scoring via RRF (fuse_multi_granular_scores).
  • Enhances adaptive weighting to include graph intent detection and dynamic entity/relation weights (HYBRID_ENTITY_DENSE_WEIGHT, HYBRID_RELATION_DENSE_WEIGHT).
  • Neo4j graph: enriches Symbol nodes on edge upsert, adds a simple in-degree-based “pagerank” approximation, and uses COALESCE in queries to avoid warnings from missing optional properties.
  • CLI: improves reset/start orchestration (optional Neo4j compose, conditional llamacpp, cache clearing, detached indexer), normalizes host Qdrant URL usage, and makes ctx status work from any directory via docker ps label filtering.

Technical Notes: Multi-granular querying falls back to standard dense search if prefetch isn’t supported (older Qdrant) or vectors aren’t available; ingest gates vector writes on the collection’s advertised vector names to avoid upsert errors.
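
The gating mentioned in the technical notes can be pictured as a simple filter over vector names. This is a minimal sketch, assuming the collection's vector names have already been fetched into a plain collection of strings; the helper names are hypothetical and this is not the code in pipeline.py.

```python
def multi_granular_enabled(collection_vector_names,
                           required=("entity_dense", "relation_dense")):
    """True only if the collection advertises every required vector slot."""
    return all(name in collection_vector_names for name in required)


def select_supported_vectors(candidate_vectors, collection_vector_names):
    """Drop vectors whose slot does not exist in the collection."""
    supported = set(collection_vector_names)
    return {name: vec for name, vec in candidate_vectors.items() if name in supported}
```

Filtering before the upsert means a collection created without the new slots still accepts the remaining vectors instead of rejecting the whole point.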

@augmentcode augmentcode bot left a comment

Review completed. 2 suggestions posted.

Comment augment review to trigger a new review at any time.

Introduced get_qdrant_url_for_host to ensure the CLI uses localhost instead of Docker's internal hostname for Qdrant connections. Updated graph and reset commands to use this helper. Enhanced the reset command to clear local and dev-workspace cache files and symbols directories before running the indexer.
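
The body of get_qdrant_url_for_host is not shown here, so the following is only a sketch of the idea: rewrite a Docker-internal hostname to localhost while keeping the scheme, port, and path. The function name `to_host_url` and the internal service name `qdrant` are assumptions for illustration.

```python
from urllib.parse import urlsplit, urlunsplit


def to_host_url(url: str, internal_hosts=("qdrant",)) -> str:
    """Replace a Docker-internal hostname with localhost; leave other URLs alone."""
    parts = urlsplit(url)
    if parts.hostname not in internal_hosts:
        return url
    # Preserve an explicit port if one was given.
    netloc = "localhost" if parts.port is None else f"localhost:{parts.port}"
    return urlunsplit((parts.scheme, netloc, parts.path, parts.query, parts.fragment))
```

For example, `to_host_url("http://qdrant:6333")` yields `http://localhost:6333`, while a URL that already points at localhost or at a remote host is returned unchanged.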
The reset command now stops services and removes volumes with 'docker-compose down -v --remove-orphans'. The indexer is started in detached mode to avoid blocking the CLI, and users are informed how to monitor its logs.
Replaces 'docker compose ps' with 'docker ps' using label filters to allow status checks from any directory, not just where docker-compose.yml is present. Adapts output parsing to maintain compatibility with previous format.
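
A status check along these lines can be sketched as follows. It filters on the `com.docker.compose.project` label that Compose attaches to its containers and asks for one JSON object per line. The helper names are hypothetical, and the project name is assumed to be known to the caller; the CLI's actual parsing code is not shown in this conversation.

```python
import json


def docker_ps_command(project: str) -> list[str]:
    """Build a `docker ps` call that works from any directory."""
    return [
        "docker", "ps",
        "--filter", f"label=com.docker.compose.project={project}",
        "--format", "{{json .}}",
    ]


def parse_docker_ps(output: str) -> list[dict]:
    """Parse line-delimited JSON emitted by `docker ps --format '{{json .}}'`."""
    rows = []
    for line in output.splitlines():
        line = line.strip()
        if line:
            rows.append(json.loads(line))
    return rows
```

The command list can be passed to `subprocess.run(..., capture_output=True, text=True)` and its stdout handed to `parse_docker_ps`. Because `docker ps` lists only running containers by default, no compose file needs to be present in the working directory.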
Refactors hybrid_search.py to apply RRF fusion per-query for multi-granular vectors, ensuring correct ranking semantics. Updates pipeline.py to only enable multi-granular vector indexing if the collection supports the required vectors, and adds logic to batch embed and store entity and relation vectors during ingestion.
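
Per-query RRF fusion can be sketched as below. This uses the standard reciprocal rank fusion formula, where each ranked list contributes 1 / (k + rank) to a document's score, and then sums the fused scores across queries. The constant k = 60, the unweighted sum, and the function names are assumptions; how fuse_multi_granular_scores actually weights its inputs is not stated here.

```python
from collections import defaultdict


def rrf_fuse(ranked_lists, k=60):
    """Reciprocal rank fusion over several ranked lists of document ids."""
    scores = defaultdict(float)
    for ids in ranked_lists:
        for rank, doc_id in enumerate(ids, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    return dict(scores)


def fuse_per_query(results_by_query, k=60):
    """Fuse entity/relation rankings within each query, then sum across queries."""
    total = defaultdict(float)
    for per_vector in results_by_query:
        # per_vector maps a vector name to that query's ranked ids.
        for doc_id, score in rrf_fuse(list(per_vector.values()), k=k).items():
            total[doc_id] += score
    return sorted(total.items(), key=lambda item: item[1], reverse=True)
```

Ranks are only meaningful within a single query's result list, which is why fusion is applied per query before scores are combined.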

m1rl0k (Collaborator, Author) commented Jan 18, 2026

augment review

@augmentcode augmentcode bot left a comment

Review completed. 2 suggestions posted.

Updated Docker service status check to use Go template JSON output for broader compatibility and improved parsing robustness. In hybrid_search.py, adjusted multi_granular_query call to ignore unused mg_results, clarifying intent for RRF fusion.
Update docker-compose.neo4j.yml for Neo4j 5.x memory settings. Enhance backend to set Symbol node properties on creation and update, and add a simple degree-based pagerank computation without requiring GDS. Use COALESCE in Cypher queries to avoid warnings for missing properties. Update CLI status command to only show running containers.
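
To make the degree-based approximation concrete, here is a small in-memory sketch that scores each node by its in-degree, normalized by the largest in-degree in the graph. The normalization and the function name are assumptions; the backend computes its score inside Neo4j, and that query is not shown in this conversation.

```python
from collections import Counter


def in_degree_rank(edges):
    """Score nodes by in-degree, scaled to the range 0..1.

    `edges` is an iterable of (source, target) pairs, e.g. CALLS or IMPORTS.
    """
    edges = list(edges)
    nodes = {node for edge in edges for node in edge}
    in_deg = Counter(target for _, target in edges)
    max_deg = max(in_deg.values(), default=0)
    if max_deg == 0:
        return {node: 0.0 for node in nodes}
    return {node: in_deg[node] / max_deg for node in nodes}
```

Unlike true PageRank, this ignores the importance of the referencing nodes, which is the trade-off for not needing the Graph Data Science plugin.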

m1rl0k (Collaborator, Author) commented Jan 18, 2026

augment review

@augmentcode augmentcode bot left a comment

Review completed. 3 suggestions posted.

Refines score accumulation in hybrid ranking to sum entity and relation scores for multi-query scenarios, ensuring accurate contribution tracking. Updates Docker Compose project name handling in status command for better compatibility with custom project names. Enhances Cypher query in Neo4j backend to prefer more specific start_line values and improves language field merging. Adds entity_dense and relation_dense fields to hybrid search result components for better transparency.

m1rl0k (Collaborator, Author) commented Jan 18, 2026

augment review

@augmentcode augmentcode bot left a comment

Review completed. 2 suggestions posted.

Changed the IMPORTS upsert logic to prefer the incoming language value for consistency with CALLS upsert. Updated the reset CLI command to forcibly remove any stale indexer container before running a new one with --rm for idempotency.
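
The idempotent restart described in the last sentence can be sketched as two commands: force-remove any leftover container, then start a fresh one that cleans itself up on exit. Plain `docker` commands are used here as an assumption, and the container and image names are placeholders; the CLI's actual invocation is not shown in this conversation.

```python
def reset_indexer_commands(container: str, image: str) -> list[list[str]]:
    """Return the commands for an idempotent, detached indexer start."""
    return [
        # Remove a stale container with the same name, if one exists.
        ["docker", "rm", "-f", container],
        # Start detached; --rm removes the container once it exits.
        ["docker", "run", "--rm", "-d", "--name", container, image],
    ]
```

When executing these, the first command is best run without raising on failure, since there may be no stale container to remove.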

m1rl0k merged commit aa8aa25 into test on Jan 18, 2026
1 check passed