feat(graph,mcp): annotate_nodes — write external metadata back onto graph nodes#162

Open
avfirsov wants to merge 1 commit into
zzet:mainfrom
avfirsov:feat/graph-annotate-nodes

@avfirsov

Contributor

What

Adds a sanctioned, additive write-back path for external metadata so a downstream tool — an LLM enrichment pass, an analysis stage, an editor extension — can attach its own metadata to existing graph nodes without re-indexing and without racing the sharded in-memory store.

Today there is no safe way for a caller holding a graph.Store (rather than a concrete *Graph) to mutate Node.Meta: the shard lock the in-memory backend takes is private, so reaching into GetNode(id).Meta directly races the sharded map and panics on a live daemon. This PR closes that gap.

How

  • graph.Store.MergeNodeMeta(id, kv) (changed, found) — the only sanctioned Node.Meta mutation path for a Store holder. Additive and idempotent (deep-equal delta via the pure metaDelta helper), taken under the node's shard write lock, and it never touches structural fields (id / kind / name / path / lines). Implemented for both the in-memory and SQLite backends and covered by a shared store-conformance case so the two behave identically.
  • annotate_nodes MCP tool (also served at POST /v1/tools/annotate_nodes through the shared registry — no separate HTTP code): merges a free-form per-node meta map under a caller-chosen namespace (default ext). Every key is stored as <namespace>_<key>, so an annotation can never shadow an indexer-owned Meta key. Optionally adds idempotent semantically_related edges between node pairs. Returns {annotated, unchanged, not_found, edges_added}.
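The namespacing rule described above can be sketched as follows. This is only an illustration, not the PR's code: `namespacedKey` and `prefixMeta` are hypothetical names, and the choice to leave an already-prefixed key untouched is an assumption drawn from the "no double-prefix" test case listed under Tests.

```go
package main

import (
	"fmt"
	"strings"
)

// namespacedKey is a hypothetical helper. It returns "<namespace>_<key>"
// and leaves the key alone when it already carries that prefix, so a
// re-run does not produce "ext_ext_summary".
func namespacedKey(namespace, key string) string {
	if namespace == "" {
		namespace = "ext" // default namespace named in the PR description
	}
	prefix := namespace + "_"
	if strings.HasPrefix(key, prefix) {
		return key
	}
	return prefix + key
}

// prefixMeta (also hypothetical) returns a new map with every key
// namespaced. The caller's map is not modified.
func prefixMeta(namespace string, meta map[string]any) map[string]any {
	out := make(map[string]any, len(meta))
	for k, v := range meta {
		out[namespacedKey(namespace, k)] = v
	}
	return out
}

func main() {
	fmt.Println(namespacedKey("", "summary"))        // ext_summary
	fmt.Println(namespacedKey("llm", "llm_summary")) // llm_summary
	fmt.Println(prefixMeta("llm", map[string]any{"score": 0.9}))
}
```

Because every stored key starts with the namespace, an annotation such as `kind` would land as `ext_kind` and could not overwrite an indexer-owned `kind` entry in Meta.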

The merge is deliberately scoped: never deletes keys, never mutates structural data, non-fatal per item (an unknown id is recorded in not_found, the batch continues).
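A minimal sketch of the additive, idempotent merge may help make the contract concrete. It assumes a simplified `Node` type and a single mutex standing in for the node's shard lock. The real `graph.Store` interface, the sharding scheme, and the actual `metaDelta` signature are not shown in this PR description, so everything below other than the names `MergeNodeMeta` and `metaDelta` is hypothetical.

```go
package main

import (
	"fmt"
	"reflect"
	"sync"
)

// Node is a simplified stand-in for the graph node type.
type Node struct {
	ID   string
	Meta map[string]any
}

// metaDelta returns the entries of kv that are missing from cur or whose
// value differs under reflect.DeepEqual. It is pure: neither map is modified.
func metaDelta(cur, kv map[string]any) map[string]any {
	delta := make(map[string]any)
	for k, v := range kv {
		if old, ok := cur[k]; ok && reflect.DeepEqual(old, v) {
			continue // already present with an equal value
		}
		delta[k] = v
	}
	return delta
}

// memStore is a toy store; mu stands in for the per-shard write lock.
type memStore struct {
	mu    sync.Mutex
	nodes map[string]*Node
}

// MergeNodeMeta adds or overwrites keys from kv, never deletes a key, and
// touches only Meta. It reports whether anything changed and whether the
// node exists.
func (s *memStore) MergeNodeMeta(id string, kv map[string]any) (changed, found bool) {
	s.mu.Lock()
	defer s.mu.Unlock()

	n, ok := s.nodes[id]
	if !ok {
		return false, false
	}
	delta := metaDelta(n.Meta, kv)
	if len(delta) == 0 {
		return false, true // idempotent re-run: nothing to write
	}
	if n.Meta == nil {
		n.Meta = make(map[string]any, len(delta)) // lazy Meta init
	}
	for k, v := range delta {
		n.Meta[k] = v
	}
	return true, true
}

func main() {
	s := &memStore{nodes: map[string]*Node{"a": {ID: "a"}}}
	kv := map[string]any{"ext_tags": []string{"hot-path"}}

	fmt.Println(s.MergeNodeMeta("a", kv))       // true true
	fmt.Println(s.MergeNodeMeta("a", kv))       // false true
	fmt.Println(s.MergeNodeMeta("missing", kv)) // false false
}
```

A batch caller could map these return values onto the tool's counters: `found == false` goes to `not_found`, `changed == true` to `annotated`, and the remaining case to `unchanged`, without aborting the batch on any single item.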

Tests

  • Pure metaDelta unit (delta/idempotency/nil-handling).
  • In-memory + SQLite conformance (MergeNodeMeta): additive merge, deep-equal idempotency, found semantics for an unknown id, lazy Meta init, structural fields untouched.
  • MCP handler: round-trip, idempotent re-run, namespace prefixing (incl. no double-prefix), semantically_related edge add + dedup, default score, bad input, and a registration guard.

All green; go build ./..., go vet, and gofmt clean on the changed files.

Notes / scope

  • Cross-restart durability of annotations is intentionally out of scope for this v1: the merge mutates in-memory state and rides the gob+gzip shutdown snapshot (the SQLite backend re-persists the node). Happy to follow up with explicit per-call persistence if you'd prefer.

🤖 Generated with Claude Code

…nodes

Add a sanctioned, additive write-back path so a downstream tool (an LLM,
an analysis pass, an editor extension) can enrich existing graph nodes
with its own metadata — without re-indexing and without racing the
sharded in-memory store.

- graph.Store.MergeNodeMeta(id, kv) (changed, found): the only sanctioned
  way for a Store holder to mutate Node.Meta. Additive + idempotent
  (deep-equal delta via metaDelta), shard-locked, and never touches
  structural fields (id/kind/name/path/lines). Implemented for both the
  in-memory and SQLite backends and covered by a shared store-conformance
  case so they behave identically.
- mcp annotate_nodes tool (also served at POST /v1/tools/annotate_nodes
  through the shared registry): merges a free-form per-node `meta` map
  under a caller-chosen `namespace` (default "ext") so annotations can
  never shadow indexer-owned keys, and optionally adds idempotent
  semantically_related edges between node pairs.

Tests: pure metaDelta unit, in-memory + SQLite conformance, and MCP
handler round-trip / idempotency / namespace / edge / bad-input plus a
registration guard.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01XcQY8fFFyDQidwVC8dKHWh
@zzet

zzet commented Jun 23, 2026

Owner

@avfirsov thank you for submitting an interesting feature.

Overall, it looks good to me, with one exception:
The Meta attribute contains data that Gortex relies on, so external writes to it can be both useful and harmful. The prefix mitigates about half of the potential issues; however, uncontrolled growth of the JSON column can hurt performance, which can set off a chain of consequences, up to MCP tool penalisation during the session.

What about adding a separate JSON column to the table, persisting user-supplied metadata there, and loading it only when necessary (not on every node extraction)?

Second point, which users may not expect: an in-session reindex of a changed file evicts and re-parses that file's nodes, which drops both the merged ext_* Meta and the semantically_related edges. This happens on the SQLite backend too (the reindex INSERT OR REPLACE overwrites Meta). So annotations are lost on any edit to the annotated file, not just on restart.

Also, a minor DRY point: the SQLite path re-implements the metaDelta loop inline instead of reusing the pure helper. That is forced today because the helper is unexported in package graph; exporting it as graph.MetaDelta and calling it would remove the duplication the PR itself flags in a comment.
