Skip to content

fix(reflect): resolve learning-sidecar source_file against the project root#1558

Closed
TPAteeq wants to merge 1 commit into
Graphify-Labs:v8from
TPAteeq:fix-learning-sidecar-stale-root
Closed

fix(reflect): resolve learning-sidecar source_file against the project root#1558
TPAteeq wants to merge 1 commit into
Graphify-Labs:v8from
TPAteeq:fix-learning-sidecar-stale-root

Conversation

@TPAteeq

@TPAteeq TPAteeq commented Jun 30, 2026

Copy link
Copy Markdown
Contributor

The .graphify_learning.json work-memory overlay (#1441, 5779767) is great — but its
staleness / "re-verify" hint is broken in the standard layout: it fires on every
annotated node, even on a freshly built graph with zero edits.

Symptom

On a clean AST build (no source edits since the build), every node shows the re-verify tag:

Node: .RegisterBlueprint()
  Lesson: preferred source (start here) — 3 useful, score=2.99997647 [code changed since — re-verify]

It surfaces the same way on all four read surfaces — explain (__main__.py),
graph.html (export.py), GRAPH_REPORT.md (report.py), and serve.py. A hint that
fires on everything, always, carries no signal.

Root cause

source_file is stored relative to the project root ("no absolute paths in output",
#555/#932). But graph.json lives in the output dir (graphify-out/), one level below
the project root, and both _code_fingerprint (write) and _is_stale (read) resolved the
relative source_file against graph_path.parent (= graphify-out/):

graphify-out/admin/blueprint.go   # looked up here — never exists
admin/blueprint.go                # actually lives at the project root, one level up

So file_hash always raised, the stored code_fingerprint was always "", and the
read-side _is_stale saw the file as "vanished" → returned True for every node.

Fix

Add _source_root(graph_path) and use it on both sides. It resolves the directory that
source_file is relative to, in order:

  1. the absolute path recorded in <output-dir>/.graphify_root — the same marker the
    git hooks already trust, and the only correct answer when GRAPHIFY_OUT is an
    absolute / elsewhere override (Support for working with worktrees #686/bug: validate_graph_path() fallback and callflow_html.py project-root check hardcode "graphify-out", ignoring GRAPHIFY_OUT #1423);
  2. the parent of the output dir when graph.json sits inside it;
  3. graph.json's own directory (flat layouts; an absolute source_file resolves
    regardless of root).

graph.json itself is still never touched — this only changes which directory the
fingerprint is computed against.

After the fix, same node, same clean build:

Lesson: preferred source (start here) — 3 useful, score=2.99917354

…and a real edit to admin/blueprint.go correctly flips the node to stale.

Why the suite was green

test_loader_marks_entry_stale_when_source_file_changes set source_file to an
absolute path (str(tmp_path / "auth.py")), which skips the relative-path join — so
it exercised a path real graphs never take. Added two regression tests using a relative
source_file (the real-world case): one for the standard graphify-out/ layout, one
driven by the .graphify_root marker. Both fail on v8 (assert '' — empty fingerprint)
and pass with the fix.

Verification

  • tests/test_reflect.py58 passed (incl. the 2 new regression tests).
  • test_reflect + test_export + test_report + test_serve + test_explain_cli186 passed.
  • ruff check graphify/reflect.py — clean.
  • End-to-end on a real AST graph (1065 nodes): fingerprints now populate; the bogus
    re-verify tag is gone; an actual edit re-flags the node.

Scope

Touches only graphify/reflect.py (one helper + two one-line call-site changes) and
tests/test_reflect.py. No version bump, no CHANGELOG entry

…t root

The work-memory sidecar's staleness check resolved each node's relative
`source_file` against graph.json's own directory (`graphify-out/`) instead of
the project root one level up. That path never exists, so every
`code_fingerprint` was written empty and `load_learning_overlay` flagged every
node stale — surfacing "[code changed — re-verify]" on a freshly built graph
with zero edits, across explain/query/report/html. The hint fired on
everything, always, so it carried no signal.

Add `_source_root()` and use it on both the write side (fingerprint) and the
read side (staleness): prefer the absolute path recorded in
`<output-dir>/.graphify_root` (the marker the git hooks already trust, and the
only correct answer when `GRAPHIFY_OUT` is an absolute/elsewhere override),
else the parent of the output dir, else graph.json's own directory.

The existing stale test passed only because it used an *absolute* `source_file`,
which skips the relative-path join real graphs always take ("no absolute paths
in output", Graphify-Labs#555/Graphify-Labs#932). Add two regression tests with a relative `source_file`:
one for the standard layout, one driven by the `.graphify_root` marker.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@TPAteeq

TPAteeq commented Jun 30, 2026

Copy link
Copy Markdown
Contributor Author

Superseded by 00e00a0 on v8, which fixes the same work-memory staleness false-positive (relative source_file resolved against graphify-out/ instead of the project root, so every verdict was flagged 'code changed — re-verify' on a fresh build). That implementation is cleaner than this one — it hashes file content only (root-independent, so a committed sidecar stays valid across checkouts) and tries candidate roots in order rather than guessing from the directory name. Closing as already-fixed. Thanks!

@TPAteeq TPAteeq closed this Jun 30, 2026
safishamsi added a commit that referenced this pull request Jun 30, 2026
…1558)

Refines the staleness file resolution (00e00a0) by folding in the two
genuine merits of @TPAteeq's parallel fix (#1558), which independently
and correctly diagnosed the same root-mismatch bug:

- Layout-ordered candidates: try the layout-appropriate root FIRST (the
  graphify-out parent for the standard layout, graph.json's own dir for a
  flat layout) before the other. The prior order tried the grandparent
  first unconditionally, which in a flat layout (graph.json at the project
  root) could fingerprint a same-named file one directory up. Existence
  checking is kept on top, so a defeated name heuristic or a stale
  .graphify_root marker still falls through to the real file.
- Adds @TPAteeq's .graphify_root-marker-driven regression test, plus a
  flat-layout test that pins the ordering (editing the real file flips
  stale; editing the same-named decoy one dir up does not).

Co-Authored-By: tpateeq <mohammedateequddin399@gmail.com>
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@safishamsi

Copy link
Copy Markdown
Collaborator

Thank you — and great catch. You independently diagnosed this exactly right: source_file is relative to the project root, both the write and read sides resolved it against graph.json's dir (graphify-out/), so the file was never found, the fingerprint stayed empty, and every node got the re-verify tag on a clean build. You even spotted the same reason the suite stayed green (the existing staleness test used an absolute source_file, which skips the relative join).

The fix landed on v8 a few hours before this PR (00e00a0), so this would conflict — but your version had two things mine didn't, and I've folded both in (c865a3c, credited to you):

  1. Layout-ordered root selection. My first cut tried graph.json's grandparent before its own dir unconditionally, which in a flat layout (graph.json at the project root) could resolve to a same-named file one directory up. Your graphify-out-name heuristic picks the right root first; I took that ordering (kept on top of existence-checking, so a stale .graphify_root marker or a defeated heuristic still falls through to the real file).
  2. Your .graphify_root-marker regression test — I added it, plus a flat-layout test that pins the ordering (editing the real file flips stale; editing a same-named decoy one dir up does not).

So the bug is fixed and your contribution is in the tree. This is the second time you've driven this overlay feature (the original idea in #1542, now this) — genuinely appreciated. Closing as landed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants