feat: abort incremental update on failed or empty re-extraction (2/3) #1370
Closed
Const011 wants to merge 1 commit into
Author
Closing this PR, as the proposed mechanism has proven inefficient in the long run.
Context
Split from former #1326 (closed) per review feedback; part 2 of 3.
Reconciled with upstream #1344 (re-extracted files replace prior contribution in `build_merge`) and #1350 (`--no-cluster` incremental no-op). This PR does not change the `--no-cluster` raw-write contract.

Motivation
Incremental `extract` is the practical way to update a graph file-by-file. We hit a critical data-loss bug: after editing one markdown file and re-running incremental extract, nodes for that file could disappear entirely while the exit code stayed 0.

Problem: incremental update silently shrank the graph
Symptom: The changed file’s nodes vanished after incremental re-extract; unrelated files could survive.
Root cause: Incremental merge always pruned changed files before inserting the fresh extraction (`prune_sources = deleted + changed`). If the LLM chunk “completed” but produced zero nodes (invalid JSON, connection blip, truncation), the merge still ran: old nodes were pruned and nothing replaced them.

Upstream #1344 now auto-replaces nodes/edges for `source_file` values present in the re-extract batch; this PR adds the guardrail that prevents the merge when re-extraction itself failed or returned nothing.

Fix
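To make the failure mode and the guardrail concrete, here is a minimal sketch on a toy graph (a list of node dicts tagged with `source_file`). The names `prune`, `naive_merge`, `guarded_merge`, and `ReextractionFailed` are hypothetical stand-ins for illustration and are not graphify's actual functions; the real merge also handles edges, the manifest, and path aliasing.

```python
# Toy model only: every name below is a hypothetical stand-in, not graphify code.

class ReextractionFailed(RuntimeError):
    """Raised instead of merging when a changed file came back with no nodes."""


def prune(nodes, prune_sources):
    """Drop every node whose source_file is in prune_sources."""
    return [n for n in nodes if n["source_file"] not in prune_sources]


def naive_merge(nodes, deleted, changed, extraction):
    """Pre-fix behaviour: prune_sources = deleted + changed, unconditionally."""
    prune_sources = set(deleted) | set(changed)
    return prune(nodes, prune_sources) + list(extraction)


def guarded_merge(nodes, deleted, changed, extraction):
    """Refuse to merge if any changed file is absent from the extraction."""
    covered = {n["source_file"] for n in extraction}
    missing = sorted(set(changed) - covered)
    if missing:
        # Abort before touching the graph; the caller keeps the old one.
        raise ReextractionFailed(f"no nodes re-extracted for: {', '.join(missing)}")
    return naive_merge(nodes, deleted, changed, extraction)


graph = [
    {"id": "a1", "source_file": "a.md"},
    {"id": "b1", "source_file": "b.md"},
]

# The LLM chunk "completed" but returned nothing for a.md.
shrunk = naive_merge(graph, deleted=[], changed=["a.md"], extraction=[])

try:
    guarded_merge(graph, deleted=[], changed=["a.md"], extraction=[])
    aborted = False
except ReextractionFailed:
    aborted = True

# A successful re-extraction still replaces the old nodes as before.
updated = guarded_merge(
    graph, deleted=[], changed=["a.md"],
    extraction=[{"id": "a2", "source_file": "a.md"}],
)
```

In this sketch the naive path silently drops the nodes for `a.md`, while the guarded path raises and leaves `graph` untouched, which is the behaviour the PR title describes.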
`graphify/build.py`, path coverage helpers:
- `source_path_aliases`, `source_files_in_extraction`
- `path_covered_by_extraction`, `paths_missing_from_extraction`

`graphify/__main__.py` (incremental mode only):
- Do not write `graph.json` or update the manifest on failure; the existing graph is preserved.
- Build `prune_sources` from deleted files + changed code + successfully re-extracted semantic files only (via `path_covered_by_extraction`).

Stitch wiring (`stitch.py`, `_stitch_new_ids`, etc.) is intentionally not in this PR; see part 3.

Files
- `graphify/build.py`
- `graphify/__main__.py`: `_incremental_prune` (clustered incremental path)
- `tests/test_build.py`

Test plan
`pytest tests/test_build.py tests/test_extract.py`

Made with Cursor
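The `prune_sources` rule from the Fix section (deleted files + changed code + only those semantic files that were actually re-extracted) could look roughly like the sketch below. The helper names `toy_path_aliases` and `build_prune_sources` are hypothetical, and the aliasing rule (normalising separators and a leading `./`) is an assumption; the real helpers in `graphify/build.py` may have different signatures and cover more spellings.

```python
from pathlib import PurePosixPath

# Hypothetical stand-ins; not the signatures of the helpers in graphify/build.py.

def toy_path_aliases(path):
    """Spellings under which one file might appear (assumed rule: raw + normalised)."""
    raw = str(path).replace("\\", "/")
    return {raw, str(PurePosixPath(raw))}


def build_prune_sources(deleted, changed_code, changed_semantic, extraction):
    """Return (prune_sources, missing) for an incremental update.

    Semantic files are pruned only if the extraction contains at least one
    node whose source_file matches one of their aliases.
    """
    extracted = set()
    for node in extraction:
        extracted |= toy_path_aliases(node["source_file"])

    covered, missing = [], []
    for path in changed_semantic:
        (covered if toy_path_aliases(path) & extracted else missing).append(path)

    prune_sources = set(deleted) | set(changed_code) | set(covered)
    return prune_sources, missing


prune_sources, missing = build_prune_sources(
    deleted=["old.md"],
    changed_code=["src/x.py"],
    changed_semantic=["./docs/a.md", "docs/b.md"],
    extraction=[{"id": "a2", "source_file": "docs/a.md"}],
)
```

Here `./docs/a.md` is pruned because the extraction covers it under the alias `docs/a.md`, while `docs/b.md` ends up in `missing` and its existing nodes stay in the graph.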