
feat: abort incremental update on failed or empty re-extraction (2/3) #1370

Closed
Const011 wants to merge 1 commit into Graphify-Labs:v8 from Const011:feat/incremental-safety

Conversation

@Const011


Context

Split from former #1326 (closed) per review feedback — part 2 of 3.

Reconciled with upstream #1344 (re-extracted files replace prior contribution in build_merge) and #1350 (--no-cluster incremental no-op). This PR does not change the --no-cluster raw-write contract.

Motivation

Incremental extract is the practical way to update a graph file-by-file. We hit a critical data-loss bug: after editing one markdown file and re-running incremental extract, nodes for that file could disappear entirely while exit code stayed 0.

Problem: incremental update silently shrank the graph

Symptom: Changed file’s nodes vanished after incremental re-extract; unrelated files could survive.

Root cause: Incremental merge always pruned changed files before inserting fresh extraction (prune_sources = deleted + changed). If the LLM chunk “completed” but produced zero nodes (invalid JSON, connection blip, truncation), merge still ran: old nodes pruned, nothing replaced.
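The failure mode described above can be reproduced in a few lines. This is a minimal sketch, not the project's code: `naive_incremental_merge` and the node dictionaries are hypothetical stand-ins, and only the `prune_sources = deleted + changed` rule and the `source_file` field come from the description.

```python
def naive_incremental_merge(graph_nodes, changed, deleted, fresh_nodes):
    """Hypothetical model of the old behaviour: prune first, then insert."""
    # Every changed or deleted source is pruned unconditionally.
    prune_sources = set(deleted) | set(changed)
    kept = [n for n in graph_nodes if n["source_file"] not in prune_sources]
    # Whatever the re-extraction returned is appended, even if it is empty.
    return kept + list(fresh_nodes)


graph = [
    {"id": "a1", "source_file": "docs/a.md"},
    {"id": "a2", "source_file": "docs/a.md"},
    {"id": "b1", "source_file": "docs/b.md"},
]

# Re-extraction of docs/a.md "completed" but produced zero nodes.
merged = naive_incremental_merge(
    graph, changed=["docs/a.md"], deleted=[], fresh_nodes=[]
)
remaining_ids = [n["id"] for n in merged]
print(remaining_ids)  # ['b1']
```

Both nodes for `docs/a.md` are gone and nothing replaced them, while the unrelated `docs/b.md` node survives, which matches the reported symptom.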

Upstream #1344 now auto-replaces nodes/edges for source_file values present in the re-extract batch; this PR adds the guardrail that prevents merge when re-extraction itself failed or returned nothing.

Fix

graphify/build.py — path coverage helpers:

  • source_path_aliases, source_files_in_extraction
  • path_covered_by_extraction, paths_missing_from_extraction
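As an illustration of what such helpers might look like, here is a rough sketch. The four function names are taken from the list above, but their signatures, the alias rules, and the extraction shape (a dict with `nodes` and `edges` lists whose items carry `source_file`) are assumptions made for this example, not the contents of graphify/build.py.

```python
from pathlib import PurePosixPath


def source_path_aliases(path):
    """Assumed normalisation: forward slashes, no './' prefix, no '//'."""
    raw = str(path).replace("\\", "/")
    return {raw, str(PurePosixPath(raw))}


def source_files_in_extraction(extraction):
    """Collect every source_file (and its aliases) seen on nodes or edges."""
    found = set()
    items = list(extraction.get("nodes", [])) + list(extraction.get("edges", []))
    for item in items:
        source = item.get("source_file")
        if source:
            found |= source_path_aliases(source)
    return found


def path_covered_by_extraction(path, extraction):
    """True if the fresh extraction contains anything for this path."""
    return bool(source_path_aliases(path) & source_files_in_extraction(extraction))


def paths_missing_from_extraction(paths, extraction):
    """Return the paths that produced no nodes or edges, in input order."""
    return [p for p in paths if not path_covered_by_extraction(p, extraction)]


# Hypothetical extraction result: only docs/a.md produced output.
extraction = {"nodes": [{"id": "a1", "source_file": "./docs/a.md"}], "edges": []}
missing = paths_missing_from_extraction(["docs/a.md", "docs/b.md"], extraction)
print(missing)  # ['docs/b.md']
```

Comparing alias sets rather than raw strings lets `./docs/a.md` and `docs/a.md` count as the same file, so a purely cosmetic path difference is not mistaken for a missing extraction.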

graphify/__main__.py (incremental mode only):

  1. After semantic re-extract, abort with exit 1 if any uncached changed file has no nodes/edges in the fresh result, or if any chunk failed.
  2. Do not write graph.json or update the manifest on failure — existing graph is preserved.
  3. Build prune_sources from deleted files + changed code + successfully re-extracted semantic files only (via path_covered_by_extraction).
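The three steps above could be sketched roughly as follows. `plan_incremental_merge` and all of its parameters are hypothetical names for illustration; the real logic lives in graphify/__main__.py and may be structured differently. The sketch assumes the caller already knows which paths the fresh extraction covered.

```python
import sys


def plan_incremental_merge(deleted, changed_code, changed_semantic,
                           uncached_semantic, covered, failed_chunks):
    """Return (exit_code, prune_sources); prune_sources is None on abort."""
    # Step 1: abort if a chunk failed or an uncached changed file came back empty.
    missing = [p for p in uncached_semantic if p not in covered]
    if failed_chunks or missing:
        print(
            f"incremental update aborted: {failed_chunks} failed chunk(s), "
            f"no nodes/edges for {missing}",
            file=sys.stderr,
        )
        # Step 2: the caller must not write graph.json or the manifest.
        return 1, None
    # Step 3: prune only sources that are deleted, changed code,
    # or semantic files that were actually re-extracted.
    reextracted = {p for p in changed_semantic if p in covered}
    return 0, set(deleted) | set(changed_code) | reextracted


ok_code, ok_prune = plan_incremental_merge(
    deleted=["old.md"], changed_code=["x.py"],
    changed_semantic=["a.md", "b.md"], uncached_semantic=["a.md"],
    covered={"a.md"}, failed_chunks=0,
)
bad_code, bad_prune = plan_incremental_merge(
    deleted=[], changed_code=[],
    changed_semantic=["a.md"], uncached_semantic=["a.md"],
    covered=set(), failed_chunks=0,
)
```

In the first call `b.md` is treated as changed but cached, so it is neither required in the fresh result nor pruned. In the second call the empty result for `a.md` yields exit code 1 and no prune set, leaving the existing graph untouched.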

Stitch wiring (stitch.py, _stitch_new_ids, etc.) is intentionally not in this PR — see part 3.

Files

| File | Change |
| --- | --- |
| graphify/build.py | Path coverage helpers for incremental safety |
| graphify/__main__.py | Abort + _incremental_prune (clustered incremental path) |
| tests/test_build.py | Import formatting |

Test plan

  • pytest tests/test_build.py tests/test_extract.py
  • Upstream CI on Linux

Made with Cursor

@Const011

Author

Part 3 (cross-file stitch) is in #1371 — stacked on this branch (feat/incremental-safety). Please merge this before #1371.

@Const011

Author

Closing this PR, as the proposed mechanics have proven inefficient in the long run.

@Const011 Const011 closed this Jun 22, 2026