graphify update never prunes a deleted import's edge → stale edges → false structural findings
Component: graphify (PyPI graphifyy) incremental update
Affected versions: confirmed on 0.8.51 (latest) and 0.8.44
Severity: correctness; produces silently wrong graphs that drive false analysis (e.g. phantom circular-dependency reports)
Platform observed: Windows 11, Python 3.13 (behavior is in pure graph-merge logic, not platform-specific)
Summary
When an import (or any edge-producing reference) is deleted from a file, graphify update re-extracts the file and writes a new graph, but the old edge is carried forward — it is never pruned. graphify update --force does not fix it either; only a full clean rebuild (delete graph.json, then update) removes the stale edge.
Because the edge survives, downstream analysis is wrong. In our case one file used to import another; the import was later refactored out (replaced by a registration/callback pattern), but the stale edge kept the dependency graph reporting a circular dependency that no longer existed. This went on for months, until a clean rebuild was forced.
Minimal reproduction
mkdir -p repro/src && cd repro
printf "import { foo } from './b';\nexport function useA(): void { foo(); }\n" > src/a.ts
printf "export function foo(): void {}\n" > src/b.ts
graphify update .   # builds graph: a.ts -> b.ts import edge exists
printf "export function useA(): void {}\n" > src/a.ts   # DELETE the import
graphify update .   # re-extract + rebuild
# BUG: graphify-out/graph.json still contains the a.ts -> b.ts import edge.
# `graphify update . --force` does not remove it either.
# Only `rm -f graphify-out/graph.json && graphify update .` removes it.
Check programmatically:
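import json

# Load the graph written by `graphify update`.
with open("graphify-out/graph.json") as fh:
    g = json.load(fh)

# Map node id -> source_file, used when an edge carries no source_file of its own.
id2f = {n["id"]: n.get("source_file") for n in g["nodes"]}

# Collect import edges that originate in a.ts and point at b.ts.
print([l for l in g["links"]
       if l.get("relation") in ("imports", "imports_from")
       and "a.ts" in str(l.get("source_file") or id2f.get(l["source"]))
       and "b.ts" in str(id2f.get(l["target"]))])
# Expected after deleting the import: []   Actual: two stale edges.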
Expected vs actual
Expected: after re-extraction, a re-extracted file's removed edges are gone from the graph.
Actual: the removed edges persist. graphify update prints e.g. [graphify watch] Rebuilt: 4 nodes, 5 edges — the 5 edges still include the 2 stale a.ts -> b.ts import edges.
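The surviving edge clearly belongs to the changed file (note source_file):
{"relation": "imports_from", "context": "import", "source_file": "src/a.ts", "source_location": "L1", "source": "src_a", "target": "src_b"}
Root cause (with code references, v0.8.51)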
flowchart TD
U["graphify update ."] --> RC["watch.py _rebuild_code<br/>(prints 'Rebuilt: N edges')"]
RC --> EX["extract re-extracted files"]
EX --> WR["write graph.json"]
WR --> STALE["deleted import's edge survives — BUG"]
BM["build.py build_merge<br/>(518-533: drop existing nodes/edges<br/>whose source_file was re-extracted)"]
RC -. "does NOT call / does NOT replicate" .-> BM
BM -. "recommended: apply this prune here too" .-> RC
The correct stale-edge prevention already exists in build.py::build_merge:
build.py:480-483 — "Re-extracted files REPLACE their prior contribution: any source_file present in new_chunks is dropped from the loaded graph before merging, so a changed file's stale nodes/edges don't accumulate."
…implemented at build.py:518-533: collect new_sources (every source_file re-extracted) and drop every existing node and edge whose source_file is in that set.
But the graphify update CLI path does not use build_merge. It goes through watch.py (_rebuild_code, the path that prints [graphify watch] Rebuilt: …). That module contains no reference to build_merge, new_sources, or the source_file-replacement logic — it builds the graph from extract(...) output and writes it without dropping a re-extracted file's prior edges. So the very protection build_merge documents is bypassed on the most common path (graphify update, which the README and watch hooks recommend running after edits).
Additional notes:
--force only relaxes the node-shrink guard (build.py:588-597, "refuse to shrink graph … pass prune_sources"); it does not add edge pruning, so it does not help.
The same_topology short-circuit in watch.py:740-759 is not the cause here — in the repro the graph is rebuilt and rewritten; the rewritten graph simply still contains the stale edges.
Recommended fix
Apply build_merge's source_file replacement on the incremental update path too. Either:
Route the update/_rebuild_code merge through build_merge() (preferred — single source of truth), or
Replicate build.py:518-533 in watch.py's rebuild: before/while merging the re-extracted chunks into the existing graph, drop every existing node and edge whose source_file is among the re-extracted files.
Edges reliably carry their own source_file (e.g. the stale edge above has "source_file": "src/a.ts"), so the edge-side prune is straightforward and symmetric with the node-side prune. This makes graphify update self-correcting and removes the need for users to periodically rm -f graph.json.
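The replacement rule described above can be sketched in a few lines. This is only an illustration of the idea, not graphify's actual code: the function name merge_with_replacement, the Graph alias, and the in-memory sample data are hypothetical, and the sketch assumes every node and edge carries a source_file key, as the stale edge shown earlier does.

```python
from typing import Any

# Hypothetical in-memory shape mirroring graph.json: {"nodes": [...], "links": [...]}
Graph = dict[str, list[dict[str, Any]]]


def merge_with_replacement(existing: Graph, new_chunks: Graph) -> Graph:
    """Merge re-extracted chunks into an existing graph (illustrative only).

    Anything the re-extracted files contributed earlier is dropped first,
    so a changed file replaces its prior nodes and edges instead of adding to them.
    """
    # Every source_file that was re-extracted in this update.
    new_sources = {
        item["source_file"]
        for key in ("nodes", "links")
        for item in new_chunks[key]
        if item.get("source_file")
    }
    # Node-side and edge-side prune are symmetric: both filter on source_file.
    kept_nodes = [n for n in existing["nodes"] if n.get("source_file") not in new_sources]
    kept_links = [e for e in existing["links"] if e.get("source_file") not in new_sources]
    return {
        "nodes": kept_nodes + new_chunks["nodes"],
        "links": kept_links + new_chunks["links"],
    }


# Existing graph: a.ts still imports b.ts.
existing: Graph = {
    "nodes": [
        {"id": "src_a", "source_file": "src/a.ts"},
        {"id": "src_b", "source_file": "src/b.ts"},
    ],
    "links": [
        {"relation": "imports_from", "source_file": "src/a.ts",
         "source": "src_a", "target": "src_b"},
    ],
}

# Re-extraction of a.ts after the import was deleted: a node, but no edges.
new_chunks: Graph = {
    "nodes": [{"id": "src_a", "source_file": "src/a.ts"}],
    "links": [],
}

merged = merge_with_replacement(existing, new_chunks)
print(merged["links"])
```

In this sketch the untouched file (src/b.ts) keeps its node, while everything previously attributed to src/a.ts is replaced by the fresh extraction, so the deleted import leaves no edge behind.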
Impact / why it matters
graphify update is the documented "keep the graph current" command and is commonly wired into commit hooks / watch. Every deleted import (or other removed reference) leaves a ghost edge with no warning, which silently corrupts structural analyses — circular-dependency detection, hub/coupling metrics, impact maps — until someone happens to do a full clean rebuild. The graph drifts further from reality the longer incremental updates run.
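To make the phantom-cycle effect concrete, here is a small, self-contained sketch of a depth-first cycle check over import edges. It is not part of graphify; the has_cycle helper is hypothetical, and the src_b -> src_a edge is an assumed example standing in for a dependency that still exists after the refactor.

```python
def has_cycle(edges: list[tuple[str, str]]) -> bool:
    """Return True if the directed graph given by (source, target) edges has a cycle."""
    graph: dict[str, list[str]] = {}
    for src, dst in edges:
        graph.setdefault(src, []).append(dst)
        graph.setdefault(dst, [])

    WHITE, GRAY, BLACK = 0, 1, 2  # unvisited, on the current DFS path, finished
    color = dict.fromkeys(graph, WHITE)

    def visit(node: str) -> bool:
        color[node] = GRAY
        for nxt in graph[node]:
            if color[nxt] == GRAY:  # back edge to the current path: cycle
                return True
            if color[nxt] == WHITE and visit(nxt):
                return True
        color[node] = BLACK
        return False

    return any(color[n] == WHITE and visit(n) for n in graph)


# Assumed current reality: only b depends on a.
current_edges = [("src_b", "src_a")]

# What the graph holds when the deleted a -> b import edge is never pruned.
stale_edges = current_edges + [("src_a", "src_b")]

print(has_cycle(current_edges))  # False
print(has_cycle(stale_edges))    # True
```

A single ghost edge is enough to flip the result, which is why the stale graph can keep reporting a circular dependency that the source code no longer contains.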