Bug
graphify.extract (AST mode) generates node IDs from {filename_stem}_{symbol}. In monorepos where many folders contain a same-named file (e.g. Next.js App Router with one action-bar/index.tsx per feature, or one lib/atom.ts per feature), every function with the same name across those files collapses into a single graph node. The union of their edges then creates phantom cross-module "bridges" that don't exist in source.
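The collision can be illustrated with a minimal sketch. This is not graphify's actual code: `short_id` and `_slug` are hypothetical helpers, the feature names `billing` and `reports` are made up, and the assumption that only the parent folder and file stem feed the ID was chosen so the output matches the node ID reported in this issue.

```python
import re
from pathlib import PurePosixPath


def _slug(text: str) -> str:
    """Lowercase and collapse every non-alphanumeric run into one underscore."""
    return re.sub(r"[^a-z0-9]+", "_", text.lower()).strip("_")


def short_id(rel_path: str, symbol: str) -> str:
    # Assumption: the ID ignores everything above the file's parent folder.
    p = PurePosixPath(rel_path)
    return _slug(f"{p.parent.name}_{p.stem}_{symbol}")


# Two unrelated features (hypothetical names), same file layout.
paths = [
    "app/(innerPages)/billing/action-bar/index.tsx",
    "app/(innerPages)/reports/action-bar/index.tsx",
]

# Both files map to the same ID, so their nodes and edges would be merged.
ids = {short_id(p, "ActionBar") for p in paths}
print(ids)
```

Because the feature folder never reaches the ID, any number of `action-bar/index.tsx` files produce one node.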
Repro
In a Next.js repo with structure:
app/(innerPages)/<feature>/action-bar/index.tsx # each exports `function ActionBar()`
app/(innerPages)/<feature>/lib/atom.ts # each exports `useFeatureAtom`
Run graphify . on a code-only corpus.
Result: a single node action_bar_index_actionbar accumulates contains edges from ~16 distinct files and calls edges to atoms from ~10 unrelated features. god_nodes() reports it as the most connected abstraction. surprising_connections() flags fake cross-feature bridges.
Expected
IDs should be unique per source file, e.g. derived from the full relative path (innerpages_<feature>_action_bar_index_actionbar) or filepath hash, so functions with identical names in different folders remain distinct nodes.
Evidence
In one 552-file scan, 16 action-bar/index.tsx files merged into 1 node showing 30 edges and 0.167 betweenness, all of it an artifact of the ID collision.
Suggested fix
Change ID generation in the AST extractor to use the relative source path (slashes → underscores) instead of just Path(...).stem. This is also what the Step 3 prompt instructs semantic subagents to do, so the AST extractor would just be matching its sibling.
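A rough sketch of what path-based IDs could look like, assuming paths are given relative to the scan root (here taken to be `app/`). `path_based_id` and `_slug` are hypothetical names, not graphify functions, and the feature names are invented for illustration.

```python
import re
from pathlib import PurePosixPath


def _slug(text: str) -> str:
    """Lowercase and collapse every non-alphanumeric run into one underscore."""
    return re.sub(r"[^a-z0-9]+", "_", text.lower()).strip("_")


def path_based_id(rel_path: str, symbol: str) -> str:
    # Use the whole relative path (minus the extension), not just the stem.
    p = PurePosixPath(rel_path).with_suffix("")
    return _slug(f"{p}_{symbol}")


# Paths relative to the scan root; feature names are hypothetical.
paths = [
    "(innerPages)/billing/action-bar/index.tsx",
    "(innerPages)/reports/action-bar/index.tsx",
]

ids = [path_based_id(p, "ActionBar") for p in paths]
for node_id in ids:
    print(node_id)
```

Slugging can still map two different paths to one ID (for example `a-b` and `a_b`), which is where the filepath-hash variant mentioned under Expected might be the safer choice.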
Version
graphifyy from PyPI, Python 3.x via pipx.