Summary
graphify export obsidian aborts with KeyError in to_obsidian when a community's member list contains an id that has no backing node in G. The exporter assumes every clustered community member is a key in G.nodes (and in node_filename), but at least one synthesized member id ('agents_doc' in my run) is not — so the whole vault export crashes instead of skipping the dangling member.
Environment
- graphify(y) 0.8.36
- networkx 3.6.1
- Python 3.13.13
- backend: claude-cli (semantic), macOS
Repro
On a repo whose graph clusters into many communities (mine: ~10k nodes, 1206 communities after cluster-only), where multiple documents normalize to the same stem (e.g. several *-AGENTS.md / doc files across different directories):
graphify extract . --backend claude-cli
graphify cluster-only "$(pwd)" --backend=claude-cli # succeeds: "Done - 1206 communities. GRAPH_REPORT.md and graph.json updated."
graphify export obsidian --dir "$VAULT" # crashes
Traceback
Traceback (most recent call last):
  File ".../bin/graphify", line 10, in <module>
    sys.exit(main())
  File ".../graphify/__main__.py", line 3712, in main
    n = _to_obsidian(G, communities, str(obsidian_dir),
                     community_labels=labels or None, cohesion=cohesion or None)
  File ".../graphify/export.py", line 1010, in to_obsidian
    for node_id in sorted(members, key=lambda n: G.nodes[n].get("label", n)):
  File ".../graphify/export.py", line 1010, in <lambda>
    for node_id in sorted(members, key=lambda n: G.nodes[n].get("label", n)):
  File ".../networkx/classes/reportviews.py", line 196, in __getitem__
    return self._nodes[n]
KeyError: 'agents_doc'
Root cause
In to_obsidian (export.py:1010), the ## Members section iterates a community's members and dereferences each via G.nodes[n] (in the sort key) and node_filename[node_id] on the next line:
for node_id in sorted(members, key=lambda n: G.nodes[n].get("label", n)):
    data = G.nodes[node_id]
    node_label = node_filename[node_id]
    ...
members can contain an id that is not a node in G. Evidence from my run: the offending id "agents_doc" occurs 0 times in graphify-out/graph.json but does appear in graphify-out/.graphify_analysis.json (the sidecar the exporter draws community/label data from). The graph does contain real nodes whose ids end in _agents_doc (e.g. t3x_rte_ckeditor_image_classes_agents_doc and other *-AGENTS.md doc nodes from different directories), so this looks like a normalized/collapsed concept id that ends up in community membership without being materialized as a node in G. Either way, the export layer trusts a 1:1 member → G.nodes mapping that does not hold.
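A check along the following lines could confirm which member ids are dangling before running the export. This is only a sketch. The JSON layouts of graph.json and .graphify_analysis.json are not shown here, so the helper (a hypothetical name, find_dangling_members) works on data that has already been loaded: a collection of node ids and a mapping from community id to member ids. The sample values are illustrative, modeled on the ids mentioned above; "readme_doc" is made up.

```python
def find_dangling_members(node_ids, communities):
    """Return {community_id: [member ids that are not graph nodes]}.

    node_ids: iterable of ids that exist as nodes in the graph.
    communities: mapping of community id -> iterable of member ids.
    Both shapes are assumptions; adapt the loading code to the real files.
    """
    known = set(node_ids)
    dangling = {}
    for cid, members in communities.items():
        missing = [m for m in members if m not in known]
        if missing:
            dangling[cid] = missing
    return dangling


# Illustrative data only, not taken from a real run.
node_ids = ["t3x_rte_ckeditor_image_classes_agents_doc", "readme_doc"]
communities = {
    0: ["t3x_rte_ckeditor_image_classes_agents_doc", "agents_doc"],
    1: ["readme_doc"],
}
print(find_dangling_members(node_ids, communities))
```

An empty result would mean every member maps to a real node; any entry points at a community that would trip the KeyError.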
Suggested fix
Make the member iteration defensive so one dangling id can't abort the entire vault export:
members = [m for m in members if m in G.nodes and m in node_filename]
for node_id in sorted(members, key=lambda n: G.nodes[n].get("label", n)):
    ...
(optionally log.debug the skipped ids). The deeper fix would be upstream — ensure clustering/label assignment only emits real node ids as community members — but guarding the exporter prevents a single synthesized member from taking down the whole export obsidian run.
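For illustration, the guard plus the optional debug logging might look like the sketch below. The helper name usable_members and the logger setup are hypothetical and not part of graphify. Plain dicts stand in for G.nodes and node_filename so the snippet runs without networkx; a networkx node view supports the same `in` and `[]` operations.

```python
import logging

log = logging.getLogger(__name__)


def usable_members(members, nodes, node_filename):
    """Split members into (sorted usable ids, skipped ids).

    nodes: mapping of node id -> attribute dict (G.nodes behaves like this).
    node_filename: mapping of node id -> note filename.
    """
    kept, skipped = [], []
    for m in members:
        if m in nodes and m in node_filename:
            kept.append(m)
        else:
            skipped.append(m)
    if skipped:
        log.debug("skipping %d dangling member id(s): %s", len(skipped), skipped)
    # Same sort key as the exporter: the label if present, else the id itself.
    kept.sort(key=lambda n: nodes[n].get("label", n))
    return kept, skipped


# Stand-in data for this sketch; "agents_doc" has no backing node.
nodes = {"b_doc": {"label": "Beta"}, "a_doc": {"label": "Alpha"}}
node_filename = {"b_doc": "Beta", "a_doc": "Alpha"}
kept, skipped = usable_members(["b_doc", "agents_doc", "a_doc"], nodes, node_filename)
```

Returning the skipped ids alongside the kept ones would also let the exporter report how many members were dropped per community, if that is useful.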
Workaround
Excluding the directory whose docs produced the colliding stem (via .graphifyignore) and re-running extract removes the synthesized member and lets the export complete.