Bug: graphify cluster-only doesn't leave .graphify_analysis.json behind, so a subsequent graphify export html silently skips with "Single community"
Version: graphifyy 0.8.13 (installed via uv tool install)
Steps to reproduce
- Run the full pipeline once on a corpus large enough to exceed the HTML viz node limit (>5000 nodes), so graphify-out/graph.json and graphify-out/GRAPH_REPORT.md exist.
- Manually edit graphify-out/graph.json (e.g. merge two duplicate nodes) so the graph no longer matches the last full extraction.
- Run graphify cluster-only . to re-cluster the edited graph. It reports success (e.g. Done — 387 communities. GRAPH_REPORT.md and graph.json updated.) and, in my case, also printed Skipped graph.html: ... too large for HTML viz (limit: 5000).
- Run graphify export html to get the aggregated community view for the oversized graph.
Expected
graphify export html builds the aggregated community meta-graph (one node per community, as it does when run right after the initial full pipeline) and writes graph.html.
Actual
Graph has 5558 nodes (above 5000 limit). Building aggregated community view...
Single community - aggregated view not useful. Skipping graph.html.
There is no error and no indication that anything is wrong: the command behaves as if the graph had zero or one community and silently skips writing graph.html.
Root cause
export html's CLI handler reads community assignments from a specific file, not from graph.json's own per-node community attribute:
# graphify/__main__.py, ~line 2119
analysis_path = Path(_GRAPHIFY_OUT) / ".graphify_analysis.json"
...
# ~line 2240-2242
communities: dict[int, list[str]] = {}
if analysis_path.exists():
    _an = json.loads(analysis_path.read_text(encoding="utf-8"))
    communities = {int(k): v for k, v in _an.get("communities", {}).items()}
graphify cluster-only never writes .graphify_analysis.json. It computes communities internally and writes only graph.json (with per-node community fields) and GRAPH_REPORT.md, then presumably cleans up its own intermediates. So after a cluster-only run, .graphify_analysis.json doesn't exist, communities ends up {}, and to_html's aggregation path (graphify/export.py, to_html(), the node_limit is not None branch) iterates zero communities:
meta = _nx.Graph()
for cid, members in communities.items():  # communities == {} here
    meta.add_node(...)
...
if meta.number_of_nodes() <= 1:
    print("Single community - aggregated view not useful. Skipping graph.html.")
    return
The meta-graph therefore has zero nodes, which satisfies the <= 1 check and triggers the misleading "Single community" message on a graph that actually has hundreds of real communities (still recoverable from graph.json's per-node community field).
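To make the conflation concrete, here is a minimal sketch of a check that separates "no community data was loaded" from "exactly one community". It is not graphify's code: the function name aggregation_status and its return values are hypothetical, and it assumes (as the expected behaviour above suggests) that the meta-graph gets one node per entry in communities.

```python
def aggregation_status(communities: dict[int, list[str]]) -> str:
    """Classify the community mapping before any meta-graph is built.

    Hypothetical helper, not part of graphify. It assumes one meta-node
    per community, so len(communities) equals the meta-graph node count.
    """
    if not communities:
        # Nothing was loaded at all, e.g. .graphify_analysis.json is missing.
        return "missing"
    if len(communities) == 1:
        # A genuine single-community graph: the aggregated view adds nothing.
        return "single"
    return "ok"
```

With a distinction like this, the CLI could report the missing analysis file instead of printing the "Single community" message for an empty mapping.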
Suggested fix
Either:
- Have cluster-only also (re)write .graphify_analysis.json (communities/cohesion/gods/surprises) alongside graph.json and GRAPH_REPORT.md, matching what the full pipeline leaves behind at the equivalent step, or
- Have export html's CLI handler fall back to deriving communities from graph.json's per-node community field when .graphify_analysis.json is missing, rather than defaulting to {}.
Either fix would also make the "Single community" message accurate again — right now it fires even when the graph clearly has many communities, which is confusing to debug from the CLI output alone (I only found the cause by reading graphify/export.py and graphify/__main__.py source directly).
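The second option could look roughly like the sketch below. It is an illustration under assumptions, not a patch against the real handler: load_communities and communities_from_graph are hypothetical names, and the layout of graph.json (a top-level "nodes" list whose entries carry "id" and "community") is assumed rather than confirmed from the source.

```python
import json
from collections import defaultdict
from pathlib import Path


def communities_from_graph(graph: dict) -> dict[int, list[str]]:
    """Group node ids by their per-node "community" value.

    Assumes a node-link style dict with a "nodes" list; this layout is an
    assumption about graph.json, not something taken from graphify's code.
    """
    grouped: dict[int, list[str]] = defaultdict(list)
    for node in graph.get("nodes", []):
        cid = node.get("community")
        if cid is None:
            continue  # node without a community assignment is skipped
        grouped[int(cid)].append(str(node["id"]))
    return dict(grouped)


def load_communities(out_dir: Path) -> dict[int, list[str]]:
    """Prefer .graphify_analysis.json, fall back to graph.json, else {}."""
    analysis_path = out_dir / ".graphify_analysis.json"
    if analysis_path.exists():
        data = json.loads(analysis_path.read_text(encoding="utf-8"))
        return {int(k): v for k, v in data.get("communities", {}).items()}

    graph_path = out_dir / "graph.json"
    if graph_path.exists():
        graph = json.loads(graph_path.read_text(encoding="utf-8"))
        return communities_from_graph(graph)

    return {}
```

Keeping the analysis file as the first choice preserves the current behaviour after a full pipeline run; the fallback only matters when that file is absent.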
Workaround
Manually reconstruct and write graphify-out/.graphify_analysis.json with {"communities": ..., "cohesion": ..., "gods": ..., "surprises": ...} (derived from graph.json's per-node community field) before calling graphify export html.
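A rough sketch of that workaround as a script, for anyone hitting the same problem. Several things here are assumptions: the function name write_analysis_stub is made up, graph.json is assumed to have a top-level "nodes" list with "id" and "community" per node, and the empty values for cohesion, gods and surprises are placeholders, since only the "communities" key is read by the handler code quoted above and the real shapes of the other keys are not shown in this report.

```python
import json
from pathlib import Path


def write_analysis_stub(out_dir: Path) -> Path:
    """Rebuild a minimal .graphify_analysis.json from graph.json.

    Hypothetical helper. Only "communities" is derived from real data;
    the other keys are empty placeholders (an assumption).
    """
    graph = json.loads((out_dir / "graph.json").read_text(encoding="utf-8"))

    # JSON object keys are strings; the quoted handler converts them with int(k).
    communities: dict[str, list[str]] = {}
    for node in graph.get("nodes", []):
        cid = node.get("community")
        if cid is not None:
            communities.setdefault(str(int(cid)), []).append(str(node["id"]))

    analysis = {
        "communities": communities,
        "cohesion": {},   # placeholder, real shape unknown
        "gods": [],       # placeholder, real shape unknown
        "surprises": [],  # placeholder, real shape unknown
    }
    target = out_dir / ".graphify_analysis.json"
    target.write_text(json.dumps(analysis, indent=2), encoding="utf-8")
    return target
```

Calling write_analysis_stub(Path("graphify-out")) after graphify cluster-only and before graphify export html is the intended order; check the generated file against one produced by a full pipeline run if the placeholders cause trouble.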