fix: persist cluster-only analysis sidecar #1617

Closed
sanmaxdev wants to merge 1 commit into Graphify-Labs:v8 from sanmaxdev:fix/cluster-only-analysis-sidecar

@sanmaxdev

Contributor

Summary

  • Persist .graphify_analysis.json during graphify cluster-only / graphify label runs.
  • Add a regression test that verifies the refreshed sidecar matches the communities written to graph.json.
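
A minimal sketch of the kind of check the regression test performs, under stated assumptions: it supposes that `graph.json` stores a `community` attribute on each node and that `.graphify_analysis.json` stores a `communities` mapping from community id to node ids. The real schemas may differ, and `communities_from_graph` and `sidecar_matches_graph` are hypothetical helpers, not part of graphify.

```python
import json
import tempfile
from pathlib import Path


def communities_from_graph(graph: dict) -> dict[str, list[str]]:
    """Group node ids by their (assumed) per-node `community` attribute."""
    grouped: dict[str, list[str]] = {}
    for node in graph.get("nodes", []):
        grouped.setdefault(str(node["community"]), []).append(node["id"])
    return {cid: sorted(ids) for cid, ids in grouped.items()}


def sidecar_matches_graph(out_dir: Path) -> bool:
    """Return True when the sidecar's communities equal those in graph.json."""
    graph = json.loads((out_dir / "graph.json").read_text(encoding="utf-8"))
    sidecar = json.loads(
        (out_dir / ".graphify_analysis.json").read_text(encoding="utf-8")
    )
    expected = communities_from_graph(graph)
    # Normalise ids to strings and sort members so ordering cannot cause a mismatch.
    actual = {
        str(cid): sorted(ids)
        for cid, ids in sidecar.get("communities", {}).items()
    }
    return expected == actual
```

A stale sidecar left over from an earlier run would fail this comparison, which is the situation the PR is meant to prevent.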

Testing

  • uv run --frozen pytest tests/test_cli_export.py -q --tb=short
  • uv run --frozen pytest tests/test_cli_export.py tests/test_serve.py -q --tb=short
  • uv run --frozen pytest tests/ -q --tb=short
  • uv run --frozen ruff check graphify/__main__.py tests/test_cli_export.py
  • uv run --frozen python -m tools.skillgen --check
  • uv run --frozen python -m tools.skillgen --audit-coverage
  • uv run --frozen python -m tools.skillgen --schema-singleton
  • uv run --frozen python -m tools.skillgen --monolith-roundtrip
  • uv run --frozen python -m tools.skillgen --always-on-roundtrip
  • uv run --frozen graphify --help
  • uv run --frozen graphify install
  • git diff --check

Closes #1610

safishamsi added a commit that referenced this pull request Jul 2, 2026
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@safishamsi

Collaborator

Merged into v8 as 62f49ba (your authorship) — this closes #1610. Verified cluster-only now writes .graphify_analysis.json with communities/cohesion/gods/surprises/questions, matching the full extract path's shape, so a later export html no longer reports "Single community". Full suite 2841 green. Ships next release. Thanks!

@safishamsi safishamsi closed this Jul 2, 2026
nokternol added a commit to nokternol/graphify that referenced this pull request Jul 4, 2026
…hify-Labs#1617)

_score_nodes' "joined" full-query tier exists so a multi-word query that
equals (or prefixes) a whole multi-word label wins outright, since no single
token in a bag-of-words sum could otherwise equal that label. For a
single-token probe, this degenerates: `joined` equals the lone term, and any
node whose *tokenized* label (punctuation stripped) happens to reduce to
exactly that one word - e.g. a bare method call like `.search()`, whose only
word character content is "search" - gets promoted to the EXACT tier via the
`label_tokens` comparison, even though the same node correctly fails the
per-token loop's own raw `t == norm_label or t == bare_label` exact check a
few lines below (raw ".search" != "search").
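
The mismatch between the tokenized and raw comparisons can be illustrated with a small sketch. The tokenizer here is a hypothetical stand-in (runs of word characters), and the derivation of `norm_label` and `bare_label` is an assumption made for illustration; graphify's actual normalisation may differ.

```python
import re


def tokenize_label(label: str) -> list[str]:
    """Hypothetical tokenizer: lowercase, keep only runs of word characters."""
    return re.findall(r"\w+", label.lower())


label = ".search()"
term = "search"

# Assumed normalisation: lowercase the label, then drop a trailing "()".
norm_label = label.lower()
bare_label = norm_label[:-2] if norm_label.endswith("()") else norm_label

# Single-token probe: the joined query is just the lone term.
joined = " ".join([term])
label_tokens = tokenize_label(label)

# Tokenized comparison: punctuation is gone, so the label reduces to the term.
tokenized_hit = " ".join(label_tokens) == joined

# Raw comparison, as in the per-token loop: the leading "." prevents a match.
raw_hit = term == norm_label or term == bare_label
```

Under these assumptions `tokenized_hit` is true while `raw_hit` is false, which is the disagreement between the two checks described above.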

This matters most inside `_pick_seeds`' per-term seed-diversity guarantee
(Graphify-Labs#1445), which probes each distinct query term in isolation via
`_score_nodes(G, [term])`: a short, same-named method repeated across
several unrelated files (three metadata providers each define their own
`.search()`) can win that single-term probe's EXACT tier outright and starve
out the actually-relevant multi-word file, which only reaches the PREFIX
tier for the same bare word. Reproduced live: `graphify query "how does a
change in provider settings affect what shows up in search results"` seeded
on one provider's unrelated `.search()` method and never surfaced
`search.handler.ts` at all, despite that file scoring far higher (494 vs 7)
under the query's full multi-word sentence - the bug is specific to the
single-term isolation probe, not the combined-query scoring path.

Fix: gate the joined-tier block on `len(norm_terms) > 1`. A single-token
probe has no "multi-word phrase vs per-token bag-of-words" distinction to
make in the first place - the per-token loop directly below already fully
and correctly handles single-term exact/prefix/substring matching via raw,
non-tokenized label comparison, so the bonus is both redundant and (as
shown) actively harmful when only one token is being scored. The combined
multi-word query path is unchanged, since len(norm_terms) > 1 there.
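
The effect of the gate can be sketched with a toy scorer. This is not graphify's `_score_nodes`: the tier weights, the tokenizer, and the `gate_joined` switch are hypothetical and exist only to compare the behaviour before and after the fix.

```python
import re

# Illustrative tier weights; graphify's real values are not shown in this commit.
EXACT, PREFIX, SUBSTRING = 100, 10, 1


def _tokens(label: str) -> list[str]:
    """Hypothetical tokenizer: lowercase runs of word characters."""
    return re.findall(r"\w+", label.lower())


def score_label(label: str, terms: list[str], gate_joined: bool = True) -> int:
    norm_terms = [t.lower() for t in terms]
    norm_label = label.lower()
    bare_label = norm_label[:-2] if norm_label.endswith("()") else norm_label
    score = 0

    # Joined full-query tier. With the gate on, it only runs for multi-word queries.
    if not gate_joined or len(norm_terms) > 1:
        joined = " ".join(norm_terms)
        label_joined = " ".join(_tokens(label))
        if label_joined == joined:
            score += EXACT
        elif label_joined.startswith(joined):
            score += PREFIX

    # Per-token loop: raw, non-tokenized comparison against the label.
    for t in norm_terms:
        if t == norm_label or t == bare_label:
            score += EXACT
        elif norm_label.startswith(t) or bare_label.startswith(t):
            score += PREFIX
        elif t in norm_label:
            score += SUBSTRING
    return score


method, handler = ".search()", "search.handler.ts"
ungated = (score_label(method, ["search"], False), score_label(handler, ["search"], False))
gated = (score_label(method, ["search"]), score_label(handler, ["search"]))
```

In this toy model the bare method outscores the file without the gate and falls below it with the gate, while any query of two or more terms scores identically either way.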

Regression tests: an isolated single-token probe now ranks the real
multi-word file above the same-named bare method; `_pick_seeds`' per-term
diversity guarantee no longer seeds the bare method over the relevant file
end-to-end. Full suite (2766 tests, 1 pre-existing unrelated failure) and
ruff pass. Verified live: search.handler.ts and its exported symbols now
appear in the traversal for the exact query that previously missed them
entirely.
Development

Successfully merging this pull request may close these issues.

cluster-only doesn't persist .graphify_analysis.json, causing export html to silently report "Single community"

2 participants