fix(sessions): index crystallize summary page into FTS5 (#60)#61
Conversation
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: defaults Review profile: CHILL Plan: Pro Plus Run ID: 📒 Files selected for processing (2)
✅ Files skipped from review due to trivial changes (1)
📝 WalkthroughWalkthroughCrystallize now indexes the session-summary page into FTS5 immediately after storing it; the PR adds a changelog entry and a test that verifies the summary page is discoverable via FTS5 search. ChangesSession crystallize FTS5 indexing
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Comment |
`crystallize()` wrote the session-summary Page via `store.put_page()` only, which embeds but does not populate `pages_fts`. On KBs with a populated `state.db`, FTS5 returns hits for other artifacts and short-circuits the substring fallback, so `session-<id>` pages were silently absent from `vouch search` / `kb.search` / `kb.context`. Mirror the PAGE branch of `proposals.approve()` by opening an `index_db` connection and calling `index_db.index_page(...)` immediately after the page write. Closes vouchdev#60.
70b9c69 to
29635de
Compare
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@src/vouch/sessions.py`:
- Around line 110-114: The index write in crystallize() currently calls
index_db.open_db(...) and index_db.index_page(...) after approvals and
store.put_page(page) and any exception there will abort the whole crystallize
flow; wrap the index write in a safe, non-fatal try/except so index errors are
logged but do not raise (e.g., try: with index_db.open_db(store.kb_dir) as conn:
index_db.index_page(conn, id=page.id, ...) except Exception as e:
logger.error("index_page failed for page=%s: %s", page.id, e) ), ensuring
approvals and store.put_page remain successful even if index_db.open_db or
index_db.index_page fails and use the existing logger to record the failure for
later investigation.
In `@tests/test_sessions.py`:
- Around line 62-63: The FTS assertion uses an unindexed field (summary_id) so
change the search query to use the indexed session identifier used in the
summary text (sess.id) to make the test stable: call index_db.search with
sess.id (instead of summary_id) against store.kb_dir and keep the assertion that
checks for kind == "page" and hid == summary_id (or update the expected hid if
necessary) so the test relies on an indexed query term rather than unindexed
pages_fts.id.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro Plus
Run ID: b225021e-33db-4050-b56b-72ee161feb29
📒 Files selected for processing (3)
CHANGELOG.mdsrc/vouch/sessions.pytests/test_sessions.py
✅ Files skipped from review due to trivial changes (1)
- CHANGELOG.md
| with index_db.open_db(store.kb_dir) as conn: | ||
| index_db.index_page( | ||
| conn, id=page.id, title=page.title, body=page.body, | ||
| type=page.type.value, tags=page.tags, | ||
| ) |
There was a problem hiding this comment.
Handle index-write failures without aborting crystallization.
On Line 110, an indexing DB error will raise out of crystallize() after approvals and store.put_page(page) have already succeeded. That makes this secondary index write a hard failure for the whole operation.
Suggested hardening
+import sqlite3
...
- with index_db.open_db(store.kb_dir) as conn:
- index_db.index_page(
- conn, id=page.id, title=page.title, body=page.body,
- type=page.type.value, tags=page.tags,
- )
+ try:
+ with index_db.open_db(store.kb_dir) as conn:
+ index_db.index_page(
+ conn, id=page.id, title=page.title, body=page.body,
+ type=page.type.value, tags=page.tags,
+ )
+ except sqlite3.Error:
+ logger.exception(
+ "crystallize: failed to index summary page %s", page.id
+ )📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| with index_db.open_db(store.kb_dir) as conn: | |
| index_db.index_page( | |
| conn, id=page.id, title=page.title, body=page.body, | |
| type=page.type.value, tags=page.tags, | |
| ) | |
| try: | |
| with index_db.open_db(store.kb_dir) as conn: | |
| index_db.index_page( | |
| conn, id=page.id, title=page.title, body=page.body, | |
| type=page.type.value, tags=page.tags, | |
| ) | |
| except sqlite3.Error: | |
| logger.exception( | |
| "crystallize: failed to index summary page %s", page.id | |
| ) |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@src/vouch/sessions.py` around lines 110 - 114, The index write in
crystallize() currently calls index_db.open_db(...) and index_db.index_page(...)
after approvals and store.put_page(page) and any exception there will abort the
whole crystallize flow; wrap the index write in a safe, non-fatal try/except so
index errors are logged but do not raise (e.g., try: with
index_db.open_db(store.kb_dir) as conn: index_db.index_page(conn, id=page.id,
...) except Exception as e: logger.error("index_page failed for page=%s: %s",
page.id, e) ), ensuring approvals and store.put_page remain successful even if
index_db.open_db or index_db.index_page fails and use the existing logger to
record the failure for later investigation.
…vouchdev#61 review) `pages_fts.id` is declared UNINDEXED, so the previous query relied on tokenization side effects of summary_id matching the title. Switch the search term to sess.id, which is explicitly present in the indexed title and body, so the test contract is stable against future title-format changes. Also drop the test-body comment that restated what the test name and assertions already convey.
|
Fix MR conflict |
…crystallize-fts5-index-summary-page # Conflicts: # CHANGELOG.md # tests/test_sessions.py
|
@plind-junior Thanks for your review. |
|
LGTM! |
|
Thanks for your work. |
Resolves conflicts in CHANGELOG.md, tests/test_sessions.py, and auto-merges src/vouch/sessions.py with the recently-merged: - vouchdev#61 (FTS5 indexing of crystallize summary page) — adds index_db.index_page(...) right after store.put_page(...) inside crystallize. Composes cleanly with this PR's strict-derivation _build_summary_body and the summary_page_id audit fix. - vouchdev#62 (test_crystallize_collects_approval_failures) — preserved as a sibling test. - vouchdev#75 (bundle import sha256 verification) — CHANGELOG entry only. - vouchdev#73 (bundle POSIX separators) — CHANGELOG entry only. Both regression tests added by this PR (test_crystallize_summary_page_does_not_leak_agent_controlled_fields and test_crystallize_audit_event_records_summary_page_id) still pass against the merged sessions.crystallize.
What changed
sessions.crystallize()now indexes the session-summary Page into FTS5the same way
proposals.approve()does for an approved PAGE proposal.src/vouch/sessions.py:109previously wrote the summary viastore.put_page(page)only, which writes the markdown to disk andpopulates the embedding index but does not touch
pages_fts. Mirrorthe existing PAGE branch of
proposals.approve()by opening anindex_dbconnection and callingindex_db.index_page(...)immediatelyafter the page write.
Why
vouch search,kb.search, and the FTS5 branch ofkb.contextallshort-circuit on the first non-empty FTS5 result (see
src/vouch/context.py:29-40). The substring fallback only runs when FTS5returns nothing — so on any KB with a populated
state.db, everysession-<id>summary page was silently absent from search results, eventhough it lives on disk and is embedded. That contradicts the README's
positioning of crystallize as the way to make a session's durable parts
permanent and retrievable.
rg "index_page" src/vouch/before this change showed onlyproposals.approve()(PAGE branch) andhealth.py's rebuilder(used by
vouch index) — confirming the FTS5 path had no auto-hook fromstorage. After this change,
sessions.pyjoins them.Closes #60.
What would break for someone with an existing
.vouch/directory?.vouch/directories work unchanged.
until the next
vouch indexrebuild. This is the documentedreconciliation path for a stale
state.dband matches behaviour afterany prior FTS5-only fix.
kb.*MCP/JSONL surface, the bundle format, or theobject model — no VEP required.
Tests
test_crystallize_summary_page_is_fts5_indexedin
tests/test_sessions.py— proposes a claim in a session,crystallizes, asserts
index_db.search(...)returns the summarypage id. Fails on
main(AssertionError), passes on this branch.make checkclean: ruff + mypy + pytest (74 passed, 1 skipped).Notes
PR #43 (
feat/semantic-search-phase-7-context-pack) currently touchessessions.pyfor an unrelatedsession_endrefactor, but does notmodify the
crystallizesummary-page write block — trivial rebase ifboth land.
Summary by CodeRabbit
Bug Fixes
Tests
Documentation