feat(opencode): add turn-level transcript recall by InSciTech · Pull Request #506 · zilliztech/memsearch

InSciTech · 2026-04-25T12:24:04Z

Closes #505

Summary

add turn anchors and turn_id/context transcript drill-down for the OpenCode plugin
keep transcript recall sourced from the OpenCode SQLite DB while treating .memsearch/opencode-turns.db as derived capture state for checkpoints and stable replay ordering
clarify markdown-first replay semantics and add regression coverage for partial sidecar failures

Test Plan

uv run python -m pytest tests/test_opencode_turns.py tests/test_opencode_parse_transcript.py -v
uv run ruff check tests/test_opencode_turns.py tests/test_opencode_parse_transcript.py plugins/opencode/scripts/parse-transcript.py plugins/opencode/scripts/opencode_turns.py plugins/opencode/scripts/capture-daemon.py

zc277584121 · 2026-04-27T08:25:48Z

Thanks for putting this together. The issue makes sense to me: OpenCode L3 recall currently only has session-level drill-down, so long sessions need a stable turn cursor.

I have one design question before merging this: is .memsearch/opencode-turns.db intended to be required for turn-level transcript recall, or primarily a derived capture checkpoint/cache?

From reading the PR, the minimal L3 path seems to be:

write turn:<user_message_id> into the markdown anchor during capture
have memory_transcript(session_id, turn_id, context) rebuild turns from the OpenCode SQLite DB on demand
use the target turn id to return surrounding context
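The last step of that path, returning a target turn with its neighbours, can be sketched as below. This is a minimal illustration, not the plugin's code: the `Turn` dataclass and `turns_with_context` helper are hypothetical names, and the sketch assumes the turns have already been rebuilt in order from the OpenCode SQLite DB and that `turn_id` is the user message id stored in the markdown anchor.

```python
from dataclasses import dataclass


@dataclass
class Turn:
    turn_id: str  # assumed: the user message id written into the anchor
    text: str


def turns_with_context(turns: list, turn_id: str, context: int = 1) -> list:
    """Return the target turn plus up to `context` turns on each side."""
    if context < 0:
        raise ValueError("context must be non-negative")
    for index, turn in enumerate(turns):
        if turn.turn_id == turn_id:
            start = max(0, index - context)
            return turns[start : index + context + 1]
    return []  # unknown turn id: nothing to drill into
```

Because the window is computed from the rebuilt list on every call, nothing in this step needs a sidecar DB.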

That would solve the transcript drill-down problem without introducing another sidecar DB. The sidecar DB does seem useful for replacing .last_msg_time with a more robust per-session/per-turn capture checkpoint, but that is a bigger state-management change.

If we keep the sidecar DB in this PR, I think we should make a few things explicit:

document that .memsearch/opencode-turns.db is derived state only, not memory source of truth; it should be safe to delete/rebuild
define what happens if sidecar state and markdown disagree, e.g. markdown write succeeds but state update fails, or state update succeeds but markdown write fails
add/confirm tests for those recovery paths
fix the current ruff issue in tests/test_opencode_turns.py (SIM105)
check whether the mode changes from 100755 to 100644 for capture-daemon.py and parse-transcript.py are intentional; current TS calls use python3, so it may be harmless, but it looks like unnecessary metadata churn unless direct execution is no longer supported

Overall I like the direction, but I would prefer we treat the sidecar DB as a deliberate capture-checkpoint design change, not just an implementation detail of memory_transcript.


InSciTech · 2026-04-27T10:15:19Z

You're reading the branch correctly: .memsearch/opencode-turns.db is not required for memory_transcript.

On the design question: transcript reads still rebuild turns from the OpenCode SQLite database on demand. The sidecar is derived capture state only: replay-safe checkpoints and stable turn ordering for the daemon. I also updated the PR description so it no longer frames the sidecar as part of the transcript read path.

On the recovery-path point: the current write order is markdown append -> save_turn() -> save_turn_state(). So the meaningful replay cases here are:

markdown write succeeded, save_turn() failed
markdown write succeeded, save_turn_state() failed
markdown write failed before any sidecar state advanced

This update adds or confirms coverage for those paths:

test_capture_session_turns_repairs_sidecar_after_partial_turn_save_failure
test_capture_session_turns_is_idempotent_after_partial_state_save_failure
test_capture_session_turns_does_not_advance_sidecar_when_markdown_write_fails
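The write order and the three replay cases above can be modelled with a small in-memory sketch. Everything here is an assumption for illustration: `capture_turn`, `SidecarError`, the `fail_at` switch, and the list/dict stand-ins for the markdown file and the sidecar are hypothetical, and the real `save_turn()` / `save_turn_state()` signatures are not shown in this thread.

```python
from typing import Optional


class SidecarError(Exception):
    """Stands in for any failure while writing derived sidecar state."""


def capture_turn(markdown: list, sidecar: dict, session_id: str, turn_id: str,
                 summary: str, fail_at: Optional[str] = None) -> None:
    """Markdown append first, then sidecar turn, then sidecar checkpoint."""
    key = (session_id, turn_id)

    # 1. Markdown append, skipped when this session + turn is already present.
    if fail_at == "markdown":
        raise OSError("simulated markdown write failure")
    if not any(entry[:2] == key for entry in markdown):
        markdown.append((session_id, turn_id, summary))

    # 2. Stand-in for save_turn(): record the turn in the sidecar.
    if fail_at == "save_turn":
        raise SidecarError("simulated save_turn failure")
    sidecar["turns"].add(key)

    # 3. Stand-in for save_turn_state(): advance the per-session checkpoint.
    if fail_at == "save_turn_state":
        raise SidecarError("simulated save_turn_state failure")
    sidecar["checkpoint"][session_id] = turn_id
```

Replaying the same turn after a failure in step 2 or 3 repairs the sidecar without appending a second markdown entry, and a failure in step 1 leaves the sidecar untouched.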

On the source-of-truth point: I clarified in code and docs that:

OpenCode SQLite is the source of truth for transcript recall
.memsearch/opencode-turns.db is rebuildable derived state only
replay dedupes through existing session + turn_id markdown anchors

I also added test_parse_transcript_reads_from_opencode_sqlite_without_sidecar to make the non-dependence of transcript reads on the sidecar explicit.

On the other review items:

fixed the SIM105 lint issue in tests/test_opencode_turns.py
restored the 100755 script modes for capture-daemon.py and parse-transcript.py; the change to 100644 was unintentional
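For readers unfamiliar with SIM105: ruff raises it for a `try`/`except`/`pass` block and suggests `contextlib.suppress` instead. The flagged code in tests/test_opencode_turns.py is not shown in this thread, so the sketch below is a generic instance of the pattern with a hypothetical `remove_if_present` helper.

```python
import contextlib
import os


def remove_if_present(path: str) -> None:
    """Delete `path`, ignoring the case where it does not exist."""
    # The shape SIM105 flags:
    #     try:
    #         os.remove(path)
    #     except FileNotFoundError:
    #         pass
    # The equivalent form it suggests:
    with contextlib.suppress(FileNotFoundError):
        os.remove(path)
```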

If you'd still prefer the sidecar/checkpoint work to be split after this clarification, I can do that next.

zc277584121 · 2026-04-28T02:49:08Z

Thanks for the update. The clarification that transcript reads still rebuild from the OpenCode SQLite DB, and that opencode-turns.db is derived capture state only, addresses my main design concern. I also pulled the latest branch locally and the focused pytest/ruff checks pass for me.

I did one more pass and found two edge cases I think are worth handling before merge:

Upgrade / migration from the existing .last_msg_time state

This PR switches capture progress to the new sidecar state. For an existing user who already has .memsearch/.last_msg_time and old markdown anchors like:

<!-- session:... db:... -->

but no .memsearch/opencode-turns.db yet, the new daemon appears to start with an empty turn_state, so after_time is None. Since capture_exists() only dedupes new session + turn anchors, old captured entries without turn: would not be recognized as already captured.

That could cause an upgrade to replay/re-summarize recent OpenCode session history and append duplicate memory entries. Could you add either a small migration path from .last_msg_time, or an explicit first-run behavior/test that avoids duplicate capture for existing installs?
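One possible shape for the migration path is to seed the capture cutoff from the legacy file when the sidecar has no state yet. This is only a sketch under stated assumptions: `initial_after_time` is a hypothetical helper, `turn_state` is modelled as a plain dict with an `after_time` key, and `.last_msg_time` is assumed to hold a single numeric timestamp, none of which is confirmed by the PR.

```python
from pathlib import Path
from typing import Optional


def initial_after_time(turn_state: dict, legacy_marker: Path) -> Optional[float]:
    """Pick the capture cutoff for the next scan.

    Assumed layout: `turn_state` is empty on the first run after upgrade,
    and `legacy_marker` contains one numeric timestamp.
    """
    if turn_state:
        # Sidecar state exists, so it wins over the legacy marker.
        return turn_state.get("after_time")
    try:
        return float(legacy_marker.read_text().strip())
    except (FileNotFoundError, ValueError):
        # No legacy marker, or unreadable contents: behave like a fresh install.
        return None
```

With a cutoff seeded this way, messages older than the legacy timestamp would not be re-summarized even though their markdown anchors lack `turn:`.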

Textless assistant/tool-call messages in the parent chain

build_turns() currently skips any message whose rendered text is empty before adding its id to current_message_ids. That can break descendant grouping if OpenCode stores an intermediate assistant/tool-call message with no text but a later assistant message points to it via parentID.

For example:

u1 user text
  a1 assistant, finish=tool-calls, no text
    a2 assistant, parentID=a1, text="final answer"

In this shape, a1 is skipped, so a2's parent is not in current_message_ids and the turn remains incomplete. I think the turn builder should preserve structural assistant message ids even when their rendered text is empty, or otherwise special-case this parent-chain scenario. A regression test for this would be useful.
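The suggested fix can be sketched as tracking structural membership separately from rendered text. This is a simplified stand-in for `build_turns()`, not the plugin's implementation: `Message`, `Turn`, and `build_turns_sketch` are hypothetical, and the sketch assumes `a1` points at `u1` via its parent id, as the indentation in the example implies.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Message:
    id: str
    role: str  # "user" or "assistant"
    text: str = ""
    parent_id: Optional[str] = None


@dataclass
class Turn:
    turn_id: str
    message_ids: list = field(default_factory=list)  # structural membership
    texts: list = field(default_factory=list)  # non-empty rendered text only


def build_turns_sketch(messages: list) -> list:
    turns = []
    current = None
    for msg in messages:
        if msg.role == "user":
            # Each user message opens a new turn keyed by its id.
            current = Turn(turn_id=msg.id, message_ids=[msg.id], texts=[msg.text])
            turns.append(current)
            continue
        if current is None or msg.parent_id not in current.message_ids:
            continue  # not a descendant of the current turn
        # Record the id even when there is no text, so later
        # messages that point at this one still resolve.
        current.message_ids.append(msg.id)
        if msg.text:
            current.texts.append(msg.text)
    return turns
```

Run against the `u1` / `a1` / `a2` shape above, the textless `a1` stays in the parent chain, so `a2` is grouped into the turn and its text is kept.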

Other than those two points, the updated direction looks good to me.

feat(opencode): add turn-level transcript recall

cb49ead

InSciTech mentioned this pull request Apr 25, 2026

[OpenCode Plugin] Add turn-level cursor to memory_transcript for precise L3 retrieval #505

Open

fix(opencode): clarify sidecar transcript boundaries

59b3bc1

Document OpenCode SQLite as the source of truth for transcript recall and add replay regression coverage so sidecar failures stay recoverable without duplicate markdown writes.

InSciTech wants to merge 2 commits into zilliztech:main from InSciTech:opencode-turn-level-recall
