feat(opencode): add turn-level transcript recall#506

Open
InSciTech wants to merge 2 commits into zilliztech:main from InSciTech:opencode-turn-level-recall

Conversation


@InSciTech InSciTech commented Apr 25, 2026

Closes #505

Summary

  • add turn anchors and turn_id/context transcript drill-down for the OpenCode plugin
  • keep transcript recall sourced from the OpenCode SQLite DB while treating .memsearch/opencode-turns.db as derived capture state for checkpoints and stable replay ordering
  • clarify markdown-first replay semantics and add regression coverage for partial sidecar failures

Test Plan

  • uv run python -m pytest tests/test_opencode_turns.py tests/test_opencode_parse_transcript.py -v
  • uv run ruff check tests/test_opencode_turns.py tests/test_opencode_parse_transcript.py plugins/opencode/scripts/parse-transcript.py plugins/opencode/scripts/opencode_turns.py plugins/opencode/scripts/capture-daemon.py

@zc277584121
Collaborator

Thanks for putting this together. The issue makes sense to me: OpenCode L3 recall currently only has session-level drill-down, so long sessions need a stable turn cursor.

I have one design question before merging this: is .memsearch/opencode-turns.db intended to be required for turn-level transcript recall, or primarily a derived capture checkpoint/cache?

From reading the PR, the minimal L3 path seems to be:

  • write turn:<user_message_id> into the markdown anchor during capture
  • have memory_transcript(session_id, turn_id, context) rebuild turns from the OpenCode SQLite DB on demand
  • use the target turn id to return surrounding context
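The last step of that minimal path can be sketched roughly as follows. This is only an illustration, not the plugin's code: `turn_window` is a hypothetical helper, and it assumes that `context` means "this many turns on each side of the target" and that turn ids have already been rebuilt in order from the OpenCode SQLite DB.

```python
def turn_window(turn_ids: list[str], target: str, context: int) -> list[str]:
    """Return the target turn id plus up to `context` neighbours on each side.

    `turn_ids` is assumed to be the ordered list of user-message ids rebuilt
    on demand from the OpenCode DB (hypothetical input shape).
    """
    if target not in turn_ids:
        # Unknown turn id: nothing to drill into.
        return []
    i = turn_ids.index(target)
    start = max(0, i - context)  # clamp at the start of the session
    return turn_ids[start : i + context + 1]
```

Because the window is computed from the rebuilt turn list alone, nothing in this path needs a sidecar DB.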

That would solve the transcript drill-down problem without introducing another sidecar DB. The sidecar DB does seem useful for replacing .last_msg_time with a more robust per-session/per-turn capture checkpoint, but that is a bigger state-management change.

If we keep the sidecar DB in this PR, I think we should make a few things explicit:

  • document that .memsearch/opencode-turns.db is derived state only, not memory source of truth; it should be safe to delete/rebuild
  • define what happens if sidecar state and markdown disagree, e.g. markdown write succeeds but state update fails, or state update succeeds but markdown write fails
  • add/confirm tests for those recovery paths
  • fix the current ruff issue in tests/test_opencode_turns.py (SIM105)
  • check whether the mode changes from 100755 to 100644 for capture-daemon.py and parse-transcript.py are intentional; current TS calls use python3, so it may be harmless, but it looks like unnecessary metadata churn unless direct execution is no longer supported

Overall I like the direction, but I would prefer we treat the sidecar DB as a deliberate capture-checkpoint design change, not just an implementation detail of memory_transcript.

Document OpenCode SQLite as the source of truth for transcript recall and add replay regression coverage so sidecar failures stay recoverable without duplicate markdown writes.
@InSciTech
Author

InSciTech commented Apr 27, 2026

You're reading the branch correctly: .memsearch/opencode-turns.db is not required for memory_transcript.

On the design question: transcript reads still rebuild turns from the OpenCode SQLite database on demand. The sidecar is derived capture state only: replay-safe checkpoints and stable turn ordering for the daemon. I also updated the PR description so it no longer frames the sidecar as part of the transcript read path.

On the recovery-path point: the current write order is markdown append -> save_turn() -> save_turn_state(). So the meaningful replay cases here are:

  • markdown write succeeded, save_turn() failed
  • markdown write succeeded, save_turn_state() failed
  • markdown write failed before any sidecar state advanced

This update adds or confirms coverage for those paths:

  • test_capture_session_turns_repairs_sidecar_after_partial_turn_save_failure
  • test_capture_session_turns_is_idempotent_after_partial_state_save_failure
  • test_capture_session_turns_does_not_advance_sidecar_when_markdown_write_fails
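For readers following along, the three replay cases can be modelled with a small sketch. It is a simplified stand-in, not the daemon's code: a plain list plays the markdown file, a dict plays the sidecar, and `capture_turn` and `fail_at` are hypothetical names used only to inject a failure at each step of the markdown append -> save_turn() -> save_turn_state() order.

```python
from typing import Optional


def capture_turn(turn_id: str, markdown: list, sidecar: dict,
                 fail_at: Optional[str] = None) -> bool:
    """Capture one turn; return True if markdown was appended in this call."""
    appended = False
    if turn_id not in markdown:  # stand-in for the session + turn anchor dedupe
        if fail_at == "markdown":
            raise OSError("markdown write failed")
        markdown.append(turn_id)
        appended = True
    if fail_at == "save_turn":
        raise OSError("save_turn failed")
    sidecar["turns"].add(turn_id)  # stand-in for save_turn()
    if fail_at == "save_turn_state":
        raise OSError("save_turn_state failed")
    sidecar["checkpoint"] = turn_id  # stand-in for save_turn_state()
    return appended
```

Replaying a turn after a sidecar failure repairs the sidecar without a second markdown append, and a failed markdown write leaves the sidecar untouched.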

On the source-of-truth point: I clarified in code and docs that:

  • OpenCode SQLite is the source of truth for transcript recall
  • .memsearch/opencode-turns.db is rebuildable derived state only
  • replay dedupes through existing session + turn_id markdown anchors

I also added test_parse_transcript_reads_from_opencode_sqlite_without_sidecar to make the non-dependence of transcript reads on the sidecar explicit.

On the other review items:

  • fixed the SIM105 lint issue in tests/test_opencode_turns.py
  • restored the 100755 script modes for capture-daemon.py and parse-transcript.py; the change to 100644 was unintentional

If you'd still prefer the sidecar/checkpoint work to be split after this clarification, I can do that next.

@zc277584121
Collaborator

Thanks for the update. The clarification that transcript reads still rebuild from the OpenCode SQLite DB, and that opencode-turns.db is derived capture state only, addresses my main design concern. I also pulled the latest branch locally and the focused pytest/ruff checks pass for me.

I did one more pass and found two edge cases I think are worth handling before merge:

  1. Upgrade / migration from the existing .last_msg_time state

This PR switches capture progress to the new sidecar state. For an existing user who already has .memsearch/.last_msg_time and old markdown anchors like:

<!-- session:... db:... -->

but no .memsearch/opencode-turns.db yet, the new daemon appears to start with an empty turn_state, so after_time is None. Since capture_exists() only dedupes new session + turn anchors, old captured entries without turn: would not be recognized as already captured.

That could cause an upgrade to replay/re-summarize recent OpenCode session history and append duplicate memory entries. Could you add either a small migration path from .last_msg_time, or an explicit first-run behavior/test that avoids duplicate capture for existing installs?
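One possible shape for such a migration is sketched below. It is only an illustration under stated assumptions: `initial_after_time` is a hypothetical helper, and the sketch assumes `.last_msg_time` contains a single numeric timestamp, which should be checked against the existing daemon before relying on it.

```python
from pathlib import Path
from typing import Optional


def initial_after_time(memsearch_dir: Path,
                       sidecar_after_time: Optional[float]) -> Optional[float]:
    """Pick the capture cursor for a daemon start.

    Prefer the sidecar checkpoint; on first run after an upgrade, fall back
    to the legacy .last_msg_time file (assumed to hold one numeric timestamp).
    """
    if sidecar_after_time is not None:
        return sidecar_after_time
    legacy = memsearch_dir / ".last_msg_time"
    try:
        return float(legacy.read_text().strip())
    except (OSError, ValueError):
        # No legacy file, or unreadable content: behave like a fresh install.
        return None
```

Seeding the cursor this way would keep already-captured history behind `after_time`, so old entries without a `turn:` anchor would not need to be recognized by `capture_exists()`.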

  2. Textless assistant/tool-call messages in the parent chain

build_turns() currently skips any message whose rendered text is empty before adding its id to current_message_ids. That can break descendant grouping if OpenCode stores an intermediate assistant/tool-call message with no text but a later assistant message points to it via parentID.

For example:

u1 user text
  a1 assistant, finish=tool-calls, no text
    a2 assistant, parentID=a1, text="final answer"

In this shape, a1 is skipped, so a2's parent is not in current_message_ids and the turn remains incomplete. I think the turn builder should preserve structural assistant message ids even when their rendered text is empty, or otherwise special-case this parent-chain scenario. A regression test for this would be useful.
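A minimal sketch of the first option, preserving structural ids, might look like this. It is not the actual `build_turns()`: `group_turns` and the message dict keys (`id`, `role`, `parent_id`, `text`) are hypothetical, and messages are assumed to arrive in chronological order.

```python
def group_turns(messages: list[dict]) -> list[dict]:
    """Group messages into turns, one turn per user message."""
    turns: list[dict] = []
    current = None
    for m in messages:
        if m["role"] == "user":
            # A user message opens a new turn and anchors its id.
            current = {"turn_id": m["id"], "message_ids": {m["id"]},
                       "texts": [m["text"]]}
            turns.append(current)
            continue
        if current is None or m.get("parent_id") not in current["message_ids"]:
            continue  # not a descendant of the current turn
        # Record the id first, even when the rendered text is empty,
        # so later messages can still resolve their parent.
        current["message_ids"].add(m["id"])
        if m["text"]:
            current["texts"].append(m["text"])
    return turns
```

With the u1 / a1 / a2 shape above, a1 contributes no text but stays in `message_ids`, so a2 is still grouped into the u1 turn.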

Other than those two points, the updated direction looks good to me.

Development

Successfully merging this pull request may close these issues.

[OpenCode Plugin] Add turn-level cursor to memory_transcript for precise L3 retrieval