Fix: index Cursor sessions from agent-transcripts/, support nested depths #752

Open
kix-sisigai wants to merge 1 commit into siteboon:main from kix-sisigai:fix/cursor-synchronizer-agent-transcripts

kix-sisigai commented May 8, 2026

Why

Recent cursor-agent versions write JSONL transcripts under ~/.cursor/projects/<project-dir>/agent-transcripts/<chatId>/.... The legacy ~/.cursor/chats/<projectHash>/ directory still exists, but it now holds only SQLite store.db files. Those are read by the loader (cursor-sessions.provider.ts) and are not JSONL that the indexer can parse. As a result, on main, CursorSessionSynchronizer finds zero transcripts for any current Cursor user: every initial scan returns 0 and advances scan_state.last_scanned_at, and from then on the file watcher only catches chats created or modified after startup.

The transcript layout also varies: the same install can have chats at both agent-transcripts/<chatId>/<chatId>.jsonl and agent-transcripts/<chatId>/<sub>/<chatId>.jsonl, so the hard-coded path.dirname(path.dirname(filePath)) silently skipped the deeper variant.
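To make the depth problem concrete, here is a minimal sketch of fixed-depth path arithmetic applied to both layouts. The helper name `fixedDepthAncestor` and the sample paths are illustrative assumptions; how the old code used the resulting directory is not shown in this PR description.

```typescript
import * as path from 'node:path';

// Hypothetical helper mirroring the old fixed-depth arithmetic: it assumes the
// transcript always sits exactly two levels below the directory it wants.
function fixedDepthAncestor(filePath: string): string {
  return path.posix.dirname(path.posix.dirname(filePath));
}

// Illustrative paths only; <chatId> is "abc" and <sub> is "sub".
const shallowPath = '/home/u/.cursor/projects/demo/agent-transcripts/abc/abc.jsonl';
const deepPath = '/home/u/.cursor/projects/demo/agent-transcripts/abc/sub/abc.jsonl';

// The same arithmetic lands on different ancestors for the two layouts.
const shallowAncestor = fixedDepthAncestor(shallowPath);
const deepAncestor = fixedDepthAncestor(deepPath);

console.log(shallowAncestor);
console.log(deepAncestor);
```

Because the two results differ by one level, any code that expects a specific directory at the end of the arithmetic can only be right for one of the two layouts.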

Changes

  • Switch the cursor synchronizer to scan ~/.cursor/projects/<name>/agent-transcripts/ for transcripts; the legacy hashed chats/ dir is left to the loader, which already has its own md5 hashing.
  • Locate worker.log by walking up from the transcript file, bounded to ~/.cursor/projects, so both nested depths work and a transcript outside the cursor home returns null instead of asserting on path arithmetic.
  • Pass the already-resolved projectPath from the project loop into processSessionFile to avoid redundant per-file resolution; the watcher entry point still resolves it via the walk-up helper.
  • Drop the now-dead md5 helper and crypto import from the synchronizer.
  • Add server/modules/providers/tests/cursor-session-synchronizer.test.ts covering: transcripts at both depths, a project missing worker.log, the legacy chats/<hash>/store.db presence (must not be claimed by the indexer), watcher-added transcripts, transcripts outside ~/.cursor/projects, and the incremental since filter.
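The bounded walk-up described in the second bullet could look roughly like the sketch below. This is not the code from the PR: the function name `findProjectDirSketch`, its signature, and the exact null conditions are assumptions made for illustration.

```typescript
import * as fs from 'node:fs';
import * as os from 'node:os';
import * as path from 'node:path';

/**
 * Hypothetical walk-up lookup. Returns the directory directly under
 * `projectsRoot` that contains `filePath`, or null when the file lies outside
 * `projectsRoot` or that directory has no worker.log.
 */
function findProjectDirSketch(filePath: string, projectsRoot: string): string | null {
  const root = path.resolve(projectsRoot);
  let current = path.dirname(path.resolve(filePath));

  // Reject anything that is not strictly inside projectsRoot.
  const rel = path.relative(root, current);
  const escapes = rel === '..' || rel.startsWith('..' + path.sep) || path.isAbsolute(rel);
  if (rel === '' || escapes) return null;

  // Climb until the parent is projectsRoot; `current` is then the project dir,
  // regardless of how deeply the transcript was nested.
  while (path.dirname(current) !== root) {
    current = path.dirname(current);
  }
  return fs.existsSync(path.join(current, 'worker.log')) ? current : null;
}

// Usage against a throwaway directory tree (layout names are illustrative).
const sketchTmp = fs.mkdtempSync(path.join(os.tmpdir(), 'cursor-sketch-'));
const sketchProjects = path.join(sketchTmp, '.cursor', 'projects');
const deepDir = path.join(sketchProjects, 'demo', 'agent-transcripts', 'abc', 'sub');
fs.mkdirSync(deepDir, { recursive: true });
fs.writeFileSync(path.join(sketchProjects, 'demo', 'worker.log'), '');

const found = findProjectDirSketch(path.join(deepDir, 'abc.jsonl'), sketchProjects);
const outside = findProjectDirSketch(path.join(sketchTmp, 'elsewhere', 'abc.jsonl'), sketchProjects);
const expectedProjectDir = path.join(sketchProjects, 'demo');

fs.rmSync(sketchTmp, { recursive: true, force: true });
```

Bounding the climb at the projects root means a transcript outside the Cursor home yields null instead of walking up to the filesystem root.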

Testing

  1. `TSX_TSCONFIG_PATH=server/tsconfig.json node --import tsx --test server/modules/providers/tests/cursor-session-synchronizer.test.ts` → 1/1 pass.
  2. Existing `server/modules/providers/tests/mcp.test.ts` → 4/4 still pass.
  3. `tsc --noEmit -p server/tsconfig.json` → clean.
  4. Smoke-tested against a real `~/.cursor` with 109 jsonls across two projects: pre-fix indexed 1 (the one chat that was modified after CloudCLI started, picked up by the watcher); post-fix initial scan indexes 62 — every transcript whose project has a parseable `worker.log`.

Summary by CodeRabbit

  • Refactor

    • Updated transcript discovery mechanism to use location-aware detection instead of hash-based lookup, improving reliability across varying project structures and directory layouts.
  • Tests

    • Added comprehensive test coverage for transcript indexing behavior, including edge cases and multiple directory structure scenarios.

coderabbitai Bot commented May 8, 2026

Walkthrough

CursorSessionSynchronizer migrates from MD5-based project identification with fixed directory paths to dynamic path-based discovery using worker.log contents. Transcripts now index from ~/.cursor/projects/<project>/agent-transcripts supporting variable nesting depths. A new helper resolves project directories by walking upward from transcript files, and processSessionFile now accepts optional project hints or resolves dynamically. Comprehensive integration tests validate multi-depth discovery, edge cases, and incremental scanning.

Changes

Cursor Transcript Indexing - Path-Based Project Resolution

| Layer | File(s) | Summary |
| --- | --- | --- |
| Dependency Updates | `server/modules/providers/list/cursor/cursor-session-synchronizer.provider.ts` | Removes unused node:crypto import following elimination of MD5-based project hashing. |
| Project Path Resolution Helper | `server/modules/providers/list/cursor/cursor-session-synchronizer.provider.ts` | New findProjectDirForTranscript helper walks upward from transcript file paths to locate projects/*/worker.log, enabling variable nesting depth support for agent-transcripts directories. |
| Synchronize Method - Project and Transcript Discovery | `server/modules/providers/list/cursor/cursor-session-synchronizer.provider.ts` | Reworked synchronize to iterate ~/.cursor/projects/*, extract projectPath from each project's worker.log, deduplicate by path, then discover and process .jsonl files under agent-transcripts at variable nesting depths. |
| Session File Processing with Dynamic Resolution | `server/modules/providers/list/cursor/cursor-session-synchronizer.provider.ts` | processSessionFile now accepts optional projectPathHint; when absent, uses findProjectDirForTranscript to resolve project identity and extract projectPath from worker.log dynamically. |
| Test Infrastructure and Helpers | `server/modules/providers/tests/cursor-session-synchronizer.test.ts` | Test harness establishes isolated temp directories, overrides DATABASE_PATH and os.homedir(), and provides utilities for writing JSONL transcripts and constructing Cursor payload rows. |
| Comprehensive Test Validation | `server/modules/providers/tests/cursor-session-synchronizer.test.ts` | Integration test validates transcript indexing across shallow/deep nesting, correct session metadata in database, ignored projects without worker.log, unindexed legacy ~/.cursor/chats/ directories, per-file project path resolution, rejection of transcripts outside Cursor home, and incremental scanning with future timestamps. |
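Discovery of .jsonl files at variable nesting depths, combined with an incremental cutoff, might be sketched as follows. The function name `listTranscriptsSketch` and the use of file modification time for the cutoff are assumptions; the PR text does not say how the `since` filter is evaluated.

```typescript
import * as fs from 'node:fs';
import * as os from 'node:os';
import * as path from 'node:path';

// Hypothetical recursive scan: collects every .jsonl file under `dir`, at any
// depth, optionally keeping only files modified after `since`.
function listTranscriptsSketch(dir: string, since?: Date): string[] {
  if (!fs.existsSync(dir)) return []; // project without agent-transcripts/
  const results: string[] = [];
  for (const entry of fs.readdirSync(dir, { withFileTypes: true })) {
    const full = path.join(dir, entry.name);
    if (entry.isDirectory()) {
      results.push(...listTranscriptsSketch(full, since));
    } else if (entry.isFile() && entry.name.endsWith('.jsonl')) {
      // Assumption: the incremental filter compares against mtime.
      if (!since || fs.statSync(full).mtimeMs > since.getTime()) results.push(full);
    }
  }
  return results;
}

// Usage: one shallow transcript, one nested transcript, one non-JSONL file.
const scanTmp = fs.mkdtempSync(path.join(os.tmpdir(), 'transcripts-sketch-'));
const transcriptsDir = path.join(scanTmp, 'agent-transcripts');
fs.mkdirSync(path.join(transcriptsDir, 'a'), { recursive: true });
fs.mkdirSync(path.join(transcriptsDir, 'b', 'sub'), { recursive: true });
fs.writeFileSync(path.join(transcriptsDir, 'a', 'a.jsonl'), '{}\n');
fs.writeFileSync(path.join(transcriptsDir, 'b', 'sub', 'b.jsonl'), '{}\n');
fs.writeFileSync(path.join(transcriptsDir, 'a', 'store.db'), '');

const allNames = listTranscriptsSketch(transcriptsDir).map((p) => path.basename(p)).sort();
const futureCutoff = new Date(Date.now() + 60_000);
const newerThanFuture = listTranscriptsSketch(transcriptsDir, futureCutoff);

fs.rmSync(scanTmp, { recursive: true, force: true });
```

A cutoff in the future filters out every file, which is the same idea the integration test uses to check incremental scanning.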

Suggested reviewers

  • viper151

🐰 Whiskers twitching with glee,
Hashes fade, paths now run free—
From chats deep in the trees,
To worker logs with ease,
Projects found where they ought to be!

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title accurately describes the main functional change: Cursor sessions are now indexed from agent-transcripts/ directories with support for nested directory depths. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |

coderabbitai Bot left a comment

🧹 Nitpick comments (1)
server/modules/providers/tests/cursor-session-synchronizer.test.ts (1)

18-24: ⚡ Quick win

patchHomeDir uses a global os module mutation that can bleed across parallel test files.

{ concurrency: false } only serializes tests within this file; the Node test runner runs different test files in parallel by default. If any other test file instantiates CursorSessionSynchronizer (or calls os.homedir()) while this test is active, it will observe the patched value and fail with a spurious path.

Consider making the home directory injectable via the constructor instead, which eliminates the need for monkey-patching entirely:

♻️ Proposed refactor (constructor injection)

In the provider:

 export class CursorSessionSynchronizer implements IProviderSessionSynchronizer {
   private readonly provider = 'cursor' as const;
-  private readonly cursorHome = path.join(os.homedir(), '.cursor');
+  private readonly cursorHome: string;
+
+  constructor(cursorHome = path.join(os.homedir(), '.cursor')) {
+    this.cursorHome = cursorHome;
+  }

In the test:

-  const restoreHomeDir = patchHomeDir(tempRoot);
-  // ...
-  const sync = new CursorSessionSynchronizer();
-  restoreHomeDir();
+  const cursorHome = path.join(tempRoot, '.cursor');
+  // pass cursorHome directly, no global patch needed
+  const sync = new CursorSessionSynchronizer(cursorHome);
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@server/modules/providers/tests/cursor-session-synchronizer.test.ts` around
lines 18 - 24, The test's global os.homedir monkeypatch (patchHomeDir) can leak
across test files—avoid it by making the home directory injectable: add an
optional homeDir (or basePath) parameter to the CursorSessionSynchronizer
constructor and switch any internal uses of os.homedir() in that class to use
this injected value (falling back to os.homedir() when the parameter is
undefined); then update the tests to create the synchronizer with the temp/home
path instead of calling patchHomeDir and remove patchHomeDir/any os.homedir
mutations from the test file.
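As a runnable illustration of the constructor-injection idea, the sketch below uses a stand-in class rather than the real synchronizer. The class name `InjectableSynchronizerSketch` and its `projectsDir` method are hypothetical; only the default-parameter pattern is taken from the proposed refactor.

```typescript
import * as os from 'node:os';
import * as path from 'node:path';

// Hypothetical stand-in for the synchronizer; only the injection pattern matters.
class InjectableSynchronizerSketch {
  private readonly cursorHome: string;

  // Production callers omit the argument; tests pass an isolated directory.
  constructor(cursorHome: string = path.join(os.homedir(), '.cursor')) {
    this.cursorHome = cursorHome;
  }

  projectsDir(): string {
    return path.join(this.cursorHome, 'projects');
  }
}

// A test instance and a default instance coexist without touching os.homedir().
const isolatedHome = path.join(os.tmpdir(), 'test-root', '.cursor');
const isolatedSync = new InjectableSynchronizerSketch(isolatedHome);
const defaultSync = new InjectableSynchronizerSketch();

console.log(isolatedSync.projectsDir());
console.log(defaultSync.projectsDir());
```

Because nothing global is mutated, other test files that call os.homedir() at the same time would see the real value.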

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 9383800f-0752-4159-9738-f693eeeffaf2

📥 Commits

Reviewing files that changed from the base of the PR and between beb0a50 and 30e6f9e.

📒 Files selected for processing (2)
  • server/modules/providers/list/cursor/cursor-session-synchronizer.provider.ts
  • server/modules/providers/tests/cursor-session-synchronizer.test.ts
