fix(backend/copilot): sync TranscriptBuilder with CLI on mid-stream compaction #12401
…reserve compaction

The TranscriptBuilder accumulates all raw SDK stream messages, including pre-compaction content. When the CLI compacts mid-stream, the uploaded transcript still contains the full uncompacted messages, causing "Prompt is too long" errors on the next --resume turn.

Fix:
- Read the CLI's own session file (~/.claude/projects/<cwd>/*.jsonl), which reflects mid-stream compaction, instead of TranscriptBuilder
- Extract _cli_project_dir() helper, refactor cleanup_cli_project_dir
- On "Prompt is too long" error with --resume, delete the oversized transcript so the next turn falls back to compression-based context
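To make the first fix item concrete, here is a minimal sketch of picking up the most recent session file from a project directory. It is not the PR's implementation: the function name `read_newest_jsonl` is hypothetical, "newest" is assumed to mean the latest modification time, and the caller is assumed to pass an already-resolved project directory (how `<cwd>` maps to a directory name is not shown in this conversation).

```python
from pathlib import Path
from typing import Optional


def read_newest_jsonl(project_dir: Path) -> Optional[str]:
    """Return the text of the most recently modified *.jsonl file, or None.

    Hypothetical helper: "newest" is assumed to mean latest mtime.
    """
    if not project_dir.is_dir():
        return None
    # Only regular files directly inside the directory are considered.
    candidates = [p for p in project_dir.glob("*.jsonl") if p.is_file()]
    if not candidates:
        return None
    newest = max(candidates, key=lambda p: p.stat().st_mtime)
    return newest.read_text(encoding="utf-8")
```

Returning `None` rather than raising lets the caller fall back to another transcript source when no session file exists.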
Walkthrough
Service finalization now prefers the CLI session file for transcript uploads, adds a skip flag to avoid duplicate uploads, and deletes oversized transcripts on resume-mode "prompt is too long" streaming errors. Transcript utilities add safe CLI project path resolution and a reader for the latest CLI session file.
Sequence Diagram(s)
sequenceDiagram
participant Service as Service
participant Transcript as TranscriptModule
participant FS as CLI_FileSystem
participant Storage as Upload/Storage
Service->>Transcript: finalize(sdk_cwd) → read_cli_session_file(sdk_cwd)
alt CLI session file found
Transcript->>FS: locate newest *.jsonl within resolved project dir
FS-->>Transcript: return file content
Transcript-->>Service: session content
Service->>Storage: upload(session content)
else fallback
Service->>Service: use TranscriptBuilder output
Service->>Storage: upload(builder content)
end
Service->>Storage: streaming process
alt error == "prompt is too long" and resume active
Service->>Transcript: delete_transcript(user_id, session_id)
Service-->>Service: set skip_transcript_upload = true
end
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~22 minutes
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@autogpt_platform/backend/backend/copilot/sdk/service.py`:
- Around line 1459-1464: The code currently gates transcript uploads on both
config.claude_agent_use_resume and the local variable use_resume, which prevents
new sessions or ones without a stored transcript from ever uploading fresh
transcripts; change the condition so uploads are allowed when
config.claude_agent_use_resume is true (and user_id and session are present)
regardless of use_resume. Specifically, modify the if that checks
(config.claude_agent_use_resume and use_resume and user_id and session is not
None) to remove use_resume from that conjunction (e.g.,
(config.claude_agent_use_resume and user_id and session is not None)); keep
use_resume only for the prompt-too-long suppression logic elsewhere so that
resume-related suppression remains separate from upload permission.
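The effect of this change can be illustrated with two small predicates. This is a sketch under assumptions, not the code in service.py: the function names `old_upload_gate` and `new_upload_gate` are hypothetical, and the arguments stand in for `config.claude_agent_use_resume`, `use_resume`, `user_id`, and `session`.

```python
from typing import Optional


def old_upload_gate(
    resume_enabled: bool,
    use_resume: bool,
    user_id: Optional[str],
    session: Optional[object],
) -> bool:
    """Gate as described before the change: also requires use_resume."""
    return bool(resume_enabled and use_resume and user_id and session is not None)


def new_upload_gate(
    resume_enabled: bool,
    user_id: Optional[str],
    session: Optional[object],
) -> bool:
    """Gate as suggested by the review: use_resume is no longer consulted."""
    return bool(resume_enabled and user_id and session is not None)


# First turn of a new session: nothing was downloaded, so use_resume is False.
first_turn_old = old_upload_gate(True, False, "user-1", object())
first_turn_new = new_upload_gate(True, "user-1", object())
```

With the old gate a first turn never uploads, so no later turn can ever resume; the new gate allows that first upload.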
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 0a7eae5d-17bd-4702-b116-79067bd8b24f
📒 Files selected for processing (2)
autogpt_platform/backend/backend/copilot/sdk/service.py
autogpt_platform/backend/backend/copilot/sdk/transcript.py
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (6)
- GitHub Check: types
- GitHub Check: Seer Code Review
- GitHub Check: test (3.11)
- GitHub Check: test (3.12)
- GitHub Check: test (3.13)
- GitHub Check: Check PR Status
🧰 Additional context used
📓 Path-based instructions (4)
autogpt_platform/backend/**/*.py
📄 CodeRabbit inference engine (.github/copilot-instructions.md)
autogpt_platform/backend/**/*.py: Use Python 3.11 (required; managed by Poetry via pyproject.toml) for backend development
Always run 'poetry run format' (Black + isort) before linting in backend development
Always run 'poetry run lint' (ruff) after formatting in backend development
Files:
autogpt_platform/backend/backend/copilot/sdk/service.py
autogpt_platform/backend/backend/copilot/sdk/transcript.py
autogpt_platform/backend/**/*.{py,txt}
📄 CodeRabbit inference engine (autogpt_platform/backend/CLAUDE.md)
Use `poetry run` prefix for all Python commands, including testing, linting, formatting, and migrations
Files:
autogpt_platform/backend/backend/copilot/sdk/service.py
autogpt_platform/backend/backend/copilot/sdk/transcript.py
autogpt_platform/backend/backend/**/*.py
📄 CodeRabbit inference engine (autogpt_platform/backend/CLAUDE.md)
Use Prisma ORM for database operations in PostgreSQL with pgvector for embeddings
Files:
autogpt_platform/backend/backend/copilot/sdk/service.py
autogpt_platform/backend/backend/copilot/sdk/transcript.py
autogpt_platform/**/*.py
📄 CodeRabbit inference engine (AGENTS.md)
Format Python code with `poetry run format`
Files:
autogpt_platform/backend/backend/copilot/sdk/service.py
autogpt_platform/backend/backend/copilot/sdk/transcript.py
🧠 Learnings (4)
📓 Common learnings
Learnt from: Pwuts
Repo: Significant-Gravitas/AutoGPT PR: 12284
File: autogpt_platform/frontend/src/app/api/openapi.json:11897-11900
Timestamp: 2026-03-04T23:58:18.476Z
Learning: Repo: Significant-Gravitas/AutoGPT — PR `#12284`
Backend/frontend OpenAPI codegen convention: In backend/api/features/store/model.py, the StoreSubmission and StoreSubmissionAdminView models define submitted_at: datetime | None, changes_summary: str | None, and instructions: str | None with no default. This is intentional to produce “required but nullable” fields in OpenAPI (properties appear in required[] and use anyOf [type, null]). This matches Prisma’s submittedAt DateTime? and changesSummary String?. Do not flag this as a required/nullable mismatch.
Learnt from: majdyz
Repo: Significant-Gravitas/AutoGPT PR: 12356
File: autogpt_platform/backend/backend/copilot/constants.py:9-12
Timestamp: 2026-03-10T08:39:13.707Z
Learning: In Significant-Gravitas/AutoGPT PR `#12356`, the `COPILOT_SYNTHETIC_ID_PREFIX = "copilot-"` check in `create_auto_approval_record` (human_review.py) is intentional and safe. The `graph_exec_id` passed to this function comes from server-side `PendingHumanReview` DB records (not from user input); the API only accepts `node_exec_id` from users. Synthetic `copilot-*` IDs are only ever created server-side in `run_block.py`. The prefix skip avoids a DB lookup for a `AgentGraphExecution` record that legitimately does not exist for CoPilot sessions, while `user_id` scoping is enforced at the auth layer and on the resulting auto-approval record.
Learnt from: majdyz
Repo: Significant-Gravitas/AutoGPT PR: 12213
File: autogpt_platform/frontend/src/app/api/openapi.json:9983-9995
Timestamp: 2026-02-27T15:59:00.370Z
Learning: Repo: Significant-Gravitas/AutoGPT PR: 12213 — Backend/frontend OpenAPI codegen
Learning: For MCP schema models, required OpenAPI fields must have no defaults in Pydantic. Specifically, MCPToolInfo.input_schema must be required (no Field(default_factory=dict)) so openapi.json emits it in "required", ensuring generated TS types treat input_schema as non-optional.
Learnt from: majdyz
Repo: Significant-Gravitas/AutoGPT PR: 12213
File: autogpt_platform/frontend/src/app/api/openapi.json:9983-9995
Timestamp: 2026-02-27T15:59:00.370Z
Learning: Repo: Significant-Gravitas/AutoGPT PR: 12213 — OpenAPI/codegen
Learning: Ensuring a field is required in generated TS types needs two sides: (1) no default value on the Pydantic field, and (2) the OpenAPI model's "required" array must list it. For MCPToolInfo, making input_schema required in OpenAPI and removing Field(default_factory=dict) in the backend prevents optional typing drift.
Learnt from: majdyz
Repo: Significant-Gravitas/AutoGPT PR: 12213
File: autogpt_platform/frontend/src/app/api/openapi.json:10030-10037
Timestamp: 2026-03-01T07:59:02.311Z
Learning: Repo: Significant-Gravitas/AutoGPT PR: 12213 — For MCP manual token storage, backend model autogpt_platform/backend/backend/api/features/mcp/routes.py defines MCPStoreTokenRequest.token as Pydantic SecretStr with a min length constraint, which generates OpenAPI schema metadata (format: "password", writeOnly: true, minLength: 1) in autogpt_platform/frontend/src/app/api/openapi.json. Prefer SecretStr (with length constraints) for sensitive request fields so generated TS clients and docs treat them as secrets.
📚 Learning: 2026-02-26T17:02:22.448Z
Learnt from: Pwuts
Repo: Significant-Gravitas/AutoGPT PR: 12211
File: .pre-commit-config.yaml:160-179
Timestamp: 2026-02-26T17:02:22.448Z
Learning: Keep the pre-commit hook pattern broad for autogpt_platform/backend to ensure OpenAPI schema changes are captured. Do not narrow to backend/api/ alone, since the generated schema depends on Pydantic models across multiple directories (backend/data/, backend/blocks/, backend/copilot/, backend/integrations/, backend/util/). Narrowing could miss schema changes and cause frontend type desynchronization.
Applied to files:
autogpt_platform/backend/backend/copilot/sdk/service.py
autogpt_platform/backend/backend/copilot/sdk/transcript.py
📚 Learning: 2026-03-04T08:04:35.881Z
Learnt from: majdyz
Repo: Significant-Gravitas/AutoGPT PR: 12273
File: autogpt_platform/backend/backend/copilot/tools/workspace_files.py:216-220
Timestamp: 2026-03-04T08:04:35.881Z
Learning: In the AutoGPT Copilot backend, ensure that SVG images are not treated as vision image types by excluding 'image/svg+xml' from INLINEABLE_MIME_TYPES and MULTIMODAL_TYPES in tool_adapter.py; the Claude API supports PNG, JPEG, GIF, and WebP for vision. SVGs (XML text) should be handled via the text path instead, not the vision path.
Applied to files:
autogpt_platform/backend/backend/copilot/sdk/service.py
autogpt_platform/backend/backend/copilot/sdk/transcript.py
📚 Learning: 2026-03-05T15:42:08.207Z
Learnt from: ntindle
Repo: Significant-Gravitas/AutoGPT PR: 12297
File: .claude/skills/backend-check/SKILL.md:14-16
Timestamp: 2026-03-05T15:42:08.207Z
Learning: In Python files under autogpt_platform/backend (recursively), rely on poetry run format to perform formatting (Black + isort) and linting (ruff). Do not run poetry run lint as a separate step after poetry run format, since format already includes linting checks.
Applied to files:
autogpt_platform/backend/backend/copilot/sdk/service.py
autogpt_platform/backend/backend/copilot/sdk/transcript.py
…sion

Use skip_transcript_upload instead of reusing use_resume to gate transcript uploads. use_resume starts False and only becomes True after a successful download, so gating on it prevented first-turn transcripts from ever being uploaded (bootstrap paradox).
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
autogpt_platform/backend/backend/copilot/sdk/service.py (1)
1471-1497: ⚠️ Potential issue | 🟡 Minor
Fallback to `TranscriptBuilder` when CLI transcript is invalid.
At Lines 1492-1497, an invalid CLI session file causes upload to be skipped entirely. That unnecessarily disables `--resume` next turn even though builder output is available.
💡 Suggested fallback behavior
     cli_transcript = read_cli_session_file(sdk_cwd) if sdk_cwd else None
    -if cli_transcript:
    +if cli_transcript and validate_transcript(cli_transcript):
         transcript_content = cli_transcript
         logger.info(
             "%s Using CLI session file for transcript upload "
             "(%d bytes)",
             log_prefix,
             len(cli_transcript),
         )
     else:
    +    if cli_transcript:
    +        logger.warning(
    +            "%s CLI session file invalid, falling back to TranscriptBuilder",
    +            log_prefix,
    +        )
         transcript_content = transcript_builder.to_jsonl()
         logger.info(
             "%s CLI session file not available, using "
             "TranscriptBuilder (%d bytes)",
             log_prefix,
             len(transcript_content) if transcript_content else 0,
         )
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@autogpt_platform/backend/backend/copilot/sdk/service.py` around lines 1471 - 1497, The current logic reads cli_transcript via read_cli_session_file and if present uses it unconditionally even if validate_transcript(cli_transcript) would fail, causing upload to be skipped and preventing resume; change the flow in the upload branch so that after reading cli_transcript you call validate_transcript(cli_transcript) and if it fails, log that the CLI session file is invalid and fall back to using transcript_builder.to_jsonl() (use transcript_builder.entry_count for logs), then continue with the existing validate_transcript check on transcript_content before deciding to upload or skip; reference read_cli_session_file, cli_transcript, validate_transcript, and transcript_builder.to_jsonl().
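The suggested fallback can be sketched as a small pure function. This is only an illustration: `choose_transcript` and `looks_like_jsonl` are hypothetical names, and `looks_like_jsonl` is a stand-in validator (every non-empty line parses as JSON), since the rules of the real `validate_transcript` are not shown in this review.

```python
import json
from typing import Callable, Optional, Tuple


def looks_like_jsonl(content: str) -> bool:
    """Stand-in validator (assumption): every non-empty line must be valid JSON."""
    lines = [line for line in content.splitlines() if line.strip()]
    if not lines:
        return False
    try:
        for line in lines:
            json.loads(line)
    except json.JSONDecodeError:
        return False
    return True


def choose_transcript(
    cli_transcript: Optional[str],
    builder_transcript: str,
    is_valid: Callable[[str], bool],
) -> Tuple[str, str]:
    """Return (source, content), preferring a CLI transcript that validates."""
    if cli_transcript and is_valid(cli_transcript):
        return "cli", cli_transcript
    # A missing or invalid CLI file both fall back to the builder output.
    return "builder", builder_transcript
```

The chosen content would still pass through the existing validation step before upload, so an invalid builder transcript is skipped as before.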
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@autogpt_platform/backend/backend/copilot/sdk/service.py`:
- Around line 1068-1079: The finally block can re-upload the oversized
transcript if cancellation occurs while awaiting delete_transcript; to fix, set
skip_transcript_upload = True immediately before calling await
delete_transcript(...) (within the same scope where delete_transcript is called)
so that even if cancellation/exception happens during await the flag prevents
re-upload; update the try/except around delete_transcript in the same function
to set skip_transcript_upload prior to awaiting and keep the existing
logger.warning handling for exceptions (referencing delete_transcript,
skip_transcript_upload, and log_prefix).
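The ordering issue can be reproduced in isolation. The sketch below is an assumption-laden model, not the service code: `slow_delete` stands in for `delete_transcript`, and a plain dict stands in for the local variables. It cancels a task while the delete is in flight and shows that the finalizer still sees the flag.

```python
import asyncio


async def slow_delete() -> None:
    """Stand-in for delete_transcript; simply takes a long time."""
    await asyncio.sleep(10)


async def handle_prompt_too_long(state: dict) -> None:
    # Set the flag BEFORE awaiting: if the task is cancelled during the
    # delete, the finally block in run_turn still observes skip=True.
    state["skip_transcript_upload"] = True
    try:
        await slow_delete()
    except Exception:
        # Deletion failures must not mask the original stream error.
        # CancelledError derives from BaseException, so it propagates.
        pass


async def run_turn(state: dict) -> None:
    try:
        await handle_prompt_too_long(state)
    finally:
        if not state["skip_transcript_upload"]:
            state["uploaded"] = True  # would re-upload the oversized transcript


async def main() -> dict:
    state = {"skip_transcript_upload": False, "uploaded": False}
    task = asyncio.create_task(run_turn(state))
    await asyncio.sleep(0)  # let the task run until it is awaiting the delete
    task.cancel()
    try:
        await task
    except asyncio.CancelledError:
        pass
    return state


final_state = asyncio.run(main())
```

If the assignment were placed after `await slow_delete()`, the cancellation would skip it and the finalizer would upload.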
🔇 Additional comments (1)
autogpt_platform/backend/backend/copilot/sdk/service.py (1)
711-711: Good separation of resume state vs upload suppression.
This cleanly avoids the bootstrap paradox: transcript upload is no longer gated by `use_resume`, and suppression is isolated to `skip_transcript_upload`.
Also applies to: 1460-1465
Move flag assignment before the await to prevent re-upload if cancellation interrupts delete_transcript.
🧹 Nitpick comments (1)
autogpt_platform/backend/backend/copilot/sdk/service.py (1)
1492-1503: Consider logging actual entry count when CLI transcript is used.
When the CLI session file is used, `transcript_builder.entry_count` (lines 1496, 1502) doesn't reflect the actual entry count of the uploaded content. This could be misleading during debugging.
💡 Optional fix to improve logging accuracy
    +# Count entries for logging (works for both CLI and builder sources)
    +entry_count = len(transcript_content.strip().split("\n")) if transcript_content else 0
    +
     if not transcript_content:
         logger.warning(
             "%s No transcript to upload (builder empty)", log_prefix
         )
     elif not validate_transcript(transcript_content):
         logger.warning(
             "%s Transcript invalid, skipping upload (entries=%d)",
             log_prefix,
    -        transcript_builder.entry_count,
    +        entry_count,
         )
     else:
         logger.info(
             "%s Uploading complete transcript (entries=%d, bytes=%d)",
             log_prefix,
    -        transcript_builder.entry_count,
    +        entry_count,
             len(transcript_content),
         )
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@autogpt_platform/backend/backend/copilot/sdk/service.py` around lines 1492 - 1503, The logs use transcript_builder.entry_count which can be stale when transcript_content is sourced from a CLI session file; update the code around validate_transcript/transcript_content to compute the real entry count from transcript_content (e.g., parse transcript_content into entries or split into lines/JSON entries to get actual_entry_count) and use that actual_entry_count in the logger.warning and logger.info calls instead of transcript_builder.entry_count; reference symbols: transcript_content, transcript_builder.entry_count, validate_transcript, logger.warning, logger.info.
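One way to compute the count from the content itself is sketched below. The helper name `count_jsonl_entries` is hypothetical, and it assumes one entry per non-blank line. It differs slightly from the one-liner in the suggestion, which counts interior blank lines and returns 1 for whitespace-only content.

```python
from typing import Optional


def count_jsonl_entries(content: Optional[str]) -> int:
    """Count non-blank lines, assuming one JSONL entry per line."""
    if not content:
        return 0
    return sum(1 for line in content.splitlines() if line.strip())


sample = '{"a": 1}\n\n{"b": 2}\n'
entries = count_jsonl_entries(sample)                 # 2
naive = len(sample.strip().split("\n"))               # 3: the blank line is counted
```

Either variant is only used for logging, so the difference matters mainly when a session file contains stray blank lines.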
📚 Learning: 2026-03-10T08:38:30.834Z
Learnt from: majdyz
Repo: Significant-Gravitas/AutoGPT PR: 12356
File: autogpt_platform/backend/backend/copilot/tools/run_block.py:349-370
Timestamp: 2026-03-10T08:38:30.834Z
Learning: In the AutoGPT CoPilot HITL (Human-In-The-Loop) flow (`autogpt_platform/backend/backend/copilot/tools/run_block.py`), the review card presented to users sets `editable: false`, meaning reviewers cannot modify the input payload. Therefore, credentials resolved before `is_block_exec_need_review()` remain valid and do not need to be recomputed after the review step — the original `input_data` is unchanged through the review lifecycle.
Applied to files:
autogpt_platform/backend/backend/copilot/sdk/service.py
🔇 Additional comments (4)
autogpt_platform/backend/backend/copilot/sdk/service.py (4)
77-86: LGTM! The new imports for `delete_transcript` and `read_cli_session_file` are correctly added and align with the transcript module's public API as shown in the relevant code snippets.
711-711: LGTM! The `skip_transcript_upload` flag is correctly initialized. This addresses the previous review feedback about using a dedicated flag rather than reusing `use_resume` for controlling transcript uploads.
1051-1079: LGTM! The error handling for "prompt is too long" is well-implemented:
- `skip_transcript_upload = True` is correctly set before awaiting `delete_transcript`, addressing the previous review feedback about preventing re-upload on cancellation.
- The detection logic using `"prompt is too long" in err_str.lower()` is reasonable for catching the relevant API error.
- The `delete_transcript` call is wrapped in try/except to ensure deletion failures don't mask the original stream error.
1460-1486: LGTM: CLI session file preference correctly implemented. The upload condition change addresses the previous review feedback: new sessions and sessions without stored transcripts will now upload correctly since `use_resume` is no longer in the gate condition. The preference for CLI session file over TranscriptBuilder output is correct per the PR objectives: the CLI file reflects mid-stream compaction.
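The ordering praised in the 1051-1079 comment can be illustrated with a minimal sketch. This is not the code in `service.py`: `StreamState` and `on_stream_error` are hypothetical names, and `delete_transcript` is passed in as a plain async callable so the example stays self-contained. The point it shows is that the skip flag is set before the first `await`, so a cancellation or failure during deletion cannot lead to a re-upload.

```python
import asyncio
import logging
from dataclasses import dataclass
from typing import Awaitable, Callable

logger = logging.getLogger(__name__)


@dataclass
class StreamState:
    """Hypothetical stand-in for the flags the stream loop tracks."""

    use_resume: bool
    skip_transcript_upload: bool = False


async def on_stream_error(
    state: StreamState,
    err: BaseException,
    delete_transcript: Callable[[], Awaitable[None]],
) -> None:
    """Drop the stored transcript when a resumed turn overflows the prompt."""
    if not state.use_resume or "prompt is too long" not in str(err).lower():
        return
    # Set the flag BEFORE awaiting: if the task is cancelled inside
    # delete_transcript(), the finalizer still sees skip_transcript_upload.
    state.skip_transcript_upload = True
    try:
        await delete_transcript()
    except Exception:
        # A failed deletion must not mask the original stream error.
        logger.warning("Failed to delete oversized transcript", exc_info=True)


if __name__ == "__main__":
    async def _noop_delete() -> None:
        return None

    demo = StreamState(use_resume=True)
    asyncio.run(on_stream_error(demo, RuntimeError("Prompt is too long"), _noop_delete))
    print(demo.skip_transcript_upload)
```

Because `asyncio.CancelledError` derives from `BaseException`, the `except Exception` clause does not swallow cancellation; the flag assignment above it is what protects the upload path in that case.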
…l protection
- Replace `import glob` with `pathlib.Path.glob()` in `read_cli_session_file`
- Add symlink path traversal validation on glob results using `is_relative_to`
- Add unit tests for `read_cli_session_file` and `_cli_project_dir`
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@autogpt_platform/backend/backend/copilot/sdk/transcript.py`:
- Around line 171-194: The scan currently can raise from p.resolve() or
os.path.getmtime causing read_cli_session_file to propagate errors; instead
iterate Path(project_dir).glob("*.jsonl") and for each candidate call
p.resolve() and p.resolve().is_relative_to(resolved_base) inside a try/except
that logs and skips problematic entries, collect safe paths and their mtimes
with os.path.getmtime guarded in try/except as well (e.g., build a list of
(path, mtime) only when both resolve and getmtime succeed), then pick the
max-safe entry as session_file; also ensure the file-open/read OSError already
caught returns None so no exception escapes. Use the existing symbols:
read_cli_session_file, resolved_base, jsonl_files/session_file, Path.resolve,
and os.path.getmtime when implementing these guarded checks.
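The guarded scan requested above could look roughly like the following. It is a sketch under assumptions, not the actual `read_cli_session_file`: the function names `newest_safe_jsonl` and `read_latest_session` are hypothetical, and it uses `Path.stat()` rather than `os.path.getmtime` for the modification time. Every per-file step that can raise is wrapped so one bad entry is skipped instead of aborting the whole read.

```python
import logging
from pathlib import Path
from typing import List, Optional, Tuple

logger = logging.getLogger(__name__)


def newest_safe_jsonl(project_dir: str) -> Optional[Path]:
    """Return the most recently modified *.jsonl that stays inside project_dir."""
    try:
        resolved_base = Path(project_dir).resolve()
    except (OSError, RuntimeError):
        return None

    candidates: List[Tuple[float, Path]] = []
    for p in Path(project_dir).glob("*.jsonl"):
        try:
            resolved = p.resolve()
            # Reject symlinks whose target escapes the project directory.
            if not resolved.is_relative_to(resolved_base):
                logger.warning("Skipping %s: resolves outside project dir", p)
                continue
            candidates.append((resolved.stat().st_mtime, resolved))
        except (OSError, RuntimeError) as exc:
            # Broken symlink, permission error, symlink loop, etc.
            logger.debug("Skipping %s: %s", p, exc)
            continue

    if not candidates:
        return None
    return max(candidates, key=lambda item: item[0])[1]


def read_latest_session(project_dir: str) -> Optional[str]:
    """Read the newest safe session file, or None if nothing is readable."""
    session_file = newest_safe_jsonl(project_dir)
    if session_file is None:
        return None
    try:
        return session_file.read_text(encoding="utf-8")
    except OSError:
        return None
```

Returning `None` on every failure path keeps the caller's fallback to `TranscriptBuilder` output reachable, which is the behavior the comment asks for.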
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 455b26ad-7246-444f-8e70-a170ef4661b2
📒 Files selected for processing (2)
autogpt_platform/backend/backend/copilot/sdk/transcript.py
autogpt_platform/backend/backend/copilot/sdk/transcript_test.py
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (8)
- GitHub Check: types
- GitHub Check: Seer Code Review
- GitHub Check: test (3.11)
- GitHub Check: test (3.13)
- GitHub Check: test (3.12)
- GitHub Check: Analyze (typescript)
- GitHub Check: Analyze (python)
- GitHub Check: Check PR Status
🧰 Additional context used
📓 Path-based instructions (6)
autogpt_platform/backend/**/*.py
📄 CodeRabbit inference engine (.github/copilot-instructions.md)
autogpt_platform/backend/**/*.py: Use Python 3.11 (required; managed by Poetry via pyproject.toml) for backend development
Always run 'poetry run format' (Black + isort) before linting in backend development
Always run 'poetry run lint' (ruff) after formatting in backend development
Files:
autogpt_platform/backend/backend/copilot/sdk/transcript_test.py
autogpt_platform/backend/backend/copilot/sdk/transcript.py
autogpt_platform/backend/**/*.{py,txt}
📄 CodeRabbit inference engine (autogpt_platform/backend/CLAUDE.md)
Use `poetry run` prefix for all Python commands, including testing, linting, formatting, and migrations
Files:
autogpt_platform/backend/backend/copilot/sdk/transcript_test.py
autogpt_platform/backend/backend/copilot/sdk/transcript.py
autogpt_platform/backend/**/*_test.py
📄 CodeRabbit inference engine (autogpt_platform/backend/CLAUDE.md)
autogpt_platform/backend/**/*_test.py: Always review snapshot changes with `git diff` before committing when updating snapshots with `poetry run pytest --snapshot-update`
Use pytest with snapshot testing for API responses in test files
Colocate test files with source files using the `*_test.py` naming convention
Files:
autogpt_platform/backend/backend/copilot/sdk/transcript_test.py
autogpt_platform/backend/backend/**/*.py
📄 CodeRabbit inference engine (autogpt_platform/backend/CLAUDE.md)
Use Prisma ORM for database operations in PostgreSQL with pgvector for embeddings
Files:
autogpt_platform/backend/backend/copilot/sdk/transcript_test.py
autogpt_platform/backend/backend/copilot/sdk/transcript.py
autogpt_platform/**/*.py
📄 CodeRabbit inference engine (AGENTS.md)
Format Python code with `poetry run format`
Files:
autogpt_platform/backend/backend/copilot/sdk/transcript_test.py
autogpt_platform/backend/backend/copilot/sdk/transcript.py
autogpt_platform/backend/**/*test*.py
📄 CodeRabbit inference engine (AGENTS.md)
Run `poetry run test` for backend testing (runs pytest with docker based postgres + prisma)
Files:
autogpt_platform/backend/backend/copilot/sdk/transcript_test.py
🧠 Learnings (7)
📓 Common learnings
Learnt from: Pwuts
Repo: Significant-Gravitas/AutoGPT PR: 12284
File: autogpt_platform/frontend/src/app/api/openapi.json:11897-11900
Timestamp: 2026-03-04T23:58:18.476Z
Learning: Repo: Significant-Gravitas/AutoGPT — PR `#12284`
Backend/frontend OpenAPI codegen convention: In backend/api/features/store/model.py, the StoreSubmission and StoreSubmissionAdminView models define submitted_at: datetime | None, changes_summary: str | None, and instructions: str | None with no default. This is intentional to produce “required but nullable” fields in OpenAPI (properties appear in required[] and use anyOf [type, null]). This matches Prisma’s submittedAt DateTime? and changesSummary String?. Do not flag this as a required/nullable mismatch.
Learnt from: majdyz
Repo: Significant-Gravitas/AutoGPT PR: 12213
File: autogpt_platform/frontend/src/app/api/openapi.json:9983-9995
Timestamp: 2026-02-27T15:59:00.370Z
Learning: Repo: Significant-Gravitas/AutoGPT PR: 12213 — Backend/frontend OpenAPI codegen
Learning: For MCP schema models, required OpenAPI fields must have no defaults in Pydantic. Specifically, MCPToolInfo.input_schema must be required (no Field(default_factory=dict)) so openapi.json emits it in "required", ensuring generated TS types treat input_schema as non-optional.
Learnt from: majdyz
Repo: Significant-Gravitas/AutoGPT PR: 12356
File: autogpt_platform/backend/backend/copilot/constants.py:9-12
Timestamp: 2026-03-10T08:39:13.707Z
Learning: In Significant-Gravitas/AutoGPT PR `#12356`, the `COPILOT_SYNTHETIC_ID_PREFIX = "copilot-"` check in `create_auto_approval_record` (human_review.py) is intentional and safe. The `graph_exec_id` passed to this function comes from server-side `PendingHumanReview` DB records (not from user input); the API only accepts `node_exec_id` from users. Synthetic `copilot-*` IDs are only ever created server-side in `run_block.py`. The prefix skip avoids a DB lookup for a `AgentGraphExecution` record that legitimately does not exist for CoPilot sessions, while `user_id` scoping is enforced at the auth layer and on the resulting auto-approval record.
Learnt from: majdyz
Repo: Significant-Gravitas/AutoGPT PR: 12213
File: autogpt_platform/frontend/src/app/api/openapi.json:9983-9995
Timestamp: 2026-02-27T15:59:00.370Z
Learning: Repo: Significant-Gravitas/AutoGPT PR: 12213 — OpenAPI/codegen
Learning: Ensuring a field is required in generated TS types needs two sides: (1) no default value on the Pydantic field, and (2) the OpenAPI model's "required" array must list it. For MCPToolInfo, making input_schema required in OpenAPI and removing Field(default_factory=dict) in the backend prevents optional typing drift.
📚 Learning: 2026-02-04T16:49:42.490Z
Learnt from: CR
Repo: Significant-Gravitas/AutoGPT PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2026-02-04T16:49:42.490Z
Learning: Applies to autogpt_platform/backend/**/test/**/*.py : Use snapshot testing with '--snapshot-update' flag in backend tests when output changes; always review with 'git diff'
Applied to files:
autogpt_platform/backend/backend/copilot/sdk/transcript_test.py
📚 Learning: 2026-02-04T16:50:20.508Z
Learnt from: CR
Repo: Significant-Gravitas/AutoGPT PR: 0
File: autogpt_platform/backend/CLAUDE.md:0-0
Timestamp: 2026-02-04T16:50:20.508Z
Learning: Applies to autogpt_platform/backend/**/*_test.py : Always review snapshot changes with `git diff` before committing when updating snapshots with `poetry run pytest --snapshot-update`
Applied to files:
autogpt_platform/backend/backend/copilot/sdk/transcript_test.py
📚 Learning: 2026-02-04T16:50:20.508Z
Learnt from: CR
Repo: Significant-Gravitas/AutoGPT PR: 0
File: autogpt_platform/backend/CLAUDE.md:0-0
Timestamp: 2026-02-04T16:50:20.508Z
Learning: Applies to autogpt_platform/backend/**/*_test.py : Colocate test files with source files using the `*_test.py` naming convention
Applied to files:
autogpt_platform/backend/backend/copilot/sdk/transcript_test.py
📚 Learning: 2026-02-26T17:02:22.448Z
Learnt from: Pwuts
Repo: Significant-Gravitas/AutoGPT PR: 12211
File: .pre-commit-config.yaml:160-179
Timestamp: 2026-02-26T17:02:22.448Z
Learning: Keep the pre-commit hook pattern broad for autogpt_platform/backend to ensure OpenAPI schema changes are captured. Do not narrow to backend/api/ alone, since the generated schema depends on Pydantic models across multiple directories (backend/data/, backend/blocks/, backend/copilot/, backend/integrations/, backend/util/). Narrowing could miss schema changes and cause frontend type desynchronization.
Applied to files:
autogpt_platform/backend/backend/copilot/sdk/transcript_test.py
autogpt_platform/backend/backend/copilot/sdk/transcript.py
📚 Learning: 2026-03-04T08:04:35.881Z
Learnt from: majdyz
Repo: Significant-Gravitas/AutoGPT PR: 12273
File: autogpt_platform/backend/backend/copilot/tools/workspace_files.py:216-220
Timestamp: 2026-03-04T08:04:35.881Z
Learning: In the AutoGPT Copilot backend, ensure that SVG images are not treated as vision image types by excluding 'image/svg+xml' from INLINEABLE_MIME_TYPES and MULTIMODAL_TYPES in tool_adapter.py; the Claude API supports PNG, JPEG, GIF, and WebP for vision. SVGs (XML text) should be handled via the text path instead, not the vision path.
Applied to files:
autogpt_platform/backend/backend/copilot/sdk/transcript_test.py
autogpt_platform/backend/backend/copilot/sdk/transcript.py
📚 Learning: 2026-03-05T15:42:08.207Z
Learnt from: ntindle
Repo: Significant-Gravitas/AutoGPT PR: 12297
File: .claude/skills/backend-check/SKILL.md:14-16
Timestamp: 2026-03-05T15:42:08.207Z
Learning: In Python files under autogpt_platform/backend (recursively), rely on poetry run format to perform formatting (Black + isort) and linting (ruff). Do not run poetry run lint as a separate step after poetry run format, since format already includes linting checks.
Applied to files:
autogpt_platform/backend/backend/copilot/sdk/transcript_test.py
autogpt_platform/backend/backend/copilot/sdk/transcript.py
🔇 Additional comments (2)
autogpt_platform/backend/backend/copilot/sdk/transcript_test.py (1)
292-371: Good coverage for the new CLI transcript safety paths. These tests validate the key security/correctness paths (`None` on missing files, happy-path read, symlink escape rejection, and `_cli_project_dir` traversal guard) and match the implementation intent.
autogpt_platform/backend/backend/copilot/sdk/transcript.py (1)
141-155: Nice hardening on project-dir containment and cleanup reuse. The `_cli_project_dir` escape check plus reuse inside `cleanup_cli_project_dir` is a clean security-focused refactor. Also applies to: 198-209
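For readers unfamiliar with this kind of containment check, here is a minimal sketch of the idea. It is not the PR's `_cli_project_dir`: the helper name `contained_dir` and its `(base, name)` signature are assumptions made for illustration, and how the real helper derives the directory name from the working directory is not shown in this thread.

```python
from pathlib import Path
from typing import Optional


def contained_dir(base: str, name: str) -> Optional[Path]:
    """Join name onto base and return it only if it stays strictly inside base."""
    resolved_base = Path(base).resolve()
    candidate = (resolved_base / name).resolve()
    # Reject the base itself as well as anything that resolves outside it,
    # e.g. "../other" or an absolute path, which replaces the base when joined.
    if candidate == resolved_base or not candidate.is_relative_to(resolved_base):
        return None
    return candidate
```

Sharing one such helper between the reader and the cleanup routine means both code paths enforce the same boundary, which is the reuse the comment above points to.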
…n_file Wrap p.resolve() and stat() calls in try/except to prevent unhandled OSError/RuntimeError from propagating and silently dropping transcript uploads when the caller falls back to TranscriptBuilder.
Addressed all 4 review items in 8f0f6ce:
Also: preserved |
majdyz
left a comment
🤖 Critical Review — PR #12401
Thorough line-by-line review of the transcript compaction sync logic. 12 findings posted as inline comments.
Summary of HIGH severity issues:
- `compaction_just_ended` property with side effects: a mutating `@property` creates a TOCTOU race and fragile ordering dependencies
- First-compaction-summary bug: `read_compacted_entries` finds the FIRST `isCompactSummary` but should find the LAST (multi-compaction scenario)
- Race between `emit_end_if_ready` and `compaction_just_ended`: the `asyncio.sleep(0)` yield creates a window for state corruption
Not inline (lines not in diff):
- LOW (`transcript.py:87`): `strip_progress_entries` strips `type: "summary"` entries but does NOT have the `isCompactSummary` guard. After compaction + upload, the compacted summary is stripped and the next turn's `--resume` gets an incomplete transcript. Add: `if entry.get("type", "") in STRIPPABLE_TYPES and not entry.get("isCompactSummary")`
- LOW (`compaction.py:225`): `emit_start_if_ready` checks `not self._done` but `compaction_just_ended` resets `_done` as a side effect. Reading the property before `emit_start_if_ready` inadvertently unblocks start emission, which is an implicit ordering dependency.
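The first LOW finding can be made concrete with a small sketch of the suggested guard. This is not the real `strip_progress_entries`: the function name below is hypothetical, the contents of `STRIPPABLE_TYPES` are assumed (the thread only confirms that `"summary"` is a member), and any additional bookkeeping the real function performs is omitted.

```python
import json

# Assumed contents; only "summary" is confirmed by the review thread.
STRIPPABLE_TYPES = frozenset({"progress", "summary"})


def strip_keeping_compact_summary(content: str) -> str:
    """Drop strippable JSONL entries, but keep compaction summaries."""
    kept = []
    for line in content.splitlines():
        if not line.strip():
            continue
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            kept.append(line)  # leave unparseable lines untouched
            continue
        if (
            isinstance(entry, dict)
            and entry.get("type", "") in STRIPPABLE_TYPES
            and not entry.get("isCompactSummary")
        ):
            continue  # ordinary progress/summary noise
        kept.append(line)
    return "\n".join(kept) + ("\n" if kept else "")
```

Without the `isCompactSummary` check, the one entry that carries the compacted context would be removed along with ordinary summaries, leaving the next `--resume` turn with only the post-compaction tail.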
majdyz
left a comment
🤖 Critical Review — PR #12401
Thorough review of the transcript compaction sync logic. 10 findings below.
Not inline (lines not in diff):
- LOW (`transcript.py:87`): `strip_progress_entries` strips `type: "summary"` but does NOT guard `isCompactSummary`. After compaction + upload, the compacted summary is stripped and the next turn's `--resume` gets an incomplete transcript.
- LOW (`compaction.py:225`): `emit_start_if_ready` checks `not self._done` but `compaction_just_ended` resets `_done` as a side effect. Reading the property before `emit_start_if_ready` inadvertently unblocks start emission.
… last-summary, dedup helpers
- Replace side-effect `@property` with `CompactionResult` dataclass for TOCTOU safety
- Use build-then-swap in `replace_entries` to prevent data loss on corrupt input
- Fix `read_compacted_entries` to use LAST `isCompactSummary` (not first)
- Guard `isCompactSummary` in `strip_progress_entries` and `TranscriptBuilder`
- Extract `_projects_base()`, `_build_path_from_parts()`, `_build_meta_storage_path()`
- Extract `TranscriptBuilder._parse_entry()` static method
- Sanitize `transcript_path` in `pre_compact_hook` before logging
- Update `compaction_test.py` and `transcript_test.py` for new behaviors
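Two of the fixes in this commit, selecting the LAST `isCompactSummary` entry and build-then-swap replacement, can be sketched as follows. The names `compacted_tail` and `SketchBuilder` are hypothetical and the validation rule (an entry must be a dict with a `type` key) is an assumption; the actual `read_compacted_entries` and `TranscriptBuilder.replace_entries` may differ in what they accept and how they report errors.

```python
import json
from typing import Any, Dict, List, Optional

Entry = Dict[str, Any]


def compacted_tail(jsonl: str) -> Optional[List[Entry]]:
    """Return the last compaction summary plus everything after it."""
    entries: List[Entry] = []
    last_summary: Optional[int] = None
    for line in jsonl.splitlines():
        if not line.strip():
            continue
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue
        if not isinstance(entry, dict):
            continue
        if entry.get("isCompactSummary"):
            # Keep overwriting: after several compactions only the most
            # recent summary reflects the CLI's active context.
            last_summary = len(entries)
        entries.append(entry)
    if last_summary is None:
        return None
    return entries[last_summary:]


class SketchBuilder:
    """Hypothetical, minimal stand-in for a transcript builder."""

    def __init__(self) -> None:
        self.entries: List[Entry] = []

    def replace_entries(self, raw: List[Any]) -> None:
        # Build the new list fully first ...
        new_entries: List[Entry] = []
        for item in raw:
            if not isinstance(item, dict) or "type" not in item:
                raise ValueError("corrupt transcript entry")
            new_entries.append(dict(item))
        # ... and swap only once everything validated, so a failure
        # above leaves the previous entries intact.
        self.entries = new_entries
```

The design choice in both halves is the same: never discard known-good state until its replacement is known to be complete.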
Addressed all remaining review items in 1023134: Architecture:
Bug fixes: Dedup / code quality: Tests: |
Add isCompactSummary field to TranscriptEntry model so compaction summaries survive the export→load_previous roundtrip. Without this, exported summaries with type="summary" were stripped on reload since "summary" is in STRIPPABLE_TYPES. Also add integration tests simulating the full compaction flow (load → append → compact → replace → export → reload).
…stream compaction When compaction ends, replace_entries loads the CLI session file which already contains the current message. Skip append_assistant and append_tool_result for that iteration to avoid duplicates that could cause 'prompt is too long' errors on subsequent turns.
Exercises the full service.py compaction flow end-to-end: TranscriptBuilder load → CompactionTracker state machine → read_compacted_entries → replace_entries → export → roundtrip.
…ompaction (#12401)
## Summary
- **Root cause**: `TranscriptBuilder` accumulates all raw SDK stream messages including pre-compaction content. When the CLI compacts mid-stream, the uploaded transcript was still uncompacted, causing "Prompt is too long" errors on the next `--resume` turn.
- **Fix**: Detect mid-stream compaction via the `PreCompact` hook, read the CLI's session file to get the compacted entries (summary + post-compaction messages), and call `TranscriptBuilder.replace_entries()` to sync it with the CLI's active context. This ensures the uploaded transcript always matches what the CLI sees.
- **Key changes**:
  - `CompactionTracker`: stores `transcript_path` from `PreCompact` hook, one-shot `compaction_just_ended` flag that correctly resets for multiple compactions
  - `read_compacted_entries()`: reads CLI session JSONL, finds `isCompactSummary: true` entry, returns it + all entries after. Includes path validation against the CLI projects directory.
  - `TranscriptBuilder.replace_entries()`: clears and replaces all entries with compacted ones, preserving `isCompactSummary` entries (which have `type: "summary"` that would normally be stripped)
  - `load_previous()`: also preserves `isCompactSummary` entries when loading a previously compacted transcript
  - Service stream loop: after compaction ends, reads compacted entries and syncs TranscriptBuilder
## Test plan
- [x] 69 tests pass across `compaction_test.py` and `transcript_test.py`
- [x] Tests cover: one-shot flag behavior, multiple compactions within a query, transcript path storage, path traversal rejection, `read_compacted_entries` (7 tests), `replace_entries` (4 tests), `load_previous` with compacted content (2 tests)
- [x] Pre-commit hooks pass (lint, format, typecheck)
- [ ] Manual test: trigger compaction in a multi-turn session and verify the uploaded transcript reflects compaction