fix(backend/copilot): sync TranscriptBuilder with CLI on mid-stream compaction by majdyz · Pull Request #12401 · Significant-Gravitas/AutoGPT

majdyz · 2026-03-13T09:53:44Z

Summary

Root cause: TranscriptBuilder accumulates all raw SDK stream messages, including pre-compaction content. When the CLI compacts mid-stream, the uploaded transcript still contains the uncompacted history, which causes "Prompt is too long" errors on the next --resume turn.
Fix: Detect mid-stream compaction via the PreCompact hook, read the CLI's session file to get the compacted entries (summary + post-compaction messages), and call TranscriptBuilder.replace_entries() to sync it with the CLI's active context. This ensures the uploaded transcript always matches what the CLI sees.
Key changes:
- CompactionTracker: stores transcript_path from the PreCompact hook and exposes a one-shot compaction_just_ended flag that resets correctly across multiple compactions
- read_compacted_entries(): reads the CLI session JSONL, finds the entry with isCompactSummary: true, and returns it plus all entries after it. Includes path validation against the CLI projects directory.
- TranscriptBuilder.replace_entries(): clears all entries and replaces them with the compacted ones, preserving isCompactSummary entries (these have type: "summary", which would normally be stripped)
- load_previous(): also preserves isCompactSummary entries when loading a previously compacted transcript
- Service stream loop: after compaction ends, reads compacted entries and syncs TranscriptBuilder
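
To make the read step concrete, here is a minimal sketch of how the compacted tail of a session JSONL file could be extracted. It is not the PR's implementation: the function name `read_compacted_entries_sketch` is hypothetical, path validation is left out, and two behaviors are assumptions (malformed lines are skipped, and when several summaries exist the most recent one is used).

```python
import json
from pathlib import Path


def read_compacted_entries_sketch(session_path: Path) -> list[dict]:
    """Return the last isCompactSummary entry plus everything after it.

    Returns an empty list when the file is missing or holds no summary.
    """
    if not session_path.is_file():
        return []

    entries: list[dict] = []
    for line in session_path.read_text(encoding="utf-8").splitlines():
        line = line.strip()
        if not line:
            continue
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue  # assumption: skip malformed lines instead of failing
        if isinstance(entry, dict):
            entries.append(entry)

    # Assumption: with several compactions, the most recent summary wins.
    for index in range(len(entries) - 1, -1, -1):
        if entries[index].get("isCompactSummary") is True:
            return entries[index:]
    return []
```

An empty result would signal the caller to leave the builder's entries untouched rather than replace them with nothing.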

Test plan

69 tests pass across compaction_test.py and transcript_test.py
Tests cover: one-shot flag behavior, multiple compactions within a query, transcript path storage, path traversal rejection, read_compacted_entries (7 tests), replace_entries (4 tests), load_previous with compacted content (2 tests)
Pre-commit hooks pass (lint, format, typecheck)
Manual test: trigger compaction in a multi-turn session and verify the uploaded transcript reflects compaction
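
The one-shot flag behavior listed in the test plan can be illustrated with a small stand-in class. This is only a sketch: `CompactionTrackerSketch` and its method names are hypothetical, and the real CompactionTracker may be structured differently. The idea shown is that reading the flag also clears it, so a second compaction in the same query raises it again.

```python
from typing import Optional


class CompactionTrackerSketch:
    """Hypothetical stand-in; method names are illustrative only."""

    def __init__(self) -> None:
        self.transcript_path: Optional[str] = None
        self._in_progress = False
        self._just_ended = False

    def on_pre_compact(self, transcript_path: str) -> None:
        # Remember where the CLI session file lives for the later read.
        self.transcript_path = transcript_path
        self._in_progress = True

    def on_compaction_end(self) -> None:
        if self._in_progress:
            self._in_progress = False
            self._just_ended = True

    def consume_just_ended(self) -> bool:
        # One-shot: report the flag once, then reset it.
        ended, self._just_ended = self._just_ended, False
        return ended
```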

…reserve compaction

The TranscriptBuilder accumulates all raw SDK stream messages including pre-compaction content. When the CLI compacts mid-stream, the uploaded transcript still contains the full uncompacted messages, causing "Prompt is too long" errors on the next --resume turn.

Fix:
- Read the CLI's own session file (~/.claude/projects/<cwd>/*.jsonl) which reflects mid-stream compaction, instead of TranscriptBuilder
- Extract _cli_project_dir() helper, refactor cleanup_cli_project_dir
- On "Prompt is too long" error with --resume, delete the oversized transcript so the next turn falls back to compression-based context

github-actions · 2026-03-13T09:54:21Z

🔍 PR Overlap Detection

This check compares your PR against all other open PRs targeting the same branch to detect potential merge conflicts early.

🔴 Merge Conflicts Detected

The following PRs have been tested and will have merge conflicts if merged after this PR. Consider coordinating with the authors.

fix(backend/copilot): fix tool-result file read failures across turns #12399 (majdyz · updated 3h ago)
- 📁 autogpt_platform/backend/backend/copilot/sdk/
  - transcript.py (1 conflict, ~170 lines)
fix(backend): split CamelCase block names and filter disabled blocks before batch slicing #12400 (majdyz · updated 3h ago)
- 📁 autogpt_platform/backend/backend/copilot/sdk/
  - service.py (1 conflict, ~37 lines)
  - transcript.py (1 conflict, ~15 lines)
feat(platform): CoPilot credit charging, token rate limiting, and usage UI #12385 (majdyz · updated 2h ago)
- 📁 autogpt_platform/backend/backend/copilot/sdk/
  - service.py (1 conflict, ~37 lines)

🟢 Low Risk — File Overlap Only

These PRs touch the same files but different sections

feat(platform): add nightly copilot automation flow #12407 (Swiftyos · updated 15m ago)
- Shared files: autogpt_platform/backend/backend/copilot/sdk/service.py

Summary: 3 conflict(s), 0 medium risk, 1 low risk (out of 4 PRs with file overlap)

Auto-generated on push. Ignores: openapi.json, lock files.

coderabbitai · 2026-03-13T09:54:53Z

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

@coderabbitai resume to resume automatic reviews.
@coderabbitai review to trigger a single review.

Walkthrough

Service finalization now prefers the CLI session file for transcript uploads, adds a skip flag to avoid duplicate uploads, and deletes oversized transcripts on resume-mode "prompt is too long" streaming errors. Transcript utilities add safe CLI project path resolution and a reader for the latest CLI session file.

Changes

Cohort / File(s)	Summary
Service: error handling & upload flow `autogpt_platform/backend/backend/copilot/sdk/service.py`	Adds `skip_transcript_upload` state; on streaming errors indicating "prompt is too long" while resume is active, calls `delete_transcript(user_id, session_id)`, disables resume and sets `skip_transcript_upload` to avoid re-upload. Finalization prefers `read_cli_session_file(sdk_cwd)` and logs chosen source and size.
Transcript utilities & path safety `autogpt_platform/backend/backend/copilot/sdk/transcript.py`	Adds `_cli_project_dir(sdk_cwd) -> str
Tests `autogpt_platform/backend/backend/copilot/sdk/transcript_test.py`	Adds tests for `_cli_project_dir` and `read_cli_session_file` covering missing files, single `.jsonl`, and symlink/escape cases; integrates these into existing test suite.
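
A rough sketch of the "newest session file, with path safety" idea described in the table follows. It is an illustration under assumptions, not the code in transcript.py: `read_newest_session_file` and `_is_within` are hypothetical names, and the caller is assumed to already know the project directory and the projects root (how the CLI derives the directory name from the working directory is not covered here).

```python
import os
from pathlib import Path
from typing import Optional


def _is_within(parent: str, child: str) -> bool:
    """True when child is parent itself or lives underneath it."""
    try:
        return os.path.commonpath([parent, child]) == parent
    except ValueError:  # e.g. paths on different drives
        return False


def read_newest_session_file(project_dir: str, projects_root: str) -> Optional[str]:
    root = os.path.realpath(projects_root)
    resolved_dir = os.path.realpath(project_dir)
    # Reject directories that resolve outside the root (symlink or ".." escape).
    if not _is_within(root, resolved_dir) or not os.path.isdir(resolved_dir):
        return None

    candidates = []
    for path in Path(resolved_dir).glob("*.jsonl"):
        real = os.path.realpath(path)
        # A symlinked file may also point outside the project directory.
        if _is_within(resolved_dir, real) and os.path.isfile(real):
            candidates.append(real)
    if not candidates:
        return None

    # Pick the most recently modified session file.
    newest = max(candidates, key=os.path.getmtime)
    return Path(newest).read_text(encoding="utf-8")
```

Resolving with `os.path.realpath` before comparing is what lets the check catch symlinks as well as `..` segments.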

Sequence Diagram(s)

sequenceDiagram
  participant Service as Service
  participant Transcript as TranscriptModule
  participant FS as CLI_FileSystem
  participant Storage as Upload/Storage

  Service->>Transcript: finalize(sdk_cwd) → read_cli_session_file(sdk_cwd)
  alt CLI session file found
    Transcript->>FS: locate newest *.jsonl within resolved project dir
    FS-->>Transcript: return file content
    Transcript-->>Service: session content
    Service->>Storage: upload(session content)
  else fallback
    Service->>Service: use TranscriptBuilder output
    Service->>Storage: upload(builder content)
  end

  Service->>Storage: streaming process
  alt error == "prompt is too long" and resume active
    Service->>Transcript: delete_transcript(user_id, session_id)
    Service-->>Service: set skip_transcript_upload = true
  end

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~22 minutes

Possibly related PRs

fix(copilot): always upload transcript instead of size-based skip #12303 — overlaps on copilot transcript upload flow and transcript utilities (read/upload/delete) changes.

Suggested reviewers

Swiftyos
Pwuts

Poem

🐰 I sniffed the CLI files, fresh and neat,
Found the newest jsonl for a tidy beat.
If prompts grow too long and streams go astray,
I nudge the old transcript and skip the replay.
Hoppity-hop — uploads sorted, logs okay! ✨

🚥 Pre-merge checks | ✅ 3

✅ Passed checks (3 passed)

Check name	Status	Explanation
Docstring Coverage	✅ Passed	Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
Title check	✅ Passed	The title directly describes the main change: syncing TranscriptBuilder with CLI on mid-stream compaction, which is the primary fix and objective of this PR.
Description check	✅ Passed	The description is well-related to the changeset, explaining the root cause of transcript synchronization issues, the implemented fixes, and key changes made to address the problem.

coderabbitai

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@autogpt_platform/backend/backend/copilot/sdk/service.py`:
- Around line 1459-1464: The code currently gates transcript uploads on both
config.claude_agent_use_resume and the local variable use_resume, which prevents
new sessions or ones without a stored transcript from ever uploading fresh
transcripts; change the condition so uploads are allowed when
config.claude_agent_use_resume is true (and user_id and session are present)
regardless of use_resume. Specifically, modify the if that checks
(config.claude_agent_use_resume and use_resume and user_id and session is not
None) to remove use_resume from that conjunction (e.g.,
(config.claude_agent_use_resume and user_id and session is not None)); keep
use_resume only for the prompt-too-long suppression logic elsewhere so that
resume-related suppression remains separate from upload permission.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 0a7eae5d-17bd-4702-b116-79067bd8b24f

📥 Commits

Reviewing files that changed from the base of the PR and between ba301a3 and 2adeb63.

📒 Files selected for processing (2)

autogpt_platform/backend/backend/copilot/sdk/service.py
autogpt_platform/backend/backend/copilot/sdk/transcript.py

📜 Review details

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (6)

GitHub Check: types
GitHub Check: Seer Code Review
GitHub Check: test (3.11)
GitHub Check: test (3.12)
GitHub Check: test (3.13)
GitHub Check: Check PR Status

🧰 Additional context used

📓 Path-based instructions (4)

autogpt_platform/backend/**/*.py

📄 CodeRabbit inference engine (.github/copilot-instructions.md)

autogpt_platform/backend/**/*.py: Use Python 3.11 (required; managed by Poetry via pyproject.toml) for backend development
Always run 'poetry run format' (Black + isort) before linting in backend development
Always run 'poetry run lint' (ruff) after formatting in backend development

Files:

autogpt_platform/backend/backend/copilot/sdk/service.py
autogpt_platform/backend/backend/copilot/sdk/transcript.py

autogpt_platform/backend/**/*.{py,txt}

📄 CodeRabbit inference engine (autogpt_platform/backend/CLAUDE.md)

Use poetry run prefix for all Python commands, including testing, linting, formatting, and migrations

Files:

autogpt_platform/backend/backend/copilot/sdk/service.py
autogpt_platform/backend/backend/copilot/sdk/transcript.py

autogpt_platform/backend/backend/**/*.py

📄 CodeRabbit inference engine (autogpt_platform/backend/CLAUDE.md)

Use Prisma ORM for database operations in PostgreSQL with pgvector for embeddings

Files:

autogpt_platform/backend/backend/copilot/sdk/service.py
autogpt_platform/backend/backend/copilot/sdk/transcript.py

autogpt_platform/**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

Format Python code with poetry run format

Files:

autogpt_platform/backend/backend/copilot/sdk/service.py
autogpt_platform/backend/backend/copilot/sdk/transcript.py

🧠 Learnings (4)

📓 Common learnings

Learnt from: Pwuts
Repo: Significant-Gravitas/AutoGPT PR: 12284
File: autogpt_platform/frontend/src/app/api/openapi.json:11897-11900
Timestamp: 2026-03-04T23:58:18.476Z
Learning: Repo: Significant-Gravitas/AutoGPT — PR `#12284`
Backend/frontend OpenAPI codegen convention: In backend/api/features/store/model.py, the StoreSubmission and StoreSubmissionAdminView models define submitted_at: datetime | None, changes_summary: str | None, and instructions: str | None with no default. This is intentional to produce “required but nullable” fields in OpenAPI (properties appear in required[] and use anyOf [type, null]). This matches Prisma’s submittedAt DateTime? and changesSummary String?. Do not flag this as a required/nullable mismatch.

Learnt from: majdyz
Repo: Significant-Gravitas/AutoGPT PR: 12356
File: autogpt_platform/backend/backend/copilot/constants.py:9-12
Timestamp: 2026-03-10T08:39:13.707Z
Learning: In Significant-Gravitas/AutoGPT PR `#12356`, the `COPILOT_SYNTHETIC_ID_PREFIX = "copilot-"` check in `create_auto_approval_record` (human_review.py) is intentional and safe. The `graph_exec_id` passed to this function comes from server-side `PendingHumanReview` DB records (not from user input); the API only accepts `node_exec_id` from users. Synthetic `copilot-*` IDs are only ever created server-side in `run_block.py`. The prefix skip avoids a DB lookup for a `AgentGraphExecution` record that legitimately does not exist for CoPilot sessions, while `user_id` scoping is enforced at the auth layer and on the resulting auto-approval record.

Learnt from: majdyz
Repo: Significant-Gravitas/AutoGPT PR: 12213
File: autogpt_platform/frontend/src/app/api/openapi.json:9983-9995
Timestamp: 2026-02-27T15:59:00.370Z
Learning: Repo: Significant-Gravitas/AutoGPT PR: 12213 — Backend/frontend OpenAPI codegen
Learning: For MCP schema models, required OpenAPI fields must have no defaults in Pydantic. Specifically, MCPToolInfo.input_schema must be required (no Field(default_factory=dict)) so openapi.json emits it in "required", ensuring generated TS types treat input_schema as non-optional.

Learnt from: majdyz
Repo: Significant-Gravitas/AutoGPT PR: 12213
File: autogpt_platform/frontend/src/app/api/openapi.json:9983-9995
Timestamp: 2026-02-27T15:59:00.370Z
Learning: Repo: Significant-Gravitas/AutoGPT PR: 12213 — OpenAPI/codegen
Learning: Ensuring a field is required in generated TS types needs two sides: (1) no default value on the Pydantic field, and (2) the OpenAPI model's "required" array must list it. For MCPToolInfo, making input_schema required in OpenAPI and removing Field(default_factory=dict) in the backend prevents optional typing drift.

Learnt from: majdyz
Repo: Significant-Gravitas/AutoGPT PR: 12213
File: autogpt_platform/frontend/src/app/api/openapi.json:10030-10037
Timestamp: 2026-03-01T07:59:02.311Z
Learning: Repo: Significant-Gravitas/AutoGPT PR: 12213 — For MCP manual token storage, backend model autogpt_platform/backend/backend/api/features/mcp/routes.py defines MCPStoreTokenRequest.token as Pydantic SecretStr with a min length constraint, which generates OpenAPI schema metadata (format: "password", writeOnly: true, minLength: 1) in autogpt_platform/frontend/src/app/api/openapi.json. Prefer SecretStr (with length constraints) for sensitive request fields so generated TS clients and docs treat them as secrets.

📚 Learning: 2026-02-26T17:02:22.448Z

Learnt from: Pwuts
Repo: Significant-Gravitas/AutoGPT PR: 12211
File: .pre-commit-config.yaml:160-179
Timestamp: 2026-02-26T17:02:22.448Z
Learning: Keep the pre-commit hook pattern broad for autogpt_platform/backend to ensure OpenAPI schema changes are captured. Do not narrow to backend/api/ alone, since the generated schema depends on Pydantic models across multiple directories (backend/data/, backend/blocks/, backend/copilot/, backend/integrations/, backend/util/). Narrowing could miss schema changes and cause frontend type desynchronization.

Applied to files:

autogpt_platform/backend/backend/copilot/sdk/service.py
autogpt_platform/backend/backend/copilot/sdk/transcript.py

📚 Learning: 2026-03-04T08:04:35.881Z

Learnt from: majdyz
Repo: Significant-Gravitas/AutoGPT PR: 12273
File: autogpt_platform/backend/backend/copilot/tools/workspace_files.py:216-220
Timestamp: 2026-03-04T08:04:35.881Z
Learning: In the AutoGPT Copilot backend, ensure that SVG images are not treated as vision image types by excluding 'image/svg+xml' from INLINEABLE_MIME_TYPES and MULTIMODAL_TYPES in tool_adapter.py; the Claude API supports PNG, JPEG, GIF, and WebP for vision. SVGs (XML text) should be handled via the text path instead, not the vision path.

Applied to files:

autogpt_platform/backend/backend/copilot/sdk/service.py
autogpt_platform/backend/backend/copilot/sdk/transcript.py

📚 Learning: 2026-03-05T15:42:08.207Z

Learnt from: ntindle
Repo: Significant-Gravitas/AutoGPT PR: 12297
File: .claude/skills/backend-check/SKILL.md:14-16
Timestamp: 2026-03-05T15:42:08.207Z
Learning: In Python files under autogpt_platform/backend (recursively), rely on poetry run format to perform formatting (Black + isort) and linting (ruff). Do not run poetry run lint as a separate step after poetry run format, since format already includes linting checks.

Applied to files:

autogpt_platform/backend/backend/copilot/sdk/service.py
autogpt_platform/backend/backend/copilot/sdk/transcript.py

…sion

Use skip_transcript_upload instead of reusing use_resume to gate transcript uploads. use_resume starts False and only becomes True after a successful download, so gating on it prevented first-turn transcripts from ever being uploaded (bootstrap paradox).
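
The separation between "resume state" and "upload suppression" can be sketched as a small predicate. The names `TurnState` and `should_upload_transcript` are hypothetical and only model the condition discussed in the review; the real check in service.py is inline and may include more terms.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TurnState:
    use_resume: bool = False              # True only after a transcript download succeeds
    skip_transcript_upload: bool = False  # set when the stored transcript was rejected


def should_upload_transcript(
    resume_enabled: bool,
    state: TurnState,
    user_id: Optional[str],
    has_session: bool,
) -> bool:
    # use_resume is deliberately absent: a first turn has nothing to resume
    # from, yet it still has to upload so that later turns can resume.
    return (
        resume_enabled
        and not state.skip_transcript_upload
        and bool(user_id)
        and has_session
    )
```

With this shape, a fresh session (use_resume still False) uploads its first transcript, while a turn that hit "Prompt is too long" does not re-upload the oversized one.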

coderabbitai

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

autogpt_platform/backend/backend/copilot/sdk/service.py (1)

1471-1497: ⚠️ Potential issue | 🟡 Minor

Fallback to TranscriptBuilder when CLI transcript is invalid.

At Lines 1492-1497, an invalid CLI session file causes upload to be skipped entirely. That unnecessarily disables --resume next turn even though builder output is available.

💡 Suggested fallback behavior

                 cli_transcript = read_cli_session_file(sdk_cwd) if sdk_cwd else None
-                if cli_transcript:
+                if cli_transcript and validate_transcript(cli_transcript):
                     transcript_content = cli_transcript
                     logger.info(
                         "%s Using CLI session file for transcript upload " "(%d bytes)",
                         log_prefix,
                         len(cli_transcript),
                     )
                 else:
+                    if cli_transcript:
+                        logger.warning(
+                            "%s CLI session file invalid, falling back to TranscriptBuilder",
+                            log_prefix,
+                        )
                     transcript_content = transcript_builder.to_jsonl()
                     logger.info(
                         "%s CLI session file not available, using "
                         "TranscriptBuilder (%d bytes)",
                         log_prefix,
                         len(transcript_content) if transcript_content else 0,
                     )

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@autogpt_platform/backend/backend/copilot/sdk/service.py` around lines 1471 -
1497, The current logic reads cli_transcript via read_cli_session_file and if
present uses it unconditionally even if validate_transcript(cli_transcript)
would fail, causing upload to be skipped and preventing resume; change the flow
in the upload branch so that after reading cli_transcript you call
validate_transcript(cli_transcript) and if it fails, log that the CLI session
file is invalid and fall back to using transcript_builder.to_jsonl() (use
transcript_builder.entry_count for logs), then continue with the existing
validate_transcript check on transcript_content before deciding to upload or
skip; reference read_cli_session_file, cli_transcript, validate_transcript, and
transcript_builder.to_jsonl().

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@autogpt_platform/backend/backend/copilot/sdk/service.py`:
- Around line 1068-1079: The finally block can re-upload the oversized
transcript if cancellation occurs while awaiting delete_transcript; to fix, set
skip_transcript_upload = True immediately before calling await
delete_transcript(...) (within the same scope where delete_transcript is called)
so that even if cancellation/exception happens during await the flag prevents
re-upload; update the try/except around delete_transcript in the same function
to set skip_transcript_upload prior to awaiting and keep the existing
logger.warning handling for exceptions (referencing delete_transcript,
skip_transcript_upload, and log_prefix).

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 9639f402-d1de-4b72-bc03-83b3a9ddad53

📥 Commits

Reviewing files that changed from the base of the PR and between 2adeb63 and 6baeb11.

📒 Files selected for processing (1)

autogpt_platform/backend/backend/copilot/sdk/service.py

📜 Review details

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (6)

GitHub Check: types
GitHub Check: Seer Code Review
GitHub Check: Check PR Status
GitHub Check: test (3.12)
GitHub Check: test (3.11)
GitHub Check: test (3.13)

🔇 Additional comments (1)

autogpt_platform/backend/backend/copilot/sdk/service.py (1)

711-711: Good separation of resume state vs upload suppression.

This cleanly avoids the bootstrap paradox: transcript upload is no longer gated by use_resume, and suppression is isolated to skip_transcript_upload.

Also applies to: 1460-1465

Move flag assignment before the await to prevent re-upload if cancellation interrupts delete_transcript.
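
Why the ordering matters can be shown with a small asyncio sketch. The names here (`handle_prompt_too_long`, `slow_delete`, the `state` dict) are hypothetical stand-ins for the service code; the point is only that a flag set before the `await` survives a cancellation that lands during the delete.

```python
import asyncio


async def handle_prompt_too_long(state: dict, delete_transcript) -> None:
    # Set the flag first: if the await below is cancelled, it is already set.
    state["skip_transcript_upload"] = True
    try:
        await delete_transcript()
    except Exception as exc:  # CancelledError is a BaseException and propagates
        state["delete_error"] = repr(exc)


async def demo() -> dict:
    state = {"skip_transcript_upload": False, "uploaded": False}

    async def slow_delete() -> None:
        await asyncio.sleep(10)  # stands in for a slow storage call

    async def turn() -> None:
        try:
            await handle_prompt_too_long(state, slow_delete)
        finally:
            # Models the finalization step that uploads the transcript.
            if not state["skip_transcript_upload"]:
                state["uploaded"] = True

    task = asyncio.create_task(turn())
    await asyncio.sleep(0)  # let the task reach the await inside slow_delete
    task.cancel()
    try:
        await task
    except asyncio.CancelledError:
        pass
    return state
```

If the assignment were placed after the `await`, the cancellation would skip it and the `finally` block would upload the oversized transcript again.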

coderabbitai

🧹 Nitpick comments (1)

autogpt_platform/backend/backend/copilot/sdk/service.py (1)

1492-1503: Consider logging actual entry count when CLI transcript is used.

When the CLI session file is used, transcript_builder.entry_count (lines 1496, 1502) doesn't reflect the actual entry count of the uploaded content. This could be misleading during debugging.

💡 Optional fix to improve logging accuracy

+                # Count entries for logging (works for both CLI and builder sources)
+                entry_count = len(transcript_content.strip().split("\n")) if transcript_content else 0
+
                 if not transcript_content:
                     logger.warning(
                         "%s No transcript to upload (builder empty)", log_prefix
                     )
                 elif not validate_transcript(transcript_content):
                     logger.warning(
                         "%s Transcript invalid, skipping upload (entries=%d)",
                         log_prefix,
-                        transcript_builder.entry_count,
+                        entry_count,
                     )
                 else:
                     logger.info(
                         "%s Uploading complete transcript (entries=%d, bytes=%d)",
                         log_prefix,
-                        transcript_builder.entry_count,
+                        entry_count,
                         len(transcript_content),
                     )

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@autogpt_platform/backend/backend/copilot/sdk/service.py` around lines 1492 -
1503, The logs use transcript_builder.entry_count which can be stale when
transcript_content is sourced from a CLI session file; update the code around
validate_transcript/transcript_content to compute the real entry count from
transcript_content (e.g., parse transcript_content into entries or split into
lines/JSON entries to get actual_entry_count) and use that actual_entry_count in
the logger.warning and logger.info calls instead of
transcript_builder.entry_count; reference symbols: transcript_content,
transcript_builder.entry_count, validate_transcript, logger.warning,
logger.info.
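
As a small illustration of counting entries from the content itself, the helper below counts non-blank lines of a JSONL string. `count_jsonl_entries` is a hypothetical name, and this is one possible variant rather than the suggested fix: it differs from `len(content.strip().split("\n"))` in that whitespace-only content yields 0 instead of 1.

```python
def count_jsonl_entries(content: str) -> int:
    """Count non-blank lines, treating each one as a JSONL entry."""
    return sum(1 for line in content.splitlines() if line.strip())
```

This does not validate that each line is well-formed JSON; it is meant only for log output.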

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: fdf7d595-0544-42f1-ab58-50c30360b54f

📥 Commits

Reviewing files that changed from the base of the PR and between 6baeb11 and 152f54f.

📒 Files selected for processing (1)

autogpt_platform/backend/backend/copilot/sdk/service.py

📜 Review details

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (6)

GitHub Check: types
GitHub Check: Seer Code Review
GitHub Check: test (3.13)
GitHub Check: test (3.12)
GitHub Check: test (3.11)
GitHub Check: Check PR Status

🧠 Learnings (7)

📚 Learning: 2026-03-10T08:39:13.707Z

Learnt from: majdyz
Repo: Significant-Gravitas/AutoGPT PR: 12356
File: autogpt_platform/backend/backend/copilot/constants.py:9-12
Timestamp: 2026-03-10T08:39:13.707Z
Learning: In Significant-Gravitas/AutoGPT PR `#12356`, the `COPILOT_SYNTHETIC_ID_PREFIX = "copilot-"` check in `create_auto_approval_record` (human_review.py) is intentional and safe. The `graph_exec_id` passed to this function comes from server-side `PendingHumanReview` DB records (not from user input); the API only accepts `node_exec_id` from users. Synthetic `copilot-*` IDs are only ever created server-side in `run_block.py`. The prefix skip avoids a DB lookup for a `AgentGraphExecution` record that legitimately does not exist for CoPilot sessions, while `user_id` scoping is enforced at the auth layer and on the resulting auto-approval record.

Applied to files:

autogpt_platform/backend/backend/copilot/sdk/service.py

📚 Learning: 2026-03-10T08:38:30.834Z

Learnt from: majdyz
Repo: Significant-Gravitas/AutoGPT PR: 12356
File: autogpt_platform/backend/backend/copilot/tools/run_block.py:349-370
Timestamp: 2026-03-10T08:38:30.834Z
Learning: In the AutoGPT CoPilot HITL (Human-In-The-Loop) flow (`autogpt_platform/backend/backend/copilot/tools/run_block.py`), the review card presented to users sets `editable: false`, meaning reviewers cannot modify the input payload. Therefore, credentials resolved before `is_block_exec_need_review()` remain valid and do not need to be recomputed after the review step — the original `input_data` is unchanged through the review lifecycle.

Applied to files:

autogpt_platform/backend/backend/copilot/sdk/service.py

📚 Learning: 2026-03-04T23:58:18.476Z

Learnt from: Pwuts
Repo: Significant-Gravitas/AutoGPT PR: 12284
File: autogpt_platform/frontend/src/app/api/openapi.json:11897-11900
Timestamp: 2026-03-04T23:58:18.476Z
Learning: Repo: Significant-Gravitas/AutoGPT — PR `#12284`
Backend/frontend OpenAPI codegen convention: In backend/api/features/store/model.py, the StoreSubmission and StoreSubmissionAdminView models define submitted_at: datetime | None, changes_summary: str | None, and instructions: str | None with no default. This is intentional to produce “required but nullable” fields in OpenAPI (properties appear in required[] and use anyOf [type, null]). This matches Prisma’s submittedAt DateTime? and changesSummary String?. Do not flag this as a required/nullable mismatch.

Applied to files:

autogpt_platform/backend/backend/copilot/sdk/service.py

📚 Learning: 2026-02-26T17:02:22.448Z

Learnt from: Pwuts
Repo: Significant-Gravitas/AutoGPT PR: 12211
File: .pre-commit-config.yaml:160-179
Timestamp: 2026-02-26T17:02:22.448Z
Learning: Keep the pre-commit hook pattern broad for autogpt_platform/backend to ensure OpenAPI schema changes are captured. Do not narrow to backend/api/ alone, since the generated schema depends on Pydantic models across multiple directories (backend/data/, backend/blocks/, backend/copilot/, backend/integrations/, backend/util/). Narrowing could miss schema changes and cause frontend type desynchronization.

Applied to files:

autogpt_platform/backend/backend/copilot/sdk/service.py

📚 Learning: 2026-03-04T08:04:35.881Z

Learnt from: majdyz
Repo: Significant-Gravitas/AutoGPT PR: 12273
File: autogpt_platform/backend/backend/copilot/tools/workspace_files.py:216-220
Timestamp: 2026-03-04T08:04:35.881Z
Learning: In the AutoGPT Copilot backend, ensure that SVG images are not treated as vision image types by excluding 'image/svg+xml' from INLINEABLE_MIME_TYPES and MULTIMODAL_TYPES in tool_adapter.py; the Claude API supports PNG, JPEG, GIF, and WebP for vision. SVGs (XML text) should be handled via the text path instead, not the vision path.

Applied to files:

autogpt_platform/backend/backend/copilot/sdk/service.py

📚 Learning: 2026-03-05T15:42:08.207Z

Learnt from: ntindle
Repo: Significant-Gravitas/AutoGPT PR: 12297
File: .claude/skills/backend-check/SKILL.md:14-16
Timestamp: 2026-03-05T15:42:08.207Z
Learning: In Python files under autogpt_platform/backend (recursively), rely on poetry run format to perform formatting (Black + isort) and linting (ruff). Do not run poetry run lint as a separate step after poetry run format, since format already includes linting checks.

Applied to files:

autogpt_platform/backend/backend/copilot/sdk/service.py

🔇 Additional comments (4)

autogpt_platform/backend/backend/copilot/sdk/service.py (4)

77-86: LGTM!

The new imports for delete_transcript and read_cli_session_file are correctly added and align with the transcript module's public API as shown in the relevant code snippets.

711-711: LGTM!

The skip_transcript_upload flag is correctly initialized. This addresses the previous review feedback about using a dedicated flag rather than reusing use_resume for controlling transcript uploads.

1051-1079: LGTM!

The error handling for "prompt is too long" is well-implemented:

The skip_transcript_upload = True is correctly set before awaiting delete_transcript, addressing the previous review feedback about preventing re-upload on cancellation.

The detection logic using "prompt is too long" in err_str.lower() is reasonable for catching the relevant API error.

The delete_transcript call is wrapped in try/except to ensure deletion failures don't mask the original stream error.
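
The ordering described above can be sketched as follows. This is an illustration under stated assumptions, not the code from `service.py`: `StreamState` and `on_stream_error` are hypothetical names, and `delete_transcript` is passed in as a plain awaitable callable.

```python
import asyncio
import logging
from collections.abc import Awaitable, Callable
from dataclasses import dataclass

logger = logging.getLogger(__name__)


@dataclass
class StreamState:
    """Hypothetical holder for the per-stream flag."""

    skip_transcript_upload: bool = False


async def on_stream_error(
    err: BaseException,
    state: StreamState,
    delete_transcript: Callable[[], Awaitable[None]],
) -> None:
    if "prompt is too long" not in str(err).lower():
        return
    # Set the flag BEFORE awaiting: if the task is cancelled during the
    # delete, the flag is already set and the upload is still skipped.
    state.skip_transcript_upload = True
    try:
        await delete_transcript()
    except Exception:
        # A failed delete must not mask the original stream error.
        logger.warning("Failed to delete oversized transcript", exc_info=True)
```

Because `asyncio.CancelledError` derives from `BaseException`, the `except Exception` clause does not swallow cancellation; it only keeps ordinary deletion failures from replacing the original error.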

1460-1486: LGTM — CLI session file preference correctly implemented.

The upload condition change addresses the previous review feedback: new sessions and sessions without stored transcripts will now upload correctly since use_resume is no longer in the gate condition.

The preference for CLI session file over TranscriptBuilder output is correct per the PR objectives — the CLI file reflects mid-stream compaction.

…l protection

- Replace `import glob` with `pathlib.Path.glob()` in `read_cli_session_file`
- Add symlink path traversal validation on glob results using `is_relative_to`
- Add unit tests for `read_cli_session_file` and `_cli_project_dir`

coderabbitai

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@autogpt_platform/backend/backend/copilot/sdk/transcript.py`:
- Around line 171-194: The scan currently can raise from p.resolve() or
os.path.getmtime causing read_cli_session_file to propagate errors; instead
iterate Path(project_dir).glob("*.jsonl") and for each candidate call
p.resolve() and p.resolve().is_relative_to(resolved_base) inside a try/except
that logs and skips problematic entries, collect safe paths and their mtimes
with os.path.getmtime guarded in try/except as well (e.g., build a list of
(path, mtime) only when both resolve and getmtime succeed), then pick the
max-safe entry as session_file; also ensure the file-open/read OSError already
caught returns None so no exception escapes. Use the existing symbols:
read_cli_session_file, resolved_base, jsonl_files/session_file, Path.resolve,
and os.path.getmtime when implementing these guarded checks.

ℹ️ Review info

📥 Commits

Reviewing files that changed from the base of the PR and between 152f54f and a1f3431.

📒 Files selected for processing (2)

autogpt_platform/backend/backend/copilot/sdk/transcript.py
autogpt_platform/backend/backend/copilot/sdk/transcript_test.py

📜 Review details

🧰 Additional context used

📓 Path-based instructions (6)

autogpt_platform/backend/**/*.py

📄 CodeRabbit inference engine (.github/copilot-instructions.md)

autogpt_platform/backend/**/*.py: Use Python 3.11 (required; managed by Poetry via pyproject.toml) for backend development
Always run 'poetry run format' (Black + isort) before linting in backend development
Always run 'poetry run lint' (ruff) after formatting in backend development

Files:

autogpt_platform/backend/backend/copilot/sdk/transcript_test.py
autogpt_platform/backend/backend/copilot/sdk/transcript.py

autogpt_platform/backend/**/*.{py,txt}

📄 CodeRabbit inference engine (autogpt_platform/backend/CLAUDE.md)

Use poetry run prefix for all Python commands, including testing, linting, formatting, and migrations

Files:

autogpt_platform/backend/backend/copilot/sdk/transcript_test.py
autogpt_platform/backend/backend/copilot/sdk/transcript.py

autogpt_platform/backend/**/*_test.py

📄 CodeRabbit inference engine (autogpt_platform/backend/CLAUDE.md)

autogpt_platform/backend/**/*_test.py: Always review snapshot changes with git diff before committing when updating snapshots with poetry run pytest --snapshot-update
Use pytest with snapshot testing for API responses in test files
Colocate test files with source files using the *_test.py naming convention

Files:

autogpt_platform/backend/backend/copilot/sdk/transcript_test.py

autogpt_platform/backend/backend/**/*.py

📄 CodeRabbit inference engine (autogpt_platform/backend/CLAUDE.md)

Use Prisma ORM for database operations in PostgreSQL with pgvector for embeddings

Files:

autogpt_platform/backend/backend/copilot/sdk/transcript_test.py
autogpt_platform/backend/backend/copilot/sdk/transcript.py

autogpt_platform/**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

Format Python code with poetry run format

Files:

autogpt_platform/backend/backend/copilot/sdk/transcript_test.py
autogpt_platform/backend/backend/copilot/sdk/transcript.py

autogpt_platform/backend/**/*test*.py

📄 CodeRabbit inference engine (AGENTS.md)

Run poetry run test for backend testing (runs pytest with docker based postgres + prisma)

Files:

autogpt_platform/backend/backend/copilot/sdk/transcript_test.py

🧠 Learnings (7)

📓 Common learnings

Learnt from: majdyz
Repo: Significant-Gravitas/AutoGPT PR: 12213
File: autogpt_platform/frontend/src/app/api/openapi.json:9983-9995
Timestamp: 2026-02-27T15:59:00.370Z
Learning: Repo: Significant-Gravitas/AutoGPT PR: 12213 — Backend/frontend OpenAPI codegen
Learning: For MCP schema models, required OpenAPI fields must have no defaults in Pydantic. Specifically, MCPToolInfo.input_schema must be required (no Field(default_factory=dict)) so openapi.json emits it in "required", ensuring generated TS types treat input_schema as non-optional.

Learnt from: majdyz
Repo: Significant-Gravitas/AutoGPT PR: 12213
File: autogpt_platform/frontend/src/app/api/openapi.json:9983-9995
Timestamp: 2026-02-27T15:59:00.370Z
Learning: Repo: Significant-Gravitas/AutoGPT PR: 12213 — OpenAPI/codegen
Learning: Ensuring a field is required in generated TS types needs two sides: (1) no default value on the Pydantic field, and (2) the OpenAPI model's "required" array must list it. For MCPToolInfo, making input_schema required in OpenAPI and removing Field(default_factory=dict) in the backend prevents optional typing drift.

📚 Learning: 2026-02-04T16:49:42.490Z

Learnt from: CR
Repo: Significant-Gravitas/AutoGPT PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2026-02-04T16:49:42.490Z
Learning: Applies to autogpt_platform/backend/**/test/**/*.py : Use snapshot testing with '--snapshot-update' flag in backend tests when output changes; always review with 'git diff'

Applied to files:

autogpt_platform/backend/backend/copilot/sdk/transcript_test.py

📚 Learning: 2026-02-04T16:50:20.508Z

Learnt from: CR
Repo: Significant-Gravitas/AutoGPT PR: 0
File: autogpt_platform/backend/CLAUDE.md:0-0
Timestamp: 2026-02-04T16:50:20.508Z
Learning: Applies to autogpt_platform/backend/**/*_test.py : Always review snapshot changes with `git diff` before committing when updating snapshots with `poetry run pytest --snapshot-update`

Applied to files:

autogpt_platform/backend/backend/copilot/sdk/transcript_test.py

📚 Learning: 2026-02-04T16:50:20.508Z

Learnt from: CR
Repo: Significant-Gravitas/AutoGPT PR: 0
File: autogpt_platform/backend/CLAUDE.md:0-0
Timestamp: 2026-02-04T16:50:20.508Z
Learning: Applies to autogpt_platform/backend/**/*_test.py : Colocate test files with source files using the `*_test.py` naming convention

Applied to files:

autogpt_platform/backend/backend/copilot/sdk/transcript_test.py

📚 Learning: 2026-02-26T17:02:22.448Z

Learnt from: Pwuts
Repo: Significant-Gravitas/AutoGPT PR: 12211
File: .pre-commit-config.yaml:160-179
Timestamp: 2026-02-26T17:02:22.448Z
Learning: Keep the pre-commit hook pattern broad for autogpt_platform/backend to ensure OpenAPI schema changes are captured. Do not narrow to backend/api/ alone, since the generated schema depends on Pydantic models across multiple directories (backend/data/, backend/blocks/, backend/copilot/, backend/integrations/, backend/util/). Narrowing could miss schema changes and cause frontend type desynchronization.

Applied to files:

autogpt_platform/backend/backend/copilot/sdk/transcript_test.py
autogpt_platform/backend/backend/copilot/sdk/transcript.py

📚 Learning: 2026-03-04T08:04:35.881Z

Learnt from: majdyz
Repo: Significant-Gravitas/AutoGPT PR: 12273
File: autogpt_platform/backend/backend/copilot/tools/workspace_files.py:216-220
Timestamp: 2026-03-04T08:04:35.881Z
Learning: In the AutoGPT Copilot backend, ensure that SVG images are not treated as vision image types by excluding 'image/svg+xml' from INLINEABLE_MIME_TYPES and MULTIMODAL_TYPES in tool_adapter.py; the Claude API supports PNG, JPEG, GIF, and WebP for vision. SVGs (XML text) should be handled via the text path instead, not the vision path.

Applied to files:

autogpt_platform/backend/backend/copilot/sdk/transcript_test.py
autogpt_platform/backend/backend/copilot/sdk/transcript.py

📚 Learning: 2026-03-05T15:42:08.207Z

Learnt from: ntindle
Repo: Significant-Gravitas/AutoGPT PR: 12297
File: .claude/skills/backend-check/SKILL.md:14-16
Timestamp: 2026-03-05T15:42:08.207Z
Learning: In Python files under autogpt_platform/backend (recursively), rely on poetry run format to perform formatting (Black + isort) and linting (ruff). Do not run poetry run lint as a separate step after poetry run format, since format already includes linting checks.

Applied to files:

autogpt_platform/backend/backend/copilot/sdk/transcript_test.py
autogpt_platform/backend/backend/copilot/sdk/transcript.py

🔇 Additional comments (2)

autogpt_platform/backend/backend/copilot/sdk/transcript_test.py (1)

292-371: Good coverage for the new CLI transcript safety paths.

These tests validate the key security/correctness paths (None on missing files, happy-path read, symlink escape rejection, and _cli_project_dir traversal guard) and match the implementation intent.

autogpt_platform/backend/backend/copilot/sdk/transcript.py (1)

141-155: Nice hardening on project-dir containment and cleanup reuse.

The _cli_project_dir escape check plus reuse inside cleanup_cli_project_dir is a clean security-focused refactor.

Also applies to: 198-209

majdyz · 2026-03-13T16:49:01Z

Addressed all 4 review items in 8f0f6ce:

1. Multiple compactions bug: compaction_just_ended is now a one-shot flag that resets after being read, so subsequent mid-stream compactions trigger replace_entries correctly. Simplified service.py detection logic accordingly.
2. Path validation on transcript_path: read_compacted_entries() now validates that transcript_path is within the CLI projects directory (same pattern as _cli_project_dir).
3. Tests for new functions: Added tests for read_compacted_entries (7 cases), TranscriptBuilder.replace_entries (4 cases), and TranscriptBuilder.load_previous with compacted content (2 cases).
4. _transcript_path not reset: Added self._transcript_path = "" to reset_for_query().

Also: preserved isCompactSummary entries during STRIPPABLE_TYPES filtering in both load_previous and replace_entries.
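
The filtering rule can be illustrated with a small sketch. The contents of `STRIPPABLE_TYPES` here are an assumption (the thread only confirms that `"summary"` is a member), and `should_strip` / `filter_entries` are hypothetical helper names.

```python
# Assumed contents; the thread only confirms that "summary" is included.
STRIPPABLE_TYPES = frozenset({"progress", "summary"})


def should_strip(entry: dict) -> bool:
    """Strip noisy entry types, but never a compaction summary."""
    return entry.get("type", "") in STRIPPABLE_TYPES and not entry.get(
        "isCompactSummary"
    )


def filter_entries(entries: list[dict]) -> list[dict]:
    return [entry for entry in entries if not should_strip(entry)]
```

A compaction summary carries `type: "summary"`, so without the extra `isCompactSummary` check it would be dropped along with ordinary summaries.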

majdyz

🤖 Critical Review — PR #12401

Thorough line-by-line review of the transcript compaction sync logic. 12 findings posted as inline comments.

Summary of HIGH severity issues:

compaction_just_ended property with side effects — mutating @property creates TOCTOU race and fragile ordering dependencies
First-compaction-summary bug — read_compacted_entries finds the FIRST isCompactSummary but should find the LAST (multi-compaction scenario)
Race between emit_end_if_ready and compaction_just_ended — asyncio.sleep(0) yield creates window for state corruption

Not inline (lines not in diff):

LOW (transcript.py:87): strip_progress_entries strips type: "summary" entries but does NOT have the isCompactSummary guard. After compaction + upload, the compacted summary is stripped and next turn's --resume gets incomplete transcript. Add: if entry.get("type", "") in STRIPPABLE_TYPES and not entry.get("isCompactSummary")
LOW (compaction.py:225): emit_start_if_ready checks not self._done but compaction_just_ended resets _done as a side effect. Reading the property before emit_start_if_ready inadvertently unblocks start emission — implicit ordering dependency.

majdyz

🤖 Critical Review — PR #12401

Thorough review of the transcript compaction sync logic. 10 findings below.

Not inline (lines not in diff):

LOW (transcript.py:87): strip_progress_entries strips type: "summary" but does NOT guard isCompactSummary. After compaction + upload, the compacted summary is stripped and next turn's --resume gets an incomplete transcript.
LOW (compaction.py:225): emit_start_if_ready checks not self._done but compaction_just_ended resets _done as a side effect. Reading the property before emit_start_if_ready inadvertently unblocks start emission.

… last-summary, dedup helpers

- Replace side-effect @property with CompactionResult dataclass for TOCTOU safety
- Use build-then-swap in replace_entries to prevent data loss on corrupt input
- Fix read_compacted_entries to use LAST isCompactSummary (not first)
- Guard isCompactSummary in strip_progress_entries and TranscriptBuilder
- Extract _projects_base(), _build_path_from_parts(), _build_meta_storage_path()
- Extract TranscriptBuilder._parse_entry() static method
- Sanitize transcript_path in pre_compact_hook before logging
- Update compaction_test.py and transcript_test.py for new behaviors
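
The first change in this commit can be pictured roughly as below. This is a heavily simplified sketch: the field names on `CompactionResult` and the `SketchTracker` class are assumptions for illustration, and the real `CompactionTracker` tracks more state and emits stream events.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CompactionResult:
    """Hypothetical shape: what one call to emit_end_if_ready reports."""

    just_ended: bool = False
    transcript_path: str = ""


class SketchTracker:
    """Simplified stand-in for CompactionTracker."""

    def __init__(self) -> None:
        self._active = False
        self._transcript_path = ""

    def on_compact(self, transcript_path: str = "") -> None:
        self._active = True
        self._transcript_path = transcript_path

    def emit_end_if_ready(self) -> CompactionResult:
        if not self._active:
            return CompactionResult()
        # Capture the outcome and reset state in one synchronous step,
        # so there is no separate flag to read (and mutate) later.
        result = CompactionResult(True, self._transcript_path)
        self._active = False
        self._transcript_path = ""
        return result
```

Returning an immutable value from the state transition itself means callers never depend on when, or how often, a property was read.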

majdyz · 2026-03-13T17:04:41Z

Addressed all remaining review items in 1023134:

Architecture:

1. CompactionResult dataclass: Replaced side-effect @property (compaction_just_ended) with atomic CompactionResult return from emit_end_if_ready, eliminating TOCTOU race
2. Build-then-swap in replace_entries: Builds new entry list first, validates non-empty, then swaps. Corrupt input cannot wipe conversation history

Bug fixes:
3. Last-summary selection: read_compacted_entries now finds the LAST isCompactSummary (not first), matching CLI behavior for multiple compactions
4. isCompactSummary guard: strip_progress_entries and TranscriptBuilder._parse_entry now preserve compaction summaries that would otherwise be stripped as "summary" type

Dedup / code quality:
5. Extracted helpers: _projects_base(), _build_path_from_parts(), _build_meta_storage_path(), TranscriptBuilder._parse_entry()
6. Log injection hardening: Sanitized transcript_path in pre_compact_hook before logging
7. Overwrite warning: on_compact() logs warning when transcript_path is overwritten

Tests:
8. Updated compaction_test.py for CompactionResult pattern, added transcript_path and multi-compaction tests
9. Updated transcript_test.py for last-summary and build-then-swap behaviors
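
The build-then-swap idea from item 2 can be sketched like this. `EntryStore` is a hypothetical class standing in for `TranscriptBuilder`, and the `bool` return value is an assumption made for the example.

```python
import json


class EntryStore:
    """Hypothetical stand-in for TranscriptBuilder's entry list."""

    def __init__(self) -> None:
        self._entries: list[dict] = []

    def replace_entries(self, raw_lines: list[str]) -> bool:
        # Build the replacement list first, without touching current state.
        new_entries: list[dict] = []
        for line in raw_lines:
            try:
                entry = json.loads(line)
            except json.JSONDecodeError:
                continue
            if isinstance(entry, dict):
                new_entries.append(entry)

        if not new_entries:
            return False  # corrupt or empty input: keep existing history

        # Swap only after the new list is known to be usable.
        self._entries = new_entries
        return True
```

Clearing first and parsing second would leave the store empty whenever parsing fails, which is the data-loss case this ordering avoids.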

Add isCompactSummary field to TranscriptEntry model so compaction summaries survive the export→load_previous roundtrip. Without this, exported summaries with type="summary" were stripped on reload since "summary" is in STRIPPABLE_TYPES. Also add integration tests simulating the full compaction flow (load → append → compact → replace → export → reload).

…stream compaction When compaction ends, replace_entries loads the CLI session file which already contains the current message. Skip append_assistant and append_tool_result for that iteration to avoid duplicates that could cause 'prompt is too long' errors on subsequent turns.

…ompaction (#12401)

## Summary

- **Root cause**: `TranscriptBuilder` accumulates all raw SDK stream messages including pre-compaction content. When the CLI compacts mid-stream, the uploaded transcript was still uncompacted, causing "Prompt is too long" errors on the next `--resume` turn.
- **Fix**: Detect mid-stream compaction via the `PreCompact` hook, read the CLI's session file to get the compacted entries (summary + post-compaction messages), and call `TranscriptBuilder.replace_entries()` to sync it with the CLI's active context. This ensures the uploaded transcript always matches what the CLI sees.
- **Key changes**:
  - `CompactionTracker`: stores `transcript_path` from `PreCompact` hook, one-shot `compaction_just_ended` flag that correctly resets for multiple compactions
  - `read_compacted_entries()`: reads CLI session JSONL, finds `isCompactSummary: true` entry, returns it + all entries after. Includes path validation against the CLI projects directory.
  - `TranscriptBuilder.replace_entries()`: clears and replaces all entries with compacted ones, preserving `isCompactSummary` entries (which have `type: "summary"` that would normally be stripped)
  - `load_previous()`: also preserves `isCompactSummary` entries when loading a previously compacted transcript
  - Service stream loop: after compaction ends, reads compacted entries and syncs TranscriptBuilder

## Test plan

- [x] 69 tests pass across `compaction_test.py` and `transcript_test.py`
- [x] Tests cover: one-shot flag behavior, multiple compactions within a query, transcript path storage, path traversal rejection, `read_compacted_entries` (7 tests), `replace_entries` (4 tests), `load_previous` with compacted content (2 tests)
- [x] Pre-commit hooks pass (lint, format, typecheck)
- [ ] Manual test: trigger compaction in a multi-turn session and verify the uploaded transcript reflects compaction

majdyz requested a review from a team as a code owner March 13, 2026 09:53

majdyz requested review from Pwuts and Swiftyos and removed request for a team March 13, 2026 09:53

github-project-automation Bot added this to AutoGPT development kanban Mar 13, 2026

github-project-automation Bot moved this to 🆕 Needs initial review in AutoGPT development kanban Mar 13, 2026

github-actions Bot added platform/backend AutoGPT Platform - Back end size/l labels Mar 13, 2026

coderabbitai Bot reviewed Mar 13, 2026

Comment thread autogpt_platform/backend/backend/copilot/sdk/service.py Outdated

coderabbitai Bot reviewed Mar 13, 2026

Comment thread autogpt_platform/backend/backend/copilot/sdk/service.py Outdated

github-actions Bot mentioned this pull request Mar 13, 2026

feat(platform): CoPilot credit charging, token rate limiting, and usage UI #12385

Merged

5 tasks

fix(backend/copilot): set skip_transcript_upload before await delete

152f54f

Move flag assignment before the await to prevent re-upload if cancellation interrupts delete_transcript.

coderabbitai Bot reviewed Mar 13, 2026

majdyz commented Mar 13, 2026

Comment thread autogpt_platform/backend/backend/copilot/sdk/transcript.py Outdated

majdyz commented Mar 13, 2026

Comment thread autogpt_platform/backend/backend/copilot/sdk/transcript.py Outdated

majdyz commented Mar 13, 2026

Comment thread autogpt_platform/backend/backend/copilot/sdk/service.py Outdated

majdyz commented Mar 13, 2026

Comment thread autogpt_platform/backend/backend/copilot/sdk/service.py Outdated

majdyz commented Mar 13, 2026

Comment thread autogpt_platform/backend/backend/copilot/sdk/transcript.py

This was referenced Mar 13, 2026

fix(backend/copilot): transcript compaction, token-based rate limiting, block credit charging, usage UI #12403

Closed

fix(backend/copilot): fix tool-result file read failures across turns #12399

Merged

coderabbitai Bot reviewed Mar 13, 2026

Comment thread autogpt_platform/backend/backend/copilot/sdk/transcript.py Outdated

fix(backend/copilot): handle filesystem exceptions in read_cli_sessio…

8c7b077

…n_file Wrap p.resolve() and stat() calls in try/except to prevent unhandled OSError/RuntimeError from propagating and silently dropping transcript uploads when the caller falls back to TranscriptBuilder.

majdyz commented Mar 13, 2026

Merge remote-tracking branch 'origin/dev' into fix/copilot-transcript…

696f533

…-compaction

majdyz commented Mar 13, 2026

Comment thread autogpt_platform/backend/backend/copilot/sdk/transcript.py Outdated

majdyz enabled auto-merge March 13, 2026 16:50

majdyz commented Mar 13, 2026

majdyz changed the title ~~fix(backend/copilot): use CLI session file for transcript upload to preserve compaction~~ fix(backend/copilot): sync TranscriptBuilder with CLI on mid-stream compaction Mar 13, 2026

majdyz commented Mar 13, 2026

majdyz disabled auto-merge March 13, 2026 16:59

sentry Bot reviewed Mar 13, 2026

Comment thread autogpt_platform/backend/backend/copilot/sdk/compaction.py

majdyz enabled auto-merge March 13, 2026 17:40

sentry Bot reviewed Mar 13, 2026

Comment thread autogpt_platform/backend/backend/copilot/sdk/service.py

majdyz added 2 commits March 14, 2026 01:17

test(backend/copilot): add e2e compaction lifecycle tests

7d95321

Exercises the full service.py compaction flow end-to-end: TranscriptBuilder load → CompactionTracker state machine → read_compacted_entries → replace_entries → export → roundtrip.
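
The lifecycle named in this commit message can be sketched as follows. This is a simplified illustration under stated assumptions, not the code in `service.py` or `transcript.py`: `SketchTranscriptBuilder`, `SketchCompactionTracker`, `consume_just_ended`, and `sync_after_compaction` are hypothetical names, the session file is passed in as text rather than read from a validated path, and picking the last `isCompactSummary` entry when several exist is an assumption.

```python
import json
from typing import Any, Dict, List


def read_compacted_entries(jsonl_text: str) -> List[Dict[str, Any]]:
    """Return the compact-summary entry plus every entry after it.

    Assumption: if several summaries exist, the last one is used.
    Returns an empty list when no summary entry is present.
    """
    entries: List[Dict[str, Any]] = []
    for line in jsonl_text.splitlines():
        line = line.strip()
        if not line:
            continue
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip malformed lines rather than failing the sync
        if isinstance(entry, dict):
            entries.append(entry)
    summary_idx = None
    for i, entry in enumerate(entries):
        if entry.get("isCompactSummary") is True:
            summary_idx = i
    return [] if summary_idx is None else entries[summary_idx:]


class SketchCompactionTracker:
    """One-shot flag: reading it resets it, so it can fire again later."""

    def __init__(self) -> None:
        self._just_ended = False

    def on_compaction_end(self) -> None:
        self._just_ended = True

    def consume_just_ended(self) -> bool:
        ended, self._just_ended = self._just_ended, False
        return ended


class SketchTranscriptBuilder:
    """Accumulates entries and can be reset to a compacted view."""

    def __init__(self) -> None:
        self.entries: List[Dict[str, Any]] = []

    def append(self, entry: Dict[str, Any]) -> None:
        self.entries.append(entry)

    def replace_entries(self, compacted: List[Dict[str, Any]]) -> None:
        # Clear everything and keep only the compacted entries,
        # including the summary entry itself.
        self.entries = list(compacted)

    def export(self) -> str:
        return "\n".join(json.dumps(e) for e in self.entries)


def sync_after_compaction(
    builder: SketchTranscriptBuilder,
    tracker: SketchCompactionTracker,
    session_text: str,
) -> bool:
    """Replace the builder's entries if a compaction just ended."""
    if not tracker.consume_just_ended():
        return False
    compacted = read_compacted_entries(session_text)
    if not compacted:
        return False  # nothing usable; leave the builder untouched
    builder.replace_entries(compacted)
    return True
```

In this sketch the roundtrip step amounts to feeding `builder.export()` back through `read_compacted_entries` and checking that the same entries come out.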

ntindle approved these changes Mar 13, 2026

majdyz added this pull request to the merge queue Mar 13, 2026

github-project-automation Bot moved this from 🆕 Needs initial review to 👍🏼 Mergeable in AutoGPT development kanban Mar 13, 2026

Merged via the queue into dev with commit cfe22e5 Mar 13, 2026
20 checks passed

majdyz deleted the fix/copilot-transcript-compaction branch March 13, 2026 22:35

github-project-automation Bot moved this from 👍🏼 Mergeable to ✅ Done in AutoGPT development kanban Mar 13, 2026

coderabbitai Bot mentioned this pull request Mar 13, 2026

fix(platform): try-compact-retry for prompt-too-long errors in CoPilot SDK #12413

Merged

10 tasks

coderabbitai Bot mentioned this pull request Apr 30, 2026

fix(copilot): bundle of chat stream stability fixes (PK dedup, race, compaction, errors, chips) #12948

Merged

7 tasks

Conversation

majdyz commented Mar 13, 2026 • edited

Summary

Test plan

github-actions Bot commented Mar 13, 2026 • edited

🔍 PR Overlap Detection

🔴 Merge Conflicts Detected

🟢 Low Risk — File Overlap Only

coderabbitai Bot commented Mar 13, 2026 • edited

Reviews paused

Walkthrough

Changes

Sequence Diagram(s)

Estimated code review effort

Possibly related PRs

Suggested reviewers

Poem

coderabbitai Bot left a comment

coderabbitai Bot left a comment

coderabbitai Bot left a comment

coderabbitai Bot left a comment

majdyz commented Mar 13, 2026

majdyz left a comment

🤖 Critical Review — PR #12401

Summary of HIGH severity issues:

Not inline (lines not in diff):

majdyz left a comment

🤖 Critical Review — PR #12401

Not inline (lines not in diff):

majdyz commented Mar 13, 2026
