fix(backend/copilot): sync TranscriptBuilder with CLI on mid-stream compaction #12401

Merged: majdyz merged 14 commits into dev from fix/copilot-transcript-compaction on Mar 13, 2026

@majdyz majdyz commented Mar 13, 2026

Summary

  • Root cause: TranscriptBuilder accumulates all raw SDK stream messages, including pre-compaction content. When the CLI compacts mid-stream, the uploaded transcript is still uncompacted, which causes "Prompt is too long" errors on the next --resume turn.
  • Fix: Detect mid-stream compaction via the PreCompact hook, read the CLI's session file to get the compacted entries (summary + post-compaction messages), and call TranscriptBuilder.replace_entries() to sync it with the CLI's active context. This ensures the uploaded transcript always matches what the CLI sees.
  • Key changes:
    • CompactionTracker: stores the transcript_path reported by the PreCompact hook and exposes a one-shot compaction_just_ended flag that resets correctly across multiple compactions
    • read_compacted_entries(): reads CLI session JSONL, finds isCompactSummary: true entry, returns it + all entries after. Includes path validation against the CLI projects directory.
    • TranscriptBuilder.replace_entries(): clears and replaces all entries with compacted ones, preserving isCompactSummary entries (which have type: "summary" that would normally be stripped)
    • load_previous(): also preserves isCompactSummary entries when loading a previously compacted transcript
    • Service stream loop: after compaction ends, reads compacted entries and syncs TranscriptBuilder
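
The read-and-replace step described above can be sketched roughly as follows. This is a minimal illustration, not the PR's code: `MiniTranscriptBuilder` is a hypothetical stand-in for TranscriptBuilder, the `projects_root` parameter is an assumed way to pass in the CLI projects directory, and picking the last `isCompactSummary` entry when several exist is an assumption the PR text does not confirm.

```python
import json
from pathlib import Path
from typing import Optional


def read_compacted_entries(
    transcript_path: str, projects_root: str
) -> Optional[list[dict]]:
    """Return the compact-summary entry plus every entry after it, or None."""
    root = Path(projects_root).resolve()
    path = Path(transcript_path).resolve()
    # Reject paths that escape the allowed directory (via ".." or symlinks).
    if not path.is_relative_to(root) or not path.is_file():
        return None

    entries: list[dict] = []
    for line in path.read_text(encoding="utf-8").splitlines():
        if not line.strip():
            continue
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue  # assumption: skip malformed lines instead of failing
        if isinstance(entry, dict):
            entries.append(entry)

    # Assumption: with several compactions, the last summary is the live one.
    summary_idx = None
    for i, entry in enumerate(entries):
        if entry.get("isCompactSummary") is True:
            summary_idx = i
    if summary_idx is None:
        return None
    return entries[summary_idx:]


class MiniTranscriptBuilder:
    """Hypothetical stand-in for TranscriptBuilder, for illustration only."""

    # Per the PR text, entries with type "summary" are normally stripped.
    STRIPPED_TYPES = {"summary"}

    def __init__(self) -> None:
        self.entries: list[dict] = []

    def replace_entries(self, compacted: list[dict]) -> None:
        """Drop all current entries and keep the compacted ones instead."""
        self.entries = [
            e
            for e in compacted
            if e.get("type") not in self.STRIPPED_TYPES or e.get("isCompactSummary")
        ]
```

Returning `None` when no summary entry is found lets the caller leave the builder untouched instead of replacing its contents with an empty list.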

Test plan

  • 69 tests pass across compaction_test.py and transcript_test.py
  • Tests cover: one-shot flag behavior, multiple compactions within a query, transcript path storage, path traversal rejection, read_compacted_entries (7 tests), replace_entries (4 tests), load_previous with compacted content (2 tests)
  • Pre-commit hooks pass (lint, format, typecheck)
  • Manual test: trigger compaction in a multi-turn session and verify the uploaded transcript reflects compaction
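
The one-shot flag behavior that the tests above exercise can be sketched like this. The class and method names are hypothetical, and how the end of a compaction is detected is not stated in the PR, so `on_compaction_end` is just an assumed entry point.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CompactionTrackerSketch:
    """Hypothetical tracker; not the project's CompactionTracker."""

    transcript_path: Optional[str] = None
    _just_ended: bool = False

    def on_pre_compact(self, transcript_path: str) -> None:
        # Remember where the CLI session file lives for the later sync step.
        self.transcript_path = transcript_path

    def on_compaction_end(self) -> None:
        # Assumed entry point: called once a compaction has finished.
        self._just_ended = True

    def consume_just_ended(self) -> bool:
        """Return True once per compaction, then reset the flag."""
        ended, self._just_ended = self._just_ended, False
        return ended
```

Reading and clearing the flag in one call means a second compaction in the same query raises it again, while repeated polling between compactions returns False.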

…reserve compaction

The TranscriptBuilder accumulates all raw SDK stream messages including
pre-compaction content. When the CLI compacts mid-stream, the uploaded
transcript still contains the full uncompacted messages, causing
"Prompt is too long" errors on the next --resume turn.

Fix:
- Read the CLI's own session file (~/.claude/projects/<cwd>/*.jsonl)
  which reflects mid-stream compaction, instead of TranscriptBuilder
- Extract _cli_project_dir() helper, refactor cleanup_cli_project_dir
- On "Prompt is too long" error with --resume, delete the oversized
  transcript so the next turn falls back to compression-based context
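
A rough sketch of reading the newest session file from a project directory, assuming the directory has already been derived from the working directory. The function name and both parameters are illustrative; the way the CLI maps a cwd to its project directory is not shown here and is not assumed.

```python
from pathlib import Path
from typing import Optional


def read_newest_session_file(project_dir: str, projects_root: str) -> Optional[str]:
    """Return the content of the most recently modified *.jsonl file, if any."""
    root = Path(projects_root).resolve()
    directory = Path(project_dir).resolve()
    # Refuse directories that resolve outside the allowed root.
    if not directory.is_relative_to(root) or not directory.is_dir():
        return None

    candidates = [
        p
        for p in directory.glob("*.jsonl")
        # Skip symlinks that point outside the project directory.
        if p.is_file() and p.resolve().is_relative_to(directory)
    ]
    if not candidates:
        return None

    newest = max(candidates, key=lambda p: p.stat().st_mtime)
    return newest.read_text(encoding="utf-8")
```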
@majdyz majdyz requested a review from a team as a code owner March 13, 2026 09:53
@majdyz majdyz requested review from Pwuts and Swiftyos and removed request for a team March 13, 2026 09:53
@github-project-automation github-project-automation Bot moved this to 🆕 Needs initial review in AutoGPT development kanban Mar 13, 2026
@github-actions github-actions Bot added platform/backend AutoGPT Platform - Back end size/l labels Mar 13, 2026

github-actions Bot commented Mar 13, 2026

🔍 PR Overlap Detection

This check compares your PR against all other open PRs targeting the same branch to detect potential merge conflicts early.

🔴 Merge Conflicts Detected

The following PRs have been tested and will have merge conflicts if merged after this PR. Consider coordinating with the authors.

🟢 Low Risk — File Overlap Only

These PRs touch the same files but different sections.

Summary: 3 conflict(s), 0 medium risk, 1 low risk (out of 4 PRs with file overlap)


Auto-generated on push. Ignores: openapi.json, lock files.

coderabbitai Bot commented Mar 13, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Walkthrough

Service finalization now prefers the CLI session file for transcript uploads, adds a skip flag to avoid duplicate uploads, and deletes oversized transcripts on resume-mode "prompt is too long" streaming errors. Transcript utilities add safe CLI project path resolution and a reader for the latest CLI session file.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Service: error handling & upload flow<br>autogpt_platform/backend/backend/copilot/sdk/service.py | Adds skip_transcript_upload state; on streaming errors indicating "prompt is too long" while resume is active, calls delete_transcript(user_id, session_id), disables resume and sets skip_transcript_upload to avoid re-upload. Finalization prefers read_cli_session_file(sdk_cwd) and logs chosen source and size. |
| Transcript utilities & path safety<br>autogpt_platform/backend/backend/copilot/sdk/transcript.py | Adds `_cli_project_dir(sdk_cwd) -> str |
| Tests<br>autogpt_platform/backend/backend/copilot/sdk/transcript_test.py | Adds tests for _cli_project_dir and read_cli_session_file covering missing files, single .jsonl, and symlink/escape cases; integrates these into existing test suite. |

Sequence Diagram(s)

sequenceDiagram
  participant Service as Service
  participant Transcript as TranscriptModule
  participant FS as CLI_FileSystem
  participant Storage as Upload/Storage

  Service->>Transcript: finalize(sdk_cwd) → read_cli_session_file(sdk_cwd)
  alt CLI session file found
    Transcript->>FS: locate newest *.jsonl within resolved project dir
    FS-->>Transcript: return file content
    Transcript-->>Service: session content
    Service->>Storage: upload(session content)
  else fallback
    Service->>Service: use TranscriptBuilder output
    Service->>Storage: upload(builder content)
  end

  Service->>Storage: streaming process
  alt error == "prompt is too long" and resume active
    Service->>Transcript: delete_transcript(user_id, session_id)
    Service-->>Service: set skip_transcript_upload = true
  end

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~22 minutes

Possibly related PRs

Suggested reviewers

  • Swiftyos
  • Pwuts

Poem

🐰 I sniffed the CLI files, fresh and neat,
Found the newest jsonl for a tidy beat.
If prompts grow too long and streams go astray,
I nudge the old transcript and skip the replay.
Hoppity-hop — uploads sorted, logs okay! ✨

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Docstring Coverage | ✅ Passed | Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%. |
| Title check | ✅ Passed | The title directly describes the main change: syncing TranscriptBuilder with CLI on mid-stream compaction, which is the primary fix and objective of this PR. |
| Description check | ✅ Passed | The description is well-related to the changeset, explaining the root cause of transcript synchronization issues, the implemented fixes, and key changes made to address the problem. |

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@autogpt_platform/backend/backend/copilot/sdk/service.py`:
- Around line 1459-1464: The code currently gates transcript uploads on both
config.claude_agent_use_resume and the local variable use_resume, which prevents
new sessions or ones without a stored transcript from ever uploading fresh
transcripts; change the condition so uploads are allowed when
config.claude_agent_use_resume is true (and user_id and session are present)
regardless of use_resume. Specifically, modify the if that checks
(config.claude_agent_use_resume and use_resume and user_id and session is not
None) to remove use_resume from that conjunction (e.g.,
(config.claude_agent_use_resume and user_id and session is not None)); keep
use_resume only for the prompt-too-long suppression logic elsewhere so that
resume-related suppression remains separate from upload permission.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 0a7eae5d-17bd-4702-b116-79067bd8b24f

📥 Commits

Reviewing files that changed from the base of the PR and between ba301a3 and 2adeb63.

📒 Files selected for processing (2)
  • autogpt_platform/backend/backend/copilot/sdk/service.py
  • autogpt_platform/backend/backend/copilot/sdk/transcript.py
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (6)
  • GitHub Check: types
  • GitHub Check: Seer Code Review
  • GitHub Check: test (3.11)
  • GitHub Check: test (3.12)
  • GitHub Check: test (3.13)
  • GitHub Check: Check PR Status
🧰 Additional context used
📓 Path-based instructions (4)
autogpt_platform/backend/**/*.py

📄 CodeRabbit inference engine (.github/copilot-instructions.md)

autogpt_platform/backend/**/*.py: Use Python 3.11 (required; managed by Poetry via pyproject.toml) for backend development
Always run 'poetry run format' (Black + isort) before linting in backend development
Always run 'poetry run lint' (ruff) after formatting in backend development

Files:

  • autogpt_platform/backend/backend/copilot/sdk/service.py
  • autogpt_platform/backend/backend/copilot/sdk/transcript.py
autogpt_platform/backend/**/*.{py,txt}

📄 CodeRabbit inference engine (autogpt_platform/backend/CLAUDE.md)

Use poetry run prefix for all Python commands, including testing, linting, formatting, and migrations

Files:

  • autogpt_platform/backend/backend/copilot/sdk/service.py
  • autogpt_platform/backend/backend/copilot/sdk/transcript.py
autogpt_platform/backend/backend/**/*.py

📄 CodeRabbit inference engine (autogpt_platform/backend/CLAUDE.md)

Use Prisma ORM for database operations in PostgreSQL with pgvector for embeddings

Files:

  • autogpt_platform/backend/backend/copilot/sdk/service.py
  • autogpt_platform/backend/backend/copilot/sdk/transcript.py
autogpt_platform/**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

Format Python code with poetry run format

Files:

  • autogpt_platform/backend/backend/copilot/sdk/service.py
  • autogpt_platform/backend/backend/copilot/sdk/transcript.py
🧠 Learnings (4)
📓 Common learnings
Learnt from: Pwuts
Repo: Significant-Gravitas/AutoGPT PR: 12284
File: autogpt_platform/frontend/src/app/api/openapi.json:11897-11900
Timestamp: 2026-03-04T23:58:18.476Z
Learning: Repo: Significant-Gravitas/AutoGPT — PR `#12284`
Backend/frontend OpenAPI codegen convention: In backend/api/features/store/model.py, the StoreSubmission and StoreSubmissionAdminView models define submitted_at: datetime | None, changes_summary: str | None, and instructions: str | None with no default. This is intentional to produce “required but nullable” fields in OpenAPI (properties appear in required[] and use anyOf [type, null]). This matches Prisma’s submittedAt DateTime? and changesSummary String?. Do not flag this as a required/nullable mismatch.
Learnt from: majdyz
Repo: Significant-Gravitas/AutoGPT PR: 12356
File: autogpt_platform/backend/backend/copilot/constants.py:9-12
Timestamp: 2026-03-10T08:39:13.707Z
Learning: In Significant-Gravitas/AutoGPT PR `#12356`, the `COPILOT_SYNTHETIC_ID_PREFIX = "copilot-"` check in `create_auto_approval_record` (human_review.py) is intentional and safe. The `graph_exec_id` passed to this function comes from server-side `PendingHumanReview` DB records (not from user input); the API only accepts `node_exec_id` from users. Synthetic `copilot-*` IDs are only ever created server-side in `run_block.py`. The prefix skip avoids a DB lookup for a `AgentGraphExecution` record that legitimately does not exist for CoPilot sessions, while `user_id` scoping is enforced at the auth layer and on the resulting auto-approval record.
Learnt from: majdyz
Repo: Significant-Gravitas/AutoGPT PR: 12213
File: autogpt_platform/frontend/src/app/api/openapi.json:9983-9995
Timestamp: 2026-02-27T15:59:00.370Z
Learning: Repo: Significant-Gravitas/AutoGPT PR: 12213 — Backend/frontend OpenAPI codegen
Learning: For MCP schema models, required OpenAPI fields must have no defaults in Pydantic. Specifically, MCPToolInfo.input_schema must be required (no Field(default_factory=dict)) so openapi.json emits it in "required", ensuring generated TS types treat input_schema as non-optional.
Learnt from: majdyz
Repo: Significant-Gravitas/AutoGPT PR: 12213
File: autogpt_platform/frontend/src/app/api/openapi.json:9983-9995
Timestamp: 2026-02-27T15:59:00.370Z
Learning: Repo: Significant-Gravitas/AutoGPT PR: 12213 — OpenAPI/codegen
Learning: Ensuring a field is required in generated TS types needs two sides: (1) no default value on the Pydantic field, and (2) the OpenAPI model's "required" array must list it. For MCPToolInfo, making input_schema required in OpenAPI and removing Field(default_factory=dict) in the backend prevents optional typing drift.
Learnt from: majdyz
Repo: Significant-Gravitas/AutoGPT PR: 12213
File: autogpt_platform/frontend/src/app/api/openapi.json:10030-10037
Timestamp: 2026-03-01T07:59:02.311Z
Learning: Repo: Significant-Gravitas/AutoGPT PR: 12213 — For MCP manual token storage, backend model autogpt_platform/backend/backend/api/features/mcp/routes.py defines MCPStoreTokenRequest.token as Pydantic SecretStr with a min length constraint, which generates OpenAPI schema metadata (format: "password", writeOnly: true, minLength: 1) in autogpt_platform/frontend/src/app/api/openapi.json. Prefer SecretStr (with length constraints) for sensitive request fields so generated TS clients and docs treat them as secrets.
📚 Learning: 2026-02-26T17:02:22.448Z
Learnt from: Pwuts
Repo: Significant-Gravitas/AutoGPT PR: 12211
File: .pre-commit-config.yaml:160-179
Timestamp: 2026-02-26T17:02:22.448Z
Learning: Keep the pre-commit hook pattern broad for autogpt_platform/backend to ensure OpenAPI schema changes are captured. Do not narrow to backend/api/ alone, since the generated schema depends on Pydantic models across multiple directories (backend/data/, backend/blocks/, backend/copilot/, backend/integrations/, backend/util/). Narrowing could miss schema changes and cause frontend type desynchronization.

Applied to files:

  • autogpt_platform/backend/backend/copilot/sdk/service.py
  • autogpt_platform/backend/backend/copilot/sdk/transcript.py
📚 Learning: 2026-03-04T08:04:35.881Z
Learnt from: majdyz
Repo: Significant-Gravitas/AutoGPT PR: 12273
File: autogpt_platform/backend/backend/copilot/tools/workspace_files.py:216-220
Timestamp: 2026-03-04T08:04:35.881Z
Learning: In the AutoGPT Copilot backend, ensure that SVG images are not treated as vision image types by excluding 'image/svg+xml' from INLINEABLE_MIME_TYPES and MULTIMODAL_TYPES in tool_adapter.py; the Claude API supports PNG, JPEG, GIF, and WebP for vision. SVGs (XML text) should be handled via the text path instead, not the vision path.

Applied to files:

  • autogpt_platform/backend/backend/copilot/sdk/service.py
  • autogpt_platform/backend/backend/copilot/sdk/transcript.py
📚 Learning: 2026-03-05T15:42:08.207Z
Learnt from: ntindle
Repo: Significant-Gravitas/AutoGPT PR: 12297
File: .claude/skills/backend-check/SKILL.md:14-16
Timestamp: 2026-03-05T15:42:08.207Z
Learning: In Python files under autogpt_platform/backend (recursively), rely on poetry run format to perform formatting (Black + isort) and linting (ruff). Do not run poetry run lint as a separate step after poetry run format, since format already includes linting checks.

Applied to files:

  • autogpt_platform/backend/backend/copilot/sdk/service.py
  • autogpt_platform/backend/backend/copilot/sdk/transcript.py

Comment thread autogpt_platform/backend/backend/copilot/sdk/service.py Outdated
…sion

Use skip_transcript_upload instead of reusing use_resume to gate
transcript uploads. use_resume starts False and only becomes True
after a successful download, so gating on it prevented first-turn
transcripts from ever being uploaded (bootstrap paradox).
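
The gating change can be illustrated with a small predicate. The function and parameter names are hypothetical; only the idea that uploads depend on skip_transcript_upload instead of use_resume comes from the commit message.

```python
def should_upload_transcript(
    resume_enabled_in_config: bool,
    user_id: str,
    has_session: bool,
    skip_transcript_upload: bool,
) -> bool:
    """Decide whether to upload, independent of whether this turn resumed."""
    return bool(
        resume_enabled_in_config
        and user_id
        and has_session
        and not skip_transcript_upload
    )
```

On a first turn no transcript has been downloaded, so use_resume is False. Because the predicate does not read it, the first transcript can still be uploaded and later turns have something to resume from.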
@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
autogpt_platform/backend/backend/copilot/sdk/service.py (1)

1471-1497: ⚠️ Potential issue | 🟡 Minor

Fallback to TranscriptBuilder when CLI transcript is invalid.

At lines 1492-1497, an invalid CLI session file causes the upload to be skipped entirely. That unnecessarily disables --resume on the next turn even though builder output is available.

💡 Suggested fallback behavior
                 cli_transcript = read_cli_session_file(sdk_cwd) if sdk_cwd else None
-                if cli_transcript:
+                if cli_transcript and validate_transcript(cli_transcript):
                     transcript_content = cli_transcript
                     logger.info(
                         "%s Using CLI session file for transcript upload " "(%d bytes)",
                         log_prefix,
                         len(cli_transcript),
                     )
                 else:
+                    if cli_transcript:
+                        logger.warning(
+                            "%s CLI session file invalid, falling back to TranscriptBuilder",
+                            log_prefix,
+                        )
                     transcript_content = transcript_builder.to_jsonl()
                     logger.info(
                         "%s CLI session file not available, using "
                         "TranscriptBuilder (%d bytes)",
                         log_prefix,
                         len(transcript_content) if transcript_content else 0,
                     )
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@autogpt_platform/backend/backend/copilot/sdk/service.py` around lines 1471 -
1497, The current logic reads cli_transcript via read_cli_session_file and if
present uses it unconditionally even if validate_transcript(cli_transcript)
would fail, causing upload to be skipped and preventing resume; change the flow
in the upload branch so that after reading cli_transcript you call
validate_transcript(cli_transcript) and if it fails, log that the CLI session
file is invalid and fall back to using transcript_builder.to_jsonl() (use
transcript_builder.entry_count for logs), then continue with the existing
validate_transcript check on transcript_content before deciding to upload or
skip; reference read_cli_session_file, cli_transcript, validate_transcript, and
transcript_builder.to_jsonl().
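
The suggested fallback order can be sketched as a standalone helper. This is an illustration under assumptions: `choose_transcript` is a hypothetical name, and `lines_are_json_objects` is a stand-in validator, since the actual rules applied by validate_transcript are not shown in this thread.

```python
import json
from typing import Callable, Optional, Tuple


def choose_transcript(
    cli_transcript: Optional[str],
    builder_transcript: Optional[str],
    is_valid: Callable[[str], bool],
) -> Tuple[Optional[str], str]:
    """Return (content, source); source is "cli", "builder", or "none"."""
    if cli_transcript and is_valid(cli_transcript):
        return cli_transcript, "cli"
    # CLI file missing or invalid: fall back instead of skipping the upload.
    if builder_transcript and is_valid(builder_transcript):
        return builder_transcript, "builder"
    return None, "none"


def lines_are_json_objects(content: str) -> bool:
    """Hypothetical validator: every non-empty line parses to a JSON object."""
    lines = [ln for ln in content.splitlines() if ln.strip()]
    if not lines:
        return False
    try:
        return all(isinstance(json.loads(ln), dict) for ln in lines)
    except json.JSONDecodeError:
        return False
```

Returning the source alongside the content lets the caller log which path was taken, as the existing log lines do.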
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@autogpt_platform/backend/backend/copilot/sdk/service.py`:
- Around line 1068-1079: The finally block can re-upload the oversized
transcript if cancellation occurs while awaiting delete_transcript; to fix, set
skip_transcript_upload = True immediately before calling await
delete_transcript(...) (within the same scope where delete_transcript is called)
so that even if cancellation/exception happens during await the flag prevents
re-upload; update the try/except around delete_transcript in the same function
to set skip_transcript_upload prior to awaiting and keep the existing
logger.warning handling for exceptions (referencing delete_transcript,
skip_transcript_upload, and log_prefix).

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 9639f402-d1de-4b72-bc03-83b3a9ddad53

📥 Commits

Reviewing files that changed from the base of the PR and between 2adeb63 and 6baeb11.

📒 Files selected for processing (1)
  • autogpt_platform/backend/backend/copilot/sdk/service.py
🔇 Additional comments (1)
autogpt_platform/backend/backend/copilot/sdk/service.py (1)

711-711: Good separation of resume state vs upload suppression.

This cleanly avoids the bootstrap paradox: transcript upload is no longer gated by use_resume, and suppression is isolated to skip_transcript_upload.

Also applies to: 1460-1465

Comment thread autogpt_platform/backend/backend/copilot/sdk/service.py Outdated
Move flag assignment before the await to prevent re-upload if
cancellation interrupts delete_transcript.
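
Why the ordering matters can be shown with a small asyncio simulation. Everything here is a stand-in: `slow_delete` plays the role of delete_transcript, and `StreamState` is a hypothetical holder for the flag and the upload outcome.

```python
import asyncio


class StreamState:
    """Hypothetical holder for the flag and the upload outcome."""

    def __init__(self) -> None:
        self.skip_transcript_upload = False
        self.uploaded = False


async def slow_delete() -> None:
    # Stand-in for delete_transcript; long enough to be cancelled mid-await.
    await asyncio.sleep(10)


async def run_turn(state: StreamState) -> None:
    try:
        # Set the flag first so it survives a cancellation during the await.
        state.skip_transcript_upload = True
        await slow_delete()
    finally:
        # Finalization: upload only when the flag has not been set.
        if not state.skip_transcript_upload:
            state.uploaded = True


async def main() -> StreamState:
    state = StreamState()
    task = asyncio.create_task(run_turn(state))
    await asyncio.sleep(0)  # let the task run until it suspends in slow_delete
    task.cancel()
    try:
        await task
    except asyncio.CancelledError:
        pass
    return state
```

If the assignment came after the await instead, cancellation would skip it and the finally block would upload the oversized transcript again.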
@coderabbitai coderabbitai Bot left a comment

🧹 Nitpick comments (1)
autogpt_platform/backend/backend/copilot/sdk/service.py (1)

1492-1503: Consider logging actual entry count when CLI transcript is used.

When the CLI session file is used, transcript_builder.entry_count (lines 1496, 1502) doesn't reflect the actual entry count of the uploaded content. This could be misleading during debugging.

💡 Optional fix to improve logging accuracy
+                # Count entries for logging (works for both CLI and builder sources)
+                entry_count = len(transcript_content.strip().split("\n")) if transcript_content else 0
+
                 if not transcript_content:
                     logger.warning(
                         "%s No transcript to upload (builder empty)", log_prefix
                     )
                 elif not validate_transcript(transcript_content):
                     logger.warning(
                         "%s Transcript invalid, skipping upload (entries=%d)",
                         log_prefix,
-                        transcript_builder.entry_count,
+                        entry_count,
                     )
                 else:
                     logger.info(
                         "%s Uploading complete transcript (entries=%d, bytes=%d)",
                         log_prefix,
-                        transcript_builder.entry_count,
+                        entry_count,
                         len(transcript_content),
                     )
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@autogpt_platform/backend/backend/copilot/sdk/service.py` around lines 1492 -
1503, The logs use transcript_builder.entry_count which can be stale when
transcript_content is sourced from a CLI session file; update the code around
validate_transcript/transcript_content to compute the real entry count from
transcript_content (e.g., parse transcript_content into entries or split into
lines/JSON entries to get actual_entry_count) and use that actual_entry_count in
the logger.warning and logger.info calls instead of
transcript_builder.entry_count; reference symbols: transcript_content,
transcript_builder.entry_count, validate_transcript, logger.warning,
logger.info.
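
One way to compute the entry count from the content itself, sketched as a hypothetical helper. It counts non-empty lines, which differs slightly from the suggested `len(transcript_content.strip().split("\n"))`: that expression also counts blank lines in the middle of the content.

```python
from typing import Optional


def count_jsonl_entries(content: Optional[str]) -> int:
    """Count non-empty lines, treating each one as a JSONL entry."""
    if not content:
        return 0
    return sum(1 for line in content.splitlines() if line.strip())
```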
ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: fdf7d595-0544-42f1-ab58-50c30360b54f

📥 Commits

Reviewing files that changed from the base of the PR and between 6baeb11 and 152f54f.

📒 Files selected for processing (1)
  • autogpt_platform/backend/backend/copilot/sdk/service.py
📜 Review details
🧰 Additional context used
📓 Path-based instructions (4)
autogpt_platform/backend/**/*.py

📄 CodeRabbit inference engine (.github/copilot-instructions.md)

autogpt_platform/backend/**/*.py: Use Python 3.11 (required; managed by Poetry via pyproject.toml) for backend development
Always run 'poetry run format' (Black + isort) before linting in backend development
Always run 'poetry run lint' (ruff) after formatting in backend development

Files:

  • autogpt_platform/backend/backend/copilot/sdk/service.py
autogpt_platform/backend/**/*.{py,txt}

📄 CodeRabbit inference engine (autogpt_platform/backend/CLAUDE.md)

Use poetry run prefix for all Python commands, including testing, linting, formatting, and migrations

Files:

  • autogpt_platform/backend/backend/copilot/sdk/service.py
autogpt_platform/backend/backend/**/*.py

📄 CodeRabbit inference engine (autogpt_platform/backend/CLAUDE.md)

Use Prisma ORM for database operations in PostgreSQL with pgvector for embeddings

Files:

  • autogpt_platform/backend/backend/copilot/sdk/service.py
autogpt_platform/**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

Format Python code with poetry run format

Files:

  • autogpt_platform/backend/backend/copilot/sdk/service.py
🧠 Learnings (7)
📓 Common learnings
Learnt from: Pwuts
Repo: Significant-Gravitas/AutoGPT PR: 12284
File: autogpt_platform/frontend/src/app/api/openapi.json:11897-11900
Timestamp: 2026-03-04T23:58:18.476Z
Learning: Repo: Significant-Gravitas/AutoGPT — PR `#12284`
Backend/frontend OpenAPI codegen convention: In backend/api/features/store/model.py, the StoreSubmission and StoreSubmissionAdminView models define submitted_at: datetime | None, changes_summary: str | None, and instructions: str | None with no default. This is intentional to produce “required but nullable” fields in OpenAPI (properties appear in required[] and use anyOf [type, null]). This matches Prisma’s submittedAt DateTime? and changesSummary String?. Do not flag this as a required/nullable mismatch.
Learnt from: majdyz
Repo: Significant-Gravitas/AutoGPT PR: 12356
File: autogpt_platform/backend/backend/copilot/constants.py:9-12
Timestamp: 2026-03-10T08:39:13.707Z
Learning: In Significant-Gravitas/AutoGPT PR `#12356`, the `COPILOT_SYNTHETIC_ID_PREFIX = "copilot-"` check in `create_auto_approval_record` (human_review.py) is intentional and safe. The `graph_exec_id` passed to this function comes from server-side `PendingHumanReview` DB records (not from user input); the API only accepts `node_exec_id` from users. Synthetic `copilot-*` IDs are only ever created server-side in `run_block.py`. The prefix skip avoids a DB lookup for a `AgentGraphExecution` record that legitimately does not exist for CoPilot sessions, while `user_id` scoping is enforced at the auth layer and on the resulting auto-approval record.
📚 Learning: 2026-03-10T08:38:30.834Z
Learnt from: majdyz
Repo: Significant-Gravitas/AutoGPT PR: 12356
File: autogpt_platform/backend/backend/copilot/tools/run_block.py:349-370
Timestamp: 2026-03-10T08:38:30.834Z
Learning: In the AutoGPT CoPilot HITL (Human-In-The-Loop) flow (`autogpt_platform/backend/backend/copilot/tools/run_block.py`), the review card presented to users sets `editable: false`, meaning reviewers cannot modify the input payload. Therefore, credentials resolved before `is_block_exec_need_review()` remain valid and do not need to be recomputed after the review step — the original `input_data` is unchanged through the review lifecycle.

Applied to files:

  • autogpt_platform/backend/backend/copilot/sdk/service.py
📚 Learning: 2026-02-26T17:02:22.448Z
Learnt from: Pwuts
Repo: Significant-Gravitas/AutoGPT PR: 12211
File: .pre-commit-config.yaml:160-179
Timestamp: 2026-02-26T17:02:22.448Z
Learning: Keep the pre-commit hook pattern broad for autogpt_platform/backend to ensure OpenAPI schema changes are captured. Do not narrow to backend/api/ alone, since the generated schema depends on Pydantic models across multiple directories (backend/data/, backend/blocks/, backend/copilot/, backend/integrations/, backend/util/). Narrowing could miss schema changes and cause frontend type desynchronization.

Applied to files:

  • autogpt_platform/backend/backend/copilot/sdk/service.py
📚 Learning: 2026-03-04T08:04:35.881Z
Learnt from: majdyz
Repo: Significant-Gravitas/AutoGPT PR: 12273
File: autogpt_platform/backend/backend/copilot/tools/workspace_files.py:216-220
Timestamp: 2026-03-04T08:04:35.881Z
Learning: In the AutoGPT Copilot backend, ensure that SVG images are not treated as vision image types by excluding 'image/svg+xml' from INLINEABLE_MIME_TYPES and MULTIMODAL_TYPES in tool_adapter.py; the Claude API supports PNG, JPEG, GIF, and WebP for vision. SVGs (XML text) should be handled via the text path instead, not the vision path.

Applied to files:

  • autogpt_platform/backend/backend/copilot/sdk/service.py
📚 Learning: 2026-03-05T15:42:08.207Z
Learnt from: ntindle
Repo: Significant-Gravitas/AutoGPT PR: 12297
File: .claude/skills/backend-check/SKILL.md:14-16
Timestamp: 2026-03-05T15:42:08.207Z
Learning: In Python files under autogpt_platform/backend (recursively), rely on poetry run format to perform formatting (Black + isort) and linting (ruff). Do not run poetry run lint as a separate step after poetry run format, since format already includes linting checks.

Applied to files:

  • autogpt_platform/backend/backend/copilot/sdk/service.py
🔇 Additional comments (4)
autogpt_platform/backend/backend/copilot/sdk/service.py (4)

77-86: LGTM!

The new imports for delete_transcript and read_cli_session_file are correctly added and align with the transcript module's public API as shown in the relevant code snippets.


711-711: LGTM!

The skip_transcript_upload flag is correctly initialized. This addresses the previous review feedback about using a dedicated flag rather than reusing use_resume for controlling transcript uploads.


1051-1079: LGTM!

The error handling for "prompt is too long" is well-implemented:

  1. The skip_transcript_upload = True is correctly set before awaiting delete_transcript, addressing the previous review feedback about preventing re-upload on cancellation.
  2. The detection logic using "prompt is too long" in err_str.lower() is reasonable for catching the relevant API error.
  3. The delete_transcript call is wrapped in try/except to ensure deletion failures don't mask the original stream error.
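The three points above can be sketched as follows. This is only an illustration under stated assumptions: `handle_stream_error`, the `state` dict, and the zero-argument `delete_transcript` callable are hypothetical stand-ins, and the real code in `service.py` may be structured differently.

```python
import asyncio
import logging

logger = logging.getLogger(__name__)


def is_prompt_too_long(err: BaseException) -> bool:
    # Case-insensitive substring match, as described in point 2.
    return "prompt is too long" in str(err).lower()


async def handle_stream_error(err, state, delete_transcript) -> None:
    """Hypothetical handler; `delete_transcript` is any async callable."""
    if not is_prompt_too_long(err):
        return
    # Set the flag BEFORE awaiting, so a cancellation during the delete
    # cannot leave the stale transcript eligible for re-upload.
    state["skip_transcript_upload"] = True
    try:
        await delete_transcript()
    except Exception:
        # A failed delete must not mask the original stream error.
        logger.warning("Failed to delete transcript", exc_info=True)
```

Setting the flag first means the upload is skipped even if the deletion itself raises or is cancelled.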

1460-1486: LGTM — CLI session file preference correctly implemented.

The upload condition change addresses the previous review feedback: new sessions and sessions without stored transcripts will now upload correctly since use_resume is no longer in the gate condition.

The preference for CLI session file over TranscriptBuilder output is correct per the PR objectives — the CLI file reflects mid-stream compaction.

Comment thread autogpt_platform/backend/backend/copilot/sdk/transcript.py Outdated
Comment thread autogpt_platform/backend/backend/copilot/sdk/transcript.py Outdated
Comment thread autogpt_platform/backend/backend/copilot/sdk/service.py Outdated
Comment thread autogpt_platform/backend/backend/copilot/sdk/service.py Outdated
Comment thread autogpt_platform/backend/backend/copilot/sdk/transcript.py
…l protection

- Replace `import glob` with `pathlib.Path.glob()` in `read_cli_session_file`
- Add symlink path traversal validation on glob results using `is_relative_to`
- Add unit tests for `read_cli_session_file` and `_cli_project_dir`

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@autogpt_platform/backend/backend/copilot/sdk/transcript.py`:
- Around line 171-194: The scan currently can raise from p.resolve() or
os.path.getmtime causing read_cli_session_file to propagate errors; instead
iterate Path(project_dir).glob("*.jsonl") and for each candidate call
p.resolve() and p.resolve().is_relative_to(resolved_base) inside a try/except
that logs and skips problematic entries, collect safe paths and their mtimes
with os.path.getmtime guarded in try/except as well (e.g., build a list of
(path, mtime) only when both resolve and getmtime succeed), then pick the
max-safe entry as session_file; also ensure the file-open/read OSError already
caught returns None so no exception escapes. Use the existing symbols:
read_cli_session_file, resolved_base, jsonl_files/session_file, Path.resolve,
and os.path.getmtime when implementing these guarded checks.
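A guarded scan along these lines might look like the sketch below. It assumes the project directory holds `*.jsonl` session files and that the newest one is wanted; `pick_latest_session_file` is a hypothetical name, and the real `read_cli_session_file` also reads the file contents, which is omitted here.

```python
import os
from pathlib import Path
from typing import Optional


def pick_latest_session_file(project_dir: str) -> Optional[Path]:
    """Return the newest *.jsonl inside project_dir, or None.

    Hypothetical helper: candidates that fail to resolve, escape the
    directory via symlink, or cannot be stat'ed are skipped, not raised.
    """
    base = Path(project_dir)
    try:
        resolved_base = base.resolve()
    except (OSError, RuntimeError):
        return None
    candidates = []
    for p in base.glob("*.jsonl"):
        try:
            resolved = p.resolve()
            if not resolved.is_relative_to(resolved_base):
                continue  # symlink pointing outside the project dir
            mtime = os.path.getmtime(resolved)
        except (OSError, RuntimeError):
            continue  # unreadable or vanished entry: skip it
        candidates.append((mtime, resolved))
    if not candidates:
        return None
    return max(candidates, key=lambda c: c[0])[1]
```

Collecting `(mtime, path)` pairs only after both calls succeed keeps a single bad entry from aborting the whole scan.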

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 455b26ad-7246-444f-8e70-a170ef4661b2

📥 Commits

Reviewing files that changed from the base of the PR and between 152f54f and a1f3431.

📒 Files selected for processing (2)
  • autogpt_platform/backend/backend/copilot/sdk/transcript.py
  • autogpt_platform/backend/backend/copilot/sdk/transcript_test.py
📜 Review details
🧰 Additional context used
📓 Path-based instructions (6)
autogpt_platform/backend/**/*.py

📄 CodeRabbit inference engine (.github/copilot-instructions.md)

autogpt_platform/backend/**/*.py: Use Python 3.11 (required; managed by Poetry via pyproject.toml) for backend development
Always run 'poetry run format' (Black + isort) before linting in backend development
Always run 'poetry run lint' (ruff) after formatting in backend development

Files:

  • autogpt_platform/backend/backend/copilot/sdk/transcript_test.py
  • autogpt_platform/backend/backend/copilot/sdk/transcript.py
autogpt_platform/backend/**/*.{py,txt}

📄 CodeRabbit inference engine (autogpt_platform/backend/CLAUDE.md)

Use poetry run prefix for all Python commands, including testing, linting, formatting, and migrations

Files:

  • autogpt_platform/backend/backend/copilot/sdk/transcript_test.py
  • autogpt_platform/backend/backend/copilot/sdk/transcript.py
autogpt_platform/backend/**/*_test.py

📄 CodeRabbit inference engine (autogpt_platform/backend/CLAUDE.md)

autogpt_platform/backend/**/*_test.py: Always review snapshot changes with git diff before committing when updating snapshots with poetry run pytest --snapshot-update
Use pytest with snapshot testing for API responses in test files
Colocate test files with source files using the *_test.py naming convention

Files:

  • autogpt_platform/backend/backend/copilot/sdk/transcript_test.py
autogpt_platform/backend/backend/**/*.py

📄 CodeRabbit inference engine (autogpt_platform/backend/CLAUDE.md)

Use Prisma ORM for database operations in PostgreSQL with pgvector for embeddings

Files:

  • autogpt_platform/backend/backend/copilot/sdk/transcript_test.py
  • autogpt_platform/backend/backend/copilot/sdk/transcript.py
autogpt_platform/**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

Format Python code with poetry run format

Files:

  • autogpt_platform/backend/backend/copilot/sdk/transcript_test.py
  • autogpt_platform/backend/backend/copilot/sdk/transcript.py
autogpt_platform/backend/**/*test*.py

📄 CodeRabbit inference engine (AGENTS.md)

Run poetry run test for backend testing (runs pytest with docker based postgres + prisma)

Files:

  • autogpt_platform/backend/backend/copilot/sdk/transcript_test.py
🧠 Learnings (7)
📓 Common learnings
Learnt from: Pwuts
Repo: Significant-Gravitas/AutoGPT PR: 12284
File: autogpt_platform/frontend/src/app/api/openapi.json:11897-11900
Timestamp: 2026-03-04T23:58:18.476Z
Learning: Repo: Significant-Gravitas/AutoGPT — PR `#12284`
Backend/frontend OpenAPI codegen convention: In backend/api/features/store/model.py, the StoreSubmission and StoreSubmissionAdminView models define submitted_at: datetime | None, changes_summary: str | None, and instructions: str | None with no default. This is intentional to produce “required but nullable” fields in OpenAPI (properties appear in required[] and use anyOf [type, null]). This matches Prisma’s submittedAt DateTime? and changesSummary String?. Do not flag this as a required/nullable mismatch.
Learnt from: majdyz
Repo: Significant-Gravitas/AutoGPT PR: 12213
File: autogpt_platform/frontend/src/app/api/openapi.json:9983-9995
Timestamp: 2026-02-27T15:59:00.370Z
Learning: Repo: Significant-Gravitas/AutoGPT PR: 12213 — Backend/frontend OpenAPI codegen
Learning: For MCP schema models, required OpenAPI fields must have no defaults in Pydantic. Specifically, MCPToolInfo.input_schema must be required (no Field(default_factory=dict)) so openapi.json emits it in "required", ensuring generated TS types treat input_schema as non-optional.
Learnt from: majdyz
Repo: Significant-Gravitas/AutoGPT PR: 12356
File: autogpt_platform/backend/backend/copilot/constants.py:9-12
Timestamp: 2026-03-10T08:39:13.707Z
Learning: In Significant-Gravitas/AutoGPT PR `#12356`, the `COPILOT_SYNTHETIC_ID_PREFIX = "copilot-"` check in `create_auto_approval_record` (human_review.py) is intentional and safe. The `graph_exec_id` passed to this function comes from server-side `PendingHumanReview` DB records (not from user input); the API only accepts `node_exec_id` from users. Synthetic `copilot-*` IDs are only ever created server-side in `run_block.py`. The prefix skip avoids a DB lookup for a `AgentGraphExecution` record that legitimately does not exist for CoPilot sessions, while `user_id` scoping is enforced at the auth layer and on the resulting auto-approval record.
Learnt from: majdyz
Repo: Significant-Gravitas/AutoGPT PR: 12213
File: autogpt_platform/frontend/src/app/api/openapi.json:9983-9995
Timestamp: 2026-02-27T15:59:00.370Z
Learning: Repo: Significant-Gravitas/AutoGPT PR: 12213 — OpenAPI/codegen
Learning: Ensuring a field is required in generated TS types needs two sides: (1) no default value on the Pydantic field, and (2) the OpenAPI model's "required" array must list it. For MCPToolInfo, making input_schema required in OpenAPI and removing Field(default_factory=dict) in the backend prevents optional typing drift.
📚 Learning: 2026-02-04T16:49:42.490Z
Learnt from: CR
Repo: Significant-Gravitas/AutoGPT PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2026-02-04T16:49:42.490Z
Learning: Applies to autogpt_platform/backend/**/test/**/*.py : Use snapshot testing with '--snapshot-update' flag in backend tests when output changes; always review with 'git diff'

Applied to files:

  • autogpt_platform/backend/backend/copilot/sdk/transcript_test.py
📚 Learning: 2026-02-04T16:50:20.508Z
Learnt from: CR
Repo: Significant-Gravitas/AutoGPT PR: 0
File: autogpt_platform/backend/CLAUDE.md:0-0
Timestamp: 2026-02-04T16:50:20.508Z
Learning: Applies to autogpt_platform/backend/**/*_test.py : Always review snapshot changes with `git diff` before committing when updating snapshots with `poetry run pytest --snapshot-update`

Applied to files:

  • autogpt_platform/backend/backend/copilot/sdk/transcript_test.py
📚 Learning: 2026-02-04T16:50:20.508Z
Learnt from: CR
Repo: Significant-Gravitas/AutoGPT PR: 0
File: autogpt_platform/backend/CLAUDE.md:0-0
Timestamp: 2026-02-04T16:50:20.508Z
Learning: Applies to autogpt_platform/backend/**/*_test.py : Colocate test files with source files using the `*_test.py` naming convention

Applied to files:

  • autogpt_platform/backend/backend/copilot/sdk/transcript_test.py
📚 Learning: 2026-02-26T17:02:22.448Z
Learnt from: Pwuts
Repo: Significant-Gravitas/AutoGPT PR: 12211
File: .pre-commit-config.yaml:160-179
Timestamp: 2026-02-26T17:02:22.448Z
Learning: Keep the pre-commit hook pattern broad for autogpt_platform/backend to ensure OpenAPI schema changes are captured. Do not narrow to backend/api/ alone, since the generated schema depends on Pydantic models across multiple directories (backend/data/, backend/blocks/, backend/copilot/, backend/integrations/, backend/util/). Narrowing could miss schema changes and cause frontend type desynchronization.

Applied to files:

  • autogpt_platform/backend/backend/copilot/sdk/transcript_test.py
  • autogpt_platform/backend/backend/copilot/sdk/transcript.py
📚 Learning: 2026-03-04T08:04:35.881Z
Learnt from: majdyz
Repo: Significant-Gravitas/AutoGPT PR: 12273
File: autogpt_platform/backend/backend/copilot/tools/workspace_files.py:216-220
Timestamp: 2026-03-04T08:04:35.881Z
Learning: In the AutoGPT Copilot backend, ensure that SVG images are not treated as vision image types by excluding 'image/svg+xml' from INLINEABLE_MIME_TYPES and MULTIMODAL_TYPES in tool_adapter.py; the Claude API supports PNG, JPEG, GIF, and WebP for vision. SVGs (XML text) should be handled via the text path instead, not the vision path.

Applied to files:

  • autogpt_platform/backend/backend/copilot/sdk/transcript_test.py
  • autogpt_platform/backend/backend/copilot/sdk/transcript.py
📚 Learning: 2026-03-05T15:42:08.207Z
Learnt from: ntindle
Repo: Significant-Gravitas/AutoGPT PR: 12297
File: .claude/skills/backend-check/SKILL.md:14-16
Timestamp: 2026-03-05T15:42:08.207Z
Learning: In Python files under autogpt_platform/backend (recursively), rely on poetry run format to perform formatting (Black + isort) and linting (ruff). Do not run poetry run lint as a separate step after poetry run format, since format already includes linting checks.

Applied to files:

  • autogpt_platform/backend/backend/copilot/sdk/transcript_test.py
  • autogpt_platform/backend/backend/copilot/sdk/transcript.py
🔇 Additional comments (2)
autogpt_platform/backend/backend/copilot/sdk/transcript_test.py (1)

292-371: Good coverage for the new CLI transcript safety paths.

These tests validate the key security/correctness paths (None on missing files, happy-path read, symlink escape rejection, and _cli_project_dir traversal guard) and match the implementation intent.

autogpt_platform/backend/backend/copilot/sdk/transcript.py (1)

141-155: Nice hardening on project-dir containment and cleanup reuse.

The _cli_project_dir escape check plus reuse inside cleanup_cli_project_dir is a clean security-focused refactor.

Also applies to: 198-209

Comment thread autogpt_platform/backend/backend/copilot/sdk/transcript.py Outdated
…n_file

Wrap p.resolve() and stat() calls in try/except to prevent unhandled
OSError/RuntimeError from propagating and silently dropping transcript
uploads when the caller falls back to TranscriptBuilder.
Comment thread autogpt_platform/backend/backend/copilot/sdk/service.py Outdated
Comment thread autogpt_platform/backend/backend/copilot/sdk/service.py Outdated
Comment thread autogpt_platform/backend/backend/copilot/sdk/service.py Outdated
Comment thread autogpt_platform/backend/backend/copilot/sdk/service.py Outdated
Comment thread autogpt_platform/backend/backend/copilot/sdk/transcript_test.py Outdated
Comment thread autogpt_platform/backend/backend/copilot/sdk/transcript.py Outdated
@majdyz

majdyz commented Mar 13, 2026

Addressed all 4 review items in 8f0f6ce:

  1. Multiple compactions bug: compaction_just_ended is now a one-shot flag that resets after being read, so subsequent mid-stream compactions trigger replace_entries correctly. Simplified service.py detection logic accordingly.

  2. Path validation on transcript_path: read_compacted_entries() now validates that transcript_path is within the CLI projects directory (same pattern as _cli_project_dir).

  3. Tests for new functions: Added tests for read_compacted_entries (7 cases), TranscriptBuilder.replace_entries (4 cases), and TranscriptBuilder.load_previous with compacted content (2 cases).

  4. _transcript_path not reset: Added self._transcript_path = "" to reset_for_query().

Also: preserved isCompactSummary entries during STRIPPABLE_TYPES filtering in both load_previous and replace_entries.

@majdyz majdyz enabled auto-merge March 13, 2026 16:50

@majdyz majdyz left a comment


🤖 Critical Review — PR #12401

Thorough line-by-line review of the transcript compaction sync logic. 12 findings posted as inline comments.

Summary of HIGH severity issues:

  1. compaction_just_ended property with side effects: a mutating @property creates a TOCTOU race and fragile ordering dependencies
  2. First-compaction-summary bug: read_compacted_entries finds the FIRST isCompactSummary but should find the LAST (multi-compaction scenario)
  3. Race between emit_end_if_ready and compaction_just_ended: the asyncio.sleep(0) yield creates a window for state corruption
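One way to avoid a property that mutates state on read is to have the method that ends a compaction return its outcome as a value. The sketch below is a simplified, synchronous illustration of that idea; the field names and the `CompactionTrackerSketch` class are assumptions, not the PR's actual `CompactionTracker`.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CompactionResult:
    # Field names are assumptions made for this sketch.
    just_ended: bool = False
    transcript_path: str = ""


class CompactionTrackerSketch:
    """Hypothetical, simplified stand-in for a compaction tracker."""

    def __init__(self) -> None:
        self._active = False
        self._transcript_path = ""

    def on_compact(self, transcript_path: str = "") -> None:
        self._active = True
        self._transcript_path = transcript_path

    def emit_end_if_ready(self) -> CompactionResult:
        if not self._active:
            return CompactionResult()
        # Capture the outcome and reset state in one step, so the caller
        # never needs a separate read that could observe stale state.
        result = CompactionResult(True, self._transcript_path)
        self._active = False
        self._transcript_path = ""
        return result
```

Because the caller receives an immutable result, there is no second read whose ordering relative to other calls matters, and a later compaction in the same query starts from clean state.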

Not inline (lines not in diff):

  • LOW (transcript.py:87): strip_progress_entries strips type: "summary" entries but does NOT have the isCompactSummary guard. After compaction + upload, the compacted summary is stripped and next turn's --resume gets incomplete transcript. Add: if entry.get("type", "") in STRIPPABLE_TYPES and not entry.get("isCompactSummary")
  • LOW (compaction.py:225): emit_start_if_ready checks not self._done but compaction_just_ended resets _done as a side effect. Reading the property before emit_start_if_ready inadvertently unblocks start emission — implicit ordering dependency.
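The guard proposed in the first bullet can be sketched in isolation. This assumes a JSONL transcript and that `STRIPPABLE_TYPES` contains at least `"progress"` and `"summary"`; the set contents and the `strip_entries_sketch` name are illustrative assumptions, and the real `strip_progress_entries` may do more than filter.

```python
import json

# Assumed contents; the PR only confirms that "summary" is strippable.
STRIPPABLE_TYPES = frozenset({"progress", "summary"})


def strip_entries_sketch(content: str) -> str:
    """Drop strippable entries but keep compaction summaries."""
    kept = []
    for line in content.splitlines():
        if not line.strip():
            continue
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            kept.append(line)  # leave unparseable lines untouched
            continue
        if (
            isinstance(entry, dict)
            and entry.get("type", "") in STRIPPABLE_TYPES
            and not entry.get("isCompactSummary")
        ):
            continue  # ordinary summary/progress entry: strip it
        kept.append(line)
    return "\n".join(kept)
```

With the extra `isCompactSummary` check, an entry with `type: "summary"` survives only when it is the compaction summary that `--resume` depends on.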

@majdyz majdyz changed the title from "fix(backend/copilot): use CLI session file for transcript upload to preserve compaction" to "fix(backend/copilot): sync TranscriptBuilder with CLI on mid-stream compaction" Mar 13, 2026

@majdyz majdyz left a comment


🤖 Critical Review — PR #12401

Thorough review of the transcript compaction sync logic. 10 findings below.

Not inline (lines not in diff):

  • LOW (transcript.py:87): strip_progress_entries strips type: "summary" but does NOT guard isCompactSummary. After compaction + upload, the compacted summary is stripped and next turn's --resume gets an incomplete transcript.
  • LOW (compaction.py:225): emit_start_if_ready checks not self._done but compaction_just_ended resets _done as a side effect. Reading the property before emit_start_if_ready inadvertently unblocks start emission.

Comment thread autogpt_platform/backend/backend/copilot/sdk/compaction.py Outdated
Comment thread autogpt_platform/backend/backend/copilot/sdk/compaction.py
Comment thread autogpt_platform/backend/backend/copilot/sdk/transcript.py Outdated
Comment thread autogpt_platform/backend/backend/copilot/sdk/transcript.py
Comment thread autogpt_platform/backend/backend/copilot/sdk/service.py Outdated
Comment thread autogpt_platform/backend/backend/copilot/sdk/service.py Outdated
Comment thread autogpt_platform/backend/backend/copilot/sdk/transcript_builder.py Outdated
Comment thread autogpt_platform/backend/backend/copilot/sdk/transcript.py Outdated
Comment thread autogpt_platform/backend/backend/copilot/sdk/security_hooks.py Outdated
Comment thread autogpt_platform/backend/backend/copilot/sdk/transcript_builder.py Outdated
@majdyz majdyz disabled auto-merge March 13, 2026 16:59
… last-summary, dedup helpers

- Replace side-effect @property with CompactionResult dataclass for TOCTOU safety
- Use build-then-swap in replace_entries to prevent data loss on corrupt input
- Fix read_compacted_entries to use LAST isCompactSummary (not first)
- Guard isCompactSummary in strip_progress_entries and TranscriptBuilder
- Extract _projects_base(), _build_path_from_parts(), _build_meta_storage_path()
- Extract TranscriptBuilder._parse_entry() static method
- Sanitize transcript_path in pre_compact_hook before logging
- Update compaction_test.py and transcript_test.py for new behaviors
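The last-summary selection mentioned in this commit can be illustrated with a small sketch. It assumes the session file is JSONL and that compaction summaries carry `isCompactSummary: true`; `entries_from_last_compact_summary` is a hypothetical name, and path validation is left out.

```python
import json


def entries_from_last_compact_summary(jsonl_text: str) -> list:
    """Return the LAST compact summary and everything after it.

    Hypothetical helper: returns an empty list when no summary exists.
    """
    entries = []
    for line in jsonl_text.splitlines():
        line = line.strip()
        if not line:
            continue
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip malformed lines
        if isinstance(entry, dict):
            entries.append(entry)
    last_idx = -1
    for i, entry in enumerate(entries):
        if entry.get("isCompactSummary"):
            last_idx = i  # keep overwriting so the final match wins
    return entries[last_idx:] if last_idx >= 0 else []
```

Selecting the first summary instead would re-include content that a later compaction had already folded away.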
@majdyz

majdyz commented Mar 13, 2026

Addressed all remaining review items in 1023134:

Architecture:

  1. CompactionResult dataclass: Replaced the side-effect @property (compaction_just_ended) with an atomic CompactionResult returned from emit_end_if_ready, eliminating the TOCTOU race
  2. Build-then-swap in replace_entries: Builds the new entry list first, validates that it is non-empty, then swaps. Corrupt input cannot wipe conversation history

Bug fixes:

  3. Last-summary selection: read_compacted_entries now finds the LAST isCompactSummary (not the first), matching CLI behavior for multiple compactions
  4. isCompactSummary guard: strip_progress_entries and TranscriptBuilder._parse_entry now preserve compaction summaries that would otherwise be stripped as "summary" type

Dedup / code quality:

  5. Extracted helpers: _projects_base(), _build_path_from_parts(), _build_meta_storage_path(), TranscriptBuilder._parse_entry()
  6. Log injection hardening: Sanitized transcript_path in pre_compact_hook before logging
  7. Overwrite warning: on_compact() logs a warning when transcript_path is overwritten

Tests:

  8. Updated compaction_test.py for the CompactionResult pattern, added transcript_path and multi-compaction tests
  9. Updated transcript_test.py for last-summary and build-then-swap behaviors
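The build-then-swap behavior described in this comment can be sketched as follows. The class, the boolean return value, and the validation rule in `_parse_entry` are assumptions for illustration; only the idea of building the new list before replacing the old one comes from the comment.

```python
from typing import Optional


class TranscriptBuilderSketch:
    """Hypothetical, reduced stand-in for TranscriptBuilder."""

    def __init__(self) -> None:
        self._entries = []

    @property
    def entry_count(self) -> int:
        return len(self._entries)

    @staticmethod
    def _parse_entry(raw) -> Optional[dict]:
        # Assumed rule: an entry must be a dict with a "type" key.
        if not isinstance(raw, dict) or "type" not in raw:
            return None
        return dict(raw)

    def replace_entries(self, raw_entries) -> bool:
        # Build the replacement list first, without touching self._entries.
        parsed = (self._parse_entry(raw) for raw in raw_entries)
        new_entries = [entry for entry in parsed if entry is not None]
        if not new_entries:
            return False  # corrupt or empty input: keep existing history
        self._entries = new_entries  # single assignment acts as the swap
        return True
```

Clearing the list before parsing would lose the conversation whenever the compacted input turned out to be unusable; the swap only happens once a non-empty replacement exists.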

Comment thread autogpt_platform/backend/backend/copilot/sdk/compaction.py
Add isCompactSummary field to TranscriptEntry model so compaction
summaries survive the export→load_previous roundtrip. Without this,
exported summaries with type="summary" were stripped on reload since
"summary" is in STRIPPABLE_TYPES.

Also add integration tests simulating the full compaction flow
(load → append → compact → replace → export → reload).
@majdyz majdyz enabled auto-merge March 13, 2026 17:40
Comment thread autogpt_platform/backend/backend/copilot/sdk/service.py
majdyz added 2 commits March 14, 2026 01:17
…stream compaction

When compaction ends, replace_entries loads the CLI session file which
already contains the current message. Skip append_assistant and
append_tool_result for that iteration to avoid duplicates that could
cause 'prompt is too long' errors on subsequent turns.
Exercises the full service.py compaction flow end-to-end:
TranscriptBuilder load → CompactionTracker state machine →
read_compacted_entries → replace_entries → export → roundtrip.
@majdyz majdyz added this pull request to the merge queue Mar 13, 2026
@github-project-automation github-project-automation Bot moved this from 🆕 Needs initial review to 👍🏼 Mergeable in AutoGPT development kanban Mar 13, 2026
Merged via the queue into dev with commit cfe22e5 Mar 13, 2026
20 checks passed
@majdyz majdyz deleted the fix/copilot-transcript-compaction branch March 13, 2026 22:35
@github-project-automation github-project-automation Bot moved this from 👍🏼 Mergeable to ✅ Done in AutoGPT development kanban Mar 13, 2026
Bentlybro pushed a commit that referenced this pull request Apr 4, 2026
…ompaction (#12401)

## Summary
- **Root cause**: `TranscriptBuilder` accumulates all raw SDK stream
messages including pre-compaction content. When the CLI compacts
mid-stream, the uploaded transcript was still uncompacted, causing
"Prompt is too long" errors on the next `--resume` turn.
- **Fix**: Detect mid-stream compaction via the `PreCompact` hook, read
the CLI's session file to get the compacted entries (summary +
post-compaction messages), and call
`TranscriptBuilder.replace_entries()` to sync it with the CLI's active
context. This ensures the uploaded transcript always matches what the
CLI sees.
- **Key changes**:
- `CompactionTracker`: stores `transcript_path` from `PreCompact` hook,
one-shot `compaction_just_ended` flag that correctly resets for multiple
compactions
- `read_compacted_entries()`: reads CLI session JSONL, finds
`isCompactSummary: true` entry, returns it + all entries after. Includes
path validation against the CLI projects directory.
- `TranscriptBuilder.replace_entries()`: clears and replaces all entries
with compacted ones, preserving `isCompactSummary` entries (which have
`type: "summary"` that would normally be stripped)
- `load_previous()`: also preserves `isCompactSummary` entries when
loading a previously compacted transcript
- Service stream loop: after compaction ends, reads compacted entries
and syncs TranscriptBuilder

## Test plan
- [x] 69 tests pass across `compaction_test.py` and `transcript_test.py`
- [x] Tests cover: one-shot flag behavior, multiple compactions within a
query, transcript path storage, path traversal rejection,
`read_compacted_entries` (7 tests), `replace_entries` (4 tests),
`load_previous` with compacted content (2 tests)
- [x] Pre-commit hooks pass (lint, format, typecheck)
- [ ] Manual test: trigger compaction in a multi-turn session and verify
the uploaded transcript reflects compaction
Bentlybro pushed a commit to Bentlybro/AutoGPT that referenced this pull request Apr 4, 2026
…ompaction (Significant-Gravitas#12401)


Labels

platform/backend AutoGPT Platform - Back end size/l size/xl

Projects

Status: ✅ Done


2 participants