fix(backend/copilot): prevent duplicate block execution from pre-launch arg mismatch by majdyz · Pull Request #12632 · Significant-Gravitas/AutoGPT

majdyz · 2026-03-31T16:37:54Z

Why

CoPilot sessions are duplicating Linear tickets and GitHub PRs. An investigation of 5 production sessions (March 31st) found that 3 of the 5 created duplicate Linear issues. Each pair had consecutive IDs and the exact same timestamp, yet only one of the two was visible in Langfuse traces.

Production gcloud logs confirm this: 279 arg-mismatch warnings per day, 37 duplicate block-execution pairs, and LinearCreateIssueBlock failures that always occur in pairs.

Related: SECRT-2204

What

Replace the speculative pre-launch mechanism with the SDK's native parallel dispatch via readOnlyHint tool annotations. Remove ~580 lines of pre-launch infrastructure code.

How

Root cause

The pre-launch mechanism had three compounding bugs:

- Arg mismatch: The SDK CLI normalises args between the AssistantMessage (used for pre-launch) and the MCP tools/call dispatch, causing frequent mismatches (279/day in prod).
- FIFO desync on denial: Security hooks can deny tool calls, causing the CLI to skip the MCP dispatch, but the pre-launched task stays in the FIFO queue and misaligns all subsequent matches.
- Cancel race: task.cancel() is best-effort in asyncio. If the HTTP call to Linear/GitHub has already completed, the side effect is irreversible.
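The cancel race can be reproduced in isolation. The sketch below is a minimal illustration, not code from this PR: `create_issue`, `launch_then_cancel`, and the `created_issues` list are hypothetical stand-ins for a side-effecting HTTP call and the remote system's state. It shows that cancelling a task after the remote write has landed only stops the local coroutine, so a later direct execution produces a second issue.

```python
import asyncio

# Stand-in for state held by a remote system such as an issue tracker.
created_issues: list[str] = []


async def create_issue(title: str, committed: asyncio.Event) -> str:
    """Hypothetical side-effecting call, standing in for an HTTP POST."""
    await asyncio.sleep(0)          # request goes out
    created_issues.append(title)    # the remote side effect is now committed
    committed.set()
    await asyncio.sleep(0.05)       # still waiting for the response
    return f"ISSUE-{len(created_issues)}"


async def launch_then_cancel(title: str) -> bool:
    """Start the call speculatively, then cancel once the write has landed."""
    committed = asyncio.Event()
    task = asyncio.create_task(create_issue(title, committed))
    await committed.wait()
    task.cancel()                   # only interrupts the local coroutine
    try:
        await task
    except asyncio.CancelledError:
        pass
    return task.cancelled()


async def demo() -> tuple[bool, list[str]]:
    created_issues.clear()
    cancelled = await launch_then_cancel("Fix login bug")
    # A follow-up direct execution of the same call creates a duplicate.
    await create_issue("Fix login bug", asyncio.Event())
    return cancelled, list(created_issues)
```

Running `asyncio.run(demo())` reports the task as cancelled even though the issue list already holds the first entry, which is the irreversibility described above.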

Fix

- Removed pre_launch_tool_call(), cancel_pending_tool_tasks(), the _tool_task_queues ContextVar, all FIFO queue logic, and all 4 cancel_pending_tool_tasks() calls in service.py.
- Added readOnlyHint=True annotations on 15+ read-only tools (find_block, search_docs, list_workspace_files, etc.); the SDK CLI natively dispatches these in parallel (ref: anthropics/claude-code#14353).
- Side-effect tools (run_block, bash_exec, create_agent, etc.) have no annotation, so the CLI runs them sequentially and there is no risk of duplicate execution.
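The intended scheduling behaviour can be pictured with a small model. This is only a sketch under assumptions: the PR does not show the SDK CLI's actual scheduler, and `ToolCall`, `make_tool`, and `dispatch` are hypothetical names. It batches consecutive read-only calls and runs them concurrently, while anything else runs alone and in order.

```python
import asyncio
from dataclasses import dataclass
from typing import Awaitable, Callable


@dataclass(frozen=True)
class ToolCall:
    """Hypothetical description of one tool call."""
    name: str
    read_only: bool
    run: Callable[[], Awaitable[str]]


def make_tool(name: str, read_only: bool, log: list[str]) -> ToolCall:
    """Build a fake tool that records when it starts and finishes."""
    async def run() -> str:
        log.append(f"start:{name}")
        await asyncio.sleep(0)      # yield so concurrent calls can interleave
        log.append(f"end:{name}")
        return name

    return ToolCall(name=name, read_only=read_only, run=run)


async def dispatch(calls: list[ToolCall]) -> list[str]:
    """Run consecutive read-only calls concurrently; run the rest one by one."""
    results: list[str] = []
    batch: list[ToolCall] = []

    async def flush() -> None:
        if batch:
            results.extend(await asyncio.gather(*(c.run() for c in batch)))
            batch.clear()

    for call in calls:
        if call.read_only:
            batch.append(call)
        else:
            await flush()           # finish pending reads before a write
            results.append(await call.run())
    await flush()
    return results
```

With two read-only calls followed by `run_block`, the log shows the reads overlapping and the side-effect call starting only after both reads have finished.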

Net change: -578 lines, +105 lines

…ch arg mismatch

The pre-launch mechanism speculatively starts tool execution when an AssistantMessage arrives, then matches results to SDK MCP dispatches via FIFO queue. A strict arg equality check discarded pre-launched results when the SDK CLI normalised args (e.g. injecting schema defaults), causing duplicate execution of blocks with side effects like LinearCreateIssueBlock and GithubCreatePullRequestBlock.

Trust FIFO ordering instead: the SDK dispatches MCP tool calls in the same order as ToolUseBlocks in the AssistantMessage. Log at debug level when args differ for observability.
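To make the mismatch concrete, here is a simplified model of how injecting schema defaults defeats a strict equality check. It is a sketch, not the adapter's code: `SCHEMA`, its `dry_run` field, `inject_schema_defaults`, and `executions_with_strict_check` are all invented for illustration.

```python
from typing import Any

# Hypothetical input schema; "dry_run" is an invented field with a default.
SCHEMA: dict[str, Any] = {
    "properties": {
        "block_id": {"type": "string"},
        "dry_run": {"type": "boolean", "default": False},
    }
}


def inject_schema_defaults(args: dict[str, Any], schema: dict[str, Any]) -> dict[str, Any]:
    """Return a copy of args with schema defaults filled in for missing keys."""
    normalised = dict(args)
    for key, prop in schema.get("properties", {}).items():
        if key not in normalised and "default" in prop:
            normalised[key] = prop["default"]
    return normalised


def executions_with_strict_check(model_args: dict[str, Any], schema: dict[str, Any]) -> int:
    """Count how often a block would run when mismatched args trigger a re-run."""
    executions = 1                                  # speculative pre-launch
    dispatched_args = inject_schema_defaults(model_args, schema)
    if dispatched_args != model_args:               # strict equality fails
        executions += 1                             # handler executes again
    return executions
```

In this model, args that omit `dry_run` lead to two executions, while args that already spell out the default lead to one.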

coderabbitai · 2026-03-31T16:38:25Z

Walkthrough

Removed speculative per-session tool pre-launch and queueing; the tool handler now always invokes the synchronous execution path. Added read-only annotations (ToolAnnotations(readOnlyHint=True) / BaseTool.read_only) for selected tools and the SDK Read tool. The service no longer cancels previously pre-launched tasks and no longer pre-launches per tool.

Changes

| Cohort / File(s) | Summary |
|---|---|
| Tool adapter logic `autogpt_platform/backend/backend/copilot/sdk/tool_adapter.py` | Removed ContextVar `_tool_task_queues`, `set_execution_context`, `pre_launch_tool_call`, `cancel_pending_tool_tasks`. `create_tool_handler().tool_handler` no longer dequeues/validates pre-launched tasks and always calls `_execute_tool_sync(...)`. Added `_READONLY_ANNOTATION` and `_READ_ONLY_E2B_TOOLS` and register readOnly annotations for applicable tools (including SDK `Read` and E2B tools). |
| Service coordination `autogpt_platform/backend/backend/copilot/sdk/service.py` | Removed imports and calls to `cancel_pending_tool_tasks()`; stopped per-tool pre-launching on incoming AssistantMessage. Retained `is_tool_only` computation for flush/stash-wait logic; updated comments to reflect delegation of parallel-read behavior to SDK annotations. |
| Tool base and tools `autogpt_platform/backend/backend/copilot/tools/base.py`, `.../tools/*` | Added `BaseTool.read_only` property (defaults to False). Many tools (browser screenshot, agent output, search/docs, find_, get_, web_fetch, workspace readers, etc.) now expose `read_only` properties returning True to mark them non-mutating. |
| Tests `autogpt_platform/backend/backend/copilot/sdk/tool_adapter_test.py` | Removed tests covering pre-launch/parallel-dispatch and related helpers; narrowed tests to direct execution semantics (single run, exception → MCP error, missing session). Added assertions verifying read-only classification and `_READ_ONLY_E2B_TOOLS` contents. |

Sequence Diagram(s)

sequenceDiagram
    participant Client
    participant Service
    participant ToolAdapter as CopilotAdapter
    participant Executor as ToolExecutor

    Client->>Service: send assistant message / tool request
    Service->>ToolAdapter: route to tool handler
    ToolAdapter->>ToolAdapter: determine tool and annotations (readOnlyHint)
    ToolAdapter->>Executor: call _execute_tool_sync(...)
    Executor-->>ToolAdapter: return result / raise exception
    ToolAdapter-->>Service: return MCP result or _mcp_error
    Service-->>Client: deliver response

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

feat(copilot): Execute parallel tool calls concurrently #12165 — directly overlaps copilot parallel-execution plumbing and presents an alternative concurrent execution approach.
feat(copilot): SDK tool output, transcript resume, stream reconnection, GenericTool UI #12159 — touches tool_adapter.py parallel/queueing and tool-output coordination; likely to conflict with removed pre-launch APIs.
feat(copilot): E2B cloud sandbox — unified file tools, persistent execution, output truncation #12212 — modifies set_execution_context/tool registration flow and is related to ContextVar lifecycle and read-only registration changes.

Suggested reviewers

0ubbe
ntindle
Pwuts
Swiftyos

Poem

🐰 I shelved the hopeful pre-launch race,
Tasks now meet their moment, face to face,
Read-only tools wear a gentle sign,
Ready for parallel, but by design,
I hop along, content with steadier pace.

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 35.71% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
|---|---|---|
| Title check | ✅ Passed | The title accurately describes the main change: removing the pre-launch mechanism to prevent duplicate block execution caused by arg mismatch, which aligns with the core purpose of the PR. |
| Description check | ✅ Passed | The description clearly explains the why (duplicate production issues), what (replace pre-launch with readOnlyHint annotations), and how (root causes and fixes), directly corresponding to the changeset. |


majdyz · 2026-03-31T16:40:53Z

🤖 🔵 Nit: The pre_launch_tool_call docstring (lines 282-288) still references the (task, args) tuple pattern and ordering mismatch detection. Should be updated to reflect the simpler task-only queue — the handler no longer checks args ordering.

Address self-review: remove the dead arg comparison and simplify the queue from (task, args) tuples to plain tasks. The handler trusts FIFO ordering unconditionally — no args are stored or compared. Update docstring and type alias accordingly.

coderabbitai

🧹 Nitpick comments (1)

autogpt_platform/backend/backend/copilot/sdk/tool_adapter_test.py (1)

522-544: Make this test prove which args actually executed.

With a constant mock result, this still passes if the handler skips the queued task and performs one direct call with "normalised" args. Make the return value depend on block_id, or assert mock_tool.execute.await_args.kwargs["block_id"] == "original", so the test really locks in the pre-launched path.

💡 One way to pin the pre-launched call

-        mock_tool = _make_mock_tool("run_block", output="pre-launched-result")
+        async def execute_with_block_id(*_args, **kwargs):
+            block_id = kwargs["block_id"]
+            return StreamToolOutputAvailable(
+                toolCallId="test-id",
+                output=f"result-for-{block_id}",
+                toolName="run_block",
+                success=True,
+            )
+
+        mock_tool = _make_mock_tool("run_block")
+        mock_tool.execute = AsyncMock(side_effect=execute_with_block_id)
@@
-        assert "pre-launched-result" in result["content"][0]["text"]
+        assert "result-for-original" in result["content"][0]["text"]
+        assert mock_tool.execute.await_args.kwargs["block_id"] == "original"
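The technique in this suggestion, making the mock's output depend on its input and then inspecting the recorded call, can be tried on its own. The following is a self-contained sketch: `run_once` is a hypothetical stand-in for the handler under test, and the plain string return replaces the `StreamToolOutputAvailable` object used in the real suite.

```python
import asyncio
from unittest.mock import AsyncMock


async def execute_with_block_id(*_args, **kwargs) -> str:
    """Return a value that depends on block_id, so the caller's args are observable."""
    return f"result-for-{kwargs['block_id']}"


async def run_once(tool: AsyncMock, block_id: str) -> str:
    """Hypothetical stand-in for the handler under test."""
    return await tool.execute(block_id=block_id)


def check_pins_args() -> str:
    tool = AsyncMock()
    tool.execute = AsyncMock(side_effect=execute_with_block_id)

    result = asyncio.run(run_once(tool, "original"))

    # Exactly one execution, and it received the expected block_id.
    assert tool.execute.await_count == 1
    assert tool.execute.await_args.kwargs["block_id"] == "original"
    return result
```

Because the result string embeds the `block_id`, a test built this way fails if the handler executes with different args, which a constant mock result cannot detect.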

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@autogpt_platform/backend/backend/copilot/sdk/tool_adapter_test.py` around
lines 522 - 544, Update the test_arg_mismatch_uses_prelaunched_result test so it
proves the pre-launched args were actually used: make the mock_tool's return
depend on the incoming block_id (so "original" returns "pre-launched-result" and
"normalised" would return something else) or add an assertion that inspects the
executed call on mock_tool (e.g., check
mock_tool.execute.await_args.kwargs["block_id"] == "original") after running
handler; keep references to the existing helpers (pre_launch_tool_call and
create_tool_handler) and ensure the final assertions still verify only one
execution occurred (mock_tool.execute.await_count == 1) and the observed result
matches the "original" block_id outcome.


ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 58649ac0-1d79-41f1-8bcd-c95ab83b513b

📥 Commits

Reviewing files that changed from the base of the PR and between 57b17dc and 959c539.

📒 Files selected for processing (2)

autogpt_platform/backend/backend/copilot/sdk/tool_adapter.py
autogpt_platform/backend/backend/copilot/sdk/tool_adapter_test.py

📜 Review details

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (10)

GitHub Check: check API types
GitHub Check: Seer Code Review
GitHub Check: type-check (3.13)
GitHub Check: test (3.11)
GitHub Check: test (3.13)
GitHub Check: test (3.12)
GitHub Check: conflicts
GitHub Check: end-to-end tests
GitHub Check: Analyze (python)
GitHub Check: Check PR Status

🧰 Additional context used

📓 Path-based instructions (4)

autogpt_platform/backend/**/*.py

📄 CodeRabbit inference engine (.github/copilot-instructions.md)

autogpt_platform/backend/**/*.py: Use Python 3.11 (required; managed by Poetry via pyproject.toml) for backend development
Always run 'poetry run format' (Black + isort) before linting in backend development
Always run 'poetry run lint' (ruff) after formatting in backend development

Refer to @backend/CLAUDE.md for backend-specific commands, architecture, and development tasks

autogpt_platform/backend/**/*.py: Import only at the top level; no local/inner imports except for lazy imports of heavy optional dependencies like openpyxl
Use absolute imports with from backend.module import ... for cross-package imports; single-dot relative imports (from .sibling import ...) are acceptable for sibling modules within the same package; avoid double-dot relative imports (from ..parent import ...)
Do not use duck typing with hasattr(), getattr(), or isinstance() for type dispatch; use typed interfaces, unions, or protocols instead
Use Pydantic models for structured data instead of dataclasses, namedtuples, or dicts
Do not use linter suppressors; no # type: ignore, # noqa, or # pyright: ignore comments — fix the underlying type/code issue instead
Use list comprehensions instead of manual loop-and-append patterns
Use early return guard clauses to avoid deep nesting
Use %s for deferred interpolation in debug log statements; use f-strings for readability in other log levels (e.g., logger.debug("Processing %s items", count), logger.info(f"Processing {count} items"))
Sanitize error paths using os.path.basename() in error messages to avoid leaking directory structure
Avoid TOCTOU (time-of-check-time-of-use) patterns; do not use check-then-act patterns for file access and credit charging operations
Use Redis pipelines with transaction=True for atomicity on multi-step Redis operations
Use max(0, value) guards for computed values that should never be negative
Keep files under ~300 lines; if a file grows beyond this, split by responsibility (extract h...

Files:

autogpt_platform/backend/backend/copilot/sdk/tool_adapter.py
autogpt_platform/backend/backend/copilot/sdk/tool_adapter_test.py

autogpt_platform/**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

Format Python code with poetry run format

Files:

autogpt_platform/backend/backend/copilot/sdk/tool_adapter.py
autogpt_platform/backend/backend/copilot/sdk/tool_adapter_test.py

autogpt_platform/backend/**/*test*.py

📄 CodeRabbit inference engine (AGENTS.md)

Run poetry run test for backend testing (runs pytest with docker based postgres + prisma)

Files:

autogpt_platform/backend/backend/copilot/sdk/tool_adapter_test.py

autogpt_platform/backend/**/*_test.py

📄 CodeRabbit inference engine (autogpt_platform/backend/CLAUDE.md)

autogpt_platform/backend/**/*_test.py: Use pytest with snapshot testing for API responses; test files should be colocated with source files using the *_test.py naming pattern
Mock at boundaries by mocking where the symbol is used, not where it is defined. After refactoring, update mock targets to match new module paths
Use AsyncMock from unittest.mock for mocking async functions

Files:

autogpt_platform/backend/backend/copilot/sdk/tool_adapter_test.py

🧠 Learnings (8)

📓 Common learnings

Learnt from: majdyz
Repo: Significant-Gravitas/AutoGPT PR: 12356
File: autogpt_platform/backend/backend/copilot/constants.py:9-12
Timestamp: 2026-03-10T08:39:22.025Z
Learning: In Significant-Gravitas/AutoGPT PR `#12356`, the `COPILOT_SYNTHETIC_ID_PREFIX = "copilot-"` check in `create_auto_approval_record` (human_review.py) is intentional and safe. The `graph_exec_id` passed to this function comes from server-side `PendingHumanReview` DB records (not from user input); the API only accepts `node_exec_id` from users. Synthetic `copilot-*` IDs are only ever created server-side in `run_block.py`. The prefix skip avoids a DB lookup for a `AgentGraphExecution` record that legitimately does not exist for CoPilot sessions, while `user_id` scoping is enforced at the auth layer and on the resulting auto-approval record.

Learnt from: majdyz
Repo: Significant-Gravitas/AutoGPT PR: 12439
File: autogpt_platform/backend/backend/blocks/autogpt_copilot.py:0-0
Timestamp: 2026-03-16T17:00:02.827Z
Learning: In autogpt_platform/backend/backend/blocks/autogpt_copilot.py, the recursion guard uses two module-level ContextVars: `_copilot_recursion_depth` (tracks current nesting depth) and `_copilot_recursion_limit` (stores the chain-wide ceiling). On the first invocation, `_copilot_recursion_limit` is set to `max_recursion_depth`; nested calls use `min(inherited_limit, max_recursion_depth)`, so they can only lower the cap, never raise it. The entry/exit logic is extracted into module-level helper functions. This is the approved pattern for preventing runaway sub-agent recursion in AutogptCopilotBlock (PR `#12439`, commits 348e9f8e2 and 3b70f61b1).

Learnt from: majdyz
Repo: Significant-Gravitas/AutoGPT PR: 12440
File: autogpt_platform/backend/backend/copilot/workflow_import/converter.py:0-0
Timestamp: 2026-03-17T10:57:12.953Z
Learning: In Significant-Gravitas/AutoGPT PR `#12440`, `autogpt_platform/backend/backend/copilot/workflow_import/converter.py` was fully rewritten (commit 732960e2d) to no longer make direct LLM/OpenAI API calls. The converter now builds a structured text prompt for AutoPilot/CoPilot instead. There is no `response.choices` access or any direct LLM client usage in this file. Do not flag `response.choices` access or LLM client initialization patterns as issues in this file.

Learnt from: majdyz
Repo: Significant-Gravitas/AutoGPT PR: 12356
File: autogpt_platform/backend/backend/copilot/tools/run_block.py:297-300
Timestamp: 2026-03-10T08:38:33.249Z
Learning: In autogpt_platform/backend/backend/copilot/tools/run_block.py, the auto-approval key for sensitive block HITL review uses graph_exec_id (copilot-session-{session_id}) + node_id (copilot-node-{block_id}). This is intentional: approving a block type within a CoPilot session auto-approves all future invocations of that same block type within the same session, mirroring how auto-approve works in normal graph execution. The user explicitly opts into this session-scoped behavior via an auto-approve toggle. Without the toggle (default), each individual invocation requires its own approval.

Learnt from: majdyz
Repo: Significant-Gravitas/AutoGPT PR: 12213
File: autogpt_platform/frontend/src/app/api/openapi.json:9983-9995
Timestamp: 2026-02-27T15:59:00.370Z
Learning: Repo: Significant-Gravitas/AutoGPT PR: 12213 — OpenAPI/codegen
Learning: Ensuring a field is required in generated TS types needs two sides: (1) no default value on the Pydantic field, and (2) the OpenAPI model's "required" array must list it. For MCPToolInfo, making input_schema required in OpenAPI and removing Field(default_factory=dict) in the backend prevents optional typing drift.

Learnt from: Pwuts
Repo: Significant-Gravitas/AutoGPT PR: 12284
File: autogpt_platform/frontend/src/app/api/openapi.json:11897-11900
Timestamp: 2026-03-04T23:58:18.476Z
Learning: Repo: Significant-Gravitas/AutoGPT — PR `#12284`
Backend/frontend OpenAPI codegen convention: In backend/api/features/store/model.py, the StoreSubmission and StoreSubmissionAdminView models define submitted_at: datetime | None, changes_summary: str | None, and instructions: str | None with no default. This is intentional to produce “required but nullable” fields in OpenAPI (properties appear in required[] and use anyOf [type, null]). This matches Prisma’s submittedAt DateTime? and changesSummary String?. Do not flag this as a required/nullable mismatch.

Learnt from: majdyz
Repo: Significant-Gravitas/AutoGPT PR: 12213
File: autogpt_platform/frontend/src/app/api/openapi.json:9983-9995
Timestamp: 2026-02-27T15:59:00.370Z
Learning: Repo: Significant-Gravitas/AutoGPT PR: 12213 — Backend/frontend OpenAPI codegen
Learning: For MCP schema models, required OpenAPI fields must have no defaults in Pydantic. Specifically, MCPToolInfo.input_schema must be required (no Field(default_factory=dict)) so openapi.json emits it in "required", ensuring generated TS types treat input_schema as non-optional.

Learnt from: majdyz
Repo: Significant-Gravitas/AutoGPT PR: 12356
File: autogpt_platform/backend/backend/copilot/tools/run_block.py:349-370
Timestamp: 2026-03-10T08:38:36.655Z
Learning: In the AutoGPT CoPilot HITL (Human-In-The-Loop) flow (`autogpt_platform/backend/backend/copilot/tools/run_block.py`), the review card presented to users sets `editable: false`, meaning reviewers cannot modify the input payload. Therefore, credentials resolved before `is_block_exec_need_review()` remain valid and do not need to be recomputed after the review step — the original `input_data` is unchanged through the review lifecycle.

📚 Learning: 2026-03-17T10:57:12.953Z

Learnt from: majdyz
Repo: Significant-Gravitas/AutoGPT PR: 12440
File: autogpt_platform/backend/backend/copilot/workflow_import/converter.py:0-0
Timestamp: 2026-03-17T10:57:12.953Z
Learning: In Significant-Gravitas/AutoGPT PR `#12440`, `autogpt_platform/backend/backend/copilot/workflow_import/converter.py` was fully rewritten (commit 732960e2d) to no longer make direct LLM/OpenAI API calls. The converter now builds a structured text prompt for AutoPilot/CoPilot instead. There is no `response.choices` access or any direct LLM client usage in this file. Do not flag `response.choices` access or LLM client initialization patterns as issues in this file.

Applied to files:

autogpt_platform/backend/backend/copilot/sdk/tool_adapter.py
autogpt_platform/backend/backend/copilot/sdk/tool_adapter_test.py

📚 Learning: 2026-02-26T17:02:22.448Z

Learnt from: Pwuts
Repo: Significant-Gravitas/AutoGPT PR: 12211
File: .pre-commit-config.yaml:160-179
Timestamp: 2026-02-26T17:02:22.448Z
Learning: Keep the pre-commit hook pattern broad for autogpt_platform/backend to ensure OpenAPI schema changes are captured. Do not narrow to backend/api/ alone, since the generated schema depends on Pydantic models across multiple directories (backend/data/, backend/blocks/, backend/copilot/, backend/integrations/, backend/util/). Narrowing could miss schema changes and cause frontend type desynchronization.

Applied to files:

autogpt_platform/backend/backend/copilot/sdk/tool_adapter.py
autogpt_platform/backend/backend/copilot/sdk/tool_adapter_test.py

📚 Learning: 2026-03-04T08:04:35.881Z

Learnt from: majdyz
Repo: Significant-Gravitas/AutoGPT PR: 12273
File: autogpt_platform/backend/backend/copilot/tools/workspace_files.py:216-220
Timestamp: 2026-03-04T08:04:35.881Z
Learning: In the AutoGPT Copilot backend, ensure that SVG images are not treated as vision image types by excluding 'image/svg+xml' from INLINEABLE_MIME_TYPES and MULTIMODAL_TYPES in tool_adapter.py; the Claude API supports PNG, JPEG, GIF, and WebP for vision. SVGs (XML text) should be handled via the text path instead, not the vision path.

Applied to files:

autogpt_platform/backend/backend/copilot/sdk/tool_adapter.py
autogpt_platform/backend/backend/copilot/sdk/tool_adapter_test.py

📚 Learning: 2026-03-05T15:42:08.207Z

Learnt from: ntindle
Repo: Significant-Gravitas/AutoGPT PR: 12297
File: .claude/skills/backend-check/SKILL.md:14-16
Timestamp: 2026-03-05T15:42:08.207Z
Learning: In Python files under autogpt_platform/backend (recursively), rely on poetry run format to perform formatting (Black + isort) and linting (ruff). Do not run poetry run lint as a separate step after poetry run format, since format already includes linting checks.

Applied to files:

autogpt_platform/backend/backend/copilot/sdk/tool_adapter.py
autogpt_platform/backend/backend/copilot/sdk/tool_adapter_test.py

📚 Learning: 2026-03-16T16:35:40.236Z

Learnt from: majdyz
Repo: Significant-Gravitas/AutoGPT PR: 12440
File: autogpt_platform/backend/backend/api/features/workflow_import.py:54-63
Timestamp: 2026-03-16T16:35:40.236Z
Learning: Avoid using the word 'competitor' in public-facing identifiers and text. Use neutral naming for API paths, model names, function names, and UI text. Examples: rename 'CompetitorFormat' to 'SourcePlatform', 'convert_competitor_workflow' to 'convert_workflow', '/competitor-workflow' to '/workflow'. Apply this guideline to files under autogpt_platform/backend and autogpt_platform/frontend.

Applied to files:

autogpt_platform/backend/backend/copilot/sdk/tool_adapter.py
autogpt_platform/backend/backend/copilot/sdk/tool_adapter_test.py

📚 Learning: 2026-03-31T15:37:38.626Z

Learnt from: majdyz
Repo: Significant-Gravitas/AutoGPT PR: 12623
File: autogpt_platform/backend/backend/copilot/tools/agent_generator/fixer.py:37-47
Timestamp: 2026-03-31T15:37:38.626Z
Learning: When validating/constructing Anthropic API model IDs in Significant-Gravitas/AutoGPT, allow the hyphen-separated Claude Opus 4.6 model ID `claude-opus-4-6` (it corresponds to `LlmModel.CLAUDE_4_6_OPUS` in `autogpt_platform/backend/backend/blocks/llm.py`). Do NOT require the dot-separated form in Anthropic contexts. Only OpenRouter routing variants should use the dot separator (e.g., `anthropic/claude-opus-4.6`); `claude-opus-4-6` should be treated as correct when passed to Anthropic, and flagged only if it’s used in the OpenRouter path where the dot form is expected.

Applied to files:

autogpt_platform/backend/backend/copilot/sdk/tool_adapter.py
autogpt_platform/backend/backend/copilot/sdk/tool_adapter_test.py

📚 Learning: 2026-03-25T06:59:27.340Z

Learnt from: CR
Repo: Significant-Gravitas/AutoGPT PR: 0
File: autogpt_platform/backend/CLAUDE.md:0-0
Timestamp: 2026-03-25T06:59:27.340Z
Learning: Applies to autogpt_platform/backend/**/*_test.py : Mock at boundaries by mocking where the symbol is used, not where it is defined. After refactoring, update mock targets to match new module paths

Applied to files:

autogpt_platform/backend/backend/copilot/sdk/tool_adapter_test.py

🔇 Additional comments (1)

autogpt_platform/backend/backend/copilot/sdk/tool_adapter.py (1)

467-488: Drop the debug-only arg deep-compare.

Now that mismatches no longer change control flow, launch_args != args only keeps the old cost of retaining and deep-comparing the whole payload. Logging that a queued task was consumed is enough here, and it would let _TaskQueueItem stop carrying large run_block arg dicts around.

…annotations

Remove the speculative pre-launch mechanism that caused duplicate block execution (e.g. duplicate Linear tickets, GitHub PRs) due to:
- Arg mismatch between AssistantMessage and SDK MCP dispatch
- Broken FIFO assumption when security hooks deny a tool
- Race condition where task.cancel() arrives after API call completes

Instead, use the SDK's native parallel dispatch: annotate read-only tools with ToolAnnotations(readOnlyHint=True) so the CLI dispatches them concurrently. Side-effect tools (run_block, bash_exec, etc.) run sequentially, which is correct and safe.

Removed:
- pre_launch_tool_call(), cancel_pending_tool_tasks()
- _tool_task_queues ContextVar and all queue logic
- Pre-launch calls in service.py streaming loop
- All pre-launch tests (replaced with annotation tests)

Added:
- _READ_ONLY_TOOLS / _READ_ONLY_E2B_TOOLS sets
- ToolAnnotations(readOnlyHint=True) on 15+ read-only tools
- Tests for annotation classification
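The broken FIFO assumption is easy to see with a toy queue. The sketch below is illustrative only: `pair_fifo` and the tool names are hypothetical, and it models nothing beyond the pairing step. When one pre-launched call is denied and never dispatched, every later dispatch is handed the wrong pre-launched task.

```python
from collections import deque


def pair_fifo(
    pre_launched: list[str], dispatched: list[str]
) -> tuple[list[tuple[str, str]], list[str]]:
    """Pair each dispatched call with the next queued pre-launched task."""
    queue = deque(pre_launched)
    pairs: list[tuple[str, str]] = []
    for call in dispatched:
        if not queue:
            break
        pairs.append((call, queue.popleft()))
    # Whatever is left was pre-launched but never claimed.
    return pairs, list(queue)


# Three calls were pre-launched, but a hook denied "delete_branch",
# so only two dispatches arrive.
pairs, leftover = pair_fifo(
    ["create_issue", "delete_branch", "open_pr"],
    ["create_issue", "open_pr"],
)
```

Here the `open_pr` dispatch is matched with the task pre-launched for the denied `delete_branch` call, and the real `open_pr` task is left unclaimed in the queue.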

majdyz · 2026-03-31T17:14:44Z

🤖 🔵 Nit (Round 2): The _READ_ONLY_E2B_TOOLS set remains hardcoded since E2B file tools don't use BaseTool. Add a comment explaining this asymmetry so future maintainers don't wonder why E2B tools aren't derived like registry tools.

majdyz · 2026-03-31T17:15:33Z

🤖 🔵 Nit (Round 3): The create_copilot_mcp_server docstring should mention that readOnlyHint is now derived from BaseTool.read_only (not a hardcoded set), so developers know to set the property on new tools rather than editing a separate list.

…y property

Address review: instead of a hardcoded _READ_ONLY_TOOLS frozenset, add a read_only property to BaseTool (default False) and override it to True on 15 read-only tool classes. create_copilot_mcp_server now derives annotations from base_tool.read_only so new tools only need to set the property, not edit a separate list.

coderabbitai

Actionable comments posted: 2

🧹 Nitpick comments (1)

autogpt_platform/backend/backend/copilot/sdk/tool_adapter_test.py (1)

358-376: Add regression assertions for write-capable tools in read-only classification tests.

Given the new readOnlyHint path, this suite should also assert that write-capable tools (notably read_workspace_file with save_to_path, and browser_screenshot) are not treated as read-only.

Possible test extension

     def test_side_effect_tools_are_not_read_only(self):
         """Key side-effect tools should have read_only=False."""
         from backend.copilot.tools import TOOL_REGISTRY

-        for name in ["run_block", "bash_exec", "create_agent", "run_agent"]:
+        for name in [
+            "run_block",
+            "bash_exec",
+            "create_agent",
+            "run_agent",
+            "read_workspace_file",
+            "browser_screenshot",
+        ]:
             if name in TOOL_REGISTRY:
                 assert not TOOL_REGISTRY[
                     name
                 ].read_only, f"{name} should not be read_only"

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@autogpt_platform/backend/backend/copilot/sdk/tool_adapter_test.py` around
lines 358 - 376, Add assertions to the read-only classification test to ensure
write-capable tools are not misclassified: update
test_read_only_e2b_tools_classification (or the surrounding test block
referencing _READ_ONLY_E2B_TOOLS and TOOL_REGISTRY) to assert that
"read_workspace_file" (when used with save_to_path capability) and
"browser_screenshot" are NOT in _READ_ONLY_E2B_TOOLS and, if present in
TOOL_REGISTRY, that their ToolAnnotations/readOnlyHint or tool.read_only is
False; use the existing symbols _READ_ONLY_E2B_TOOLS, TOOL_REGISTRY,
ToolAnnotations and the tool names "read_workspace_file" and
"browser_screenshot" to locate and add these negative assertions.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@autogpt_platform/backend/backend/copilot/tools/agent_browser.py`:
- Around line 768-770: The BrowserScreenshotTool is marked read_only via the
read_only property returning True while it performs a side-effect (writes a file
using WriteWorkspaceFileTool); update the read_only property on the
BrowserScreenshotTool class to return False (or remove the read_only override so
it inherits a non-read-only default) so the tool is not classified as read-only,
and verify any callers/registrations relying on read-only behavior are adjusted
accordingly; locate the read_only property in BrowserScreenshotTool and change
its return value and/or documentation to reflect that it mutates workspace state
via WriteWorkspaceFileTool.

In `@autogpt_platform/backend/backend/copilot/tools/workspace_files.py`:
- Around line 566-568: The ReadWorkspaceFileTool is incorrectly marked read-only
which allows its mutating method _save_to_path (invoked via save_to_path) to run
in parallel; change the read_only property on the ReadWorkspaceFileTool class so
it returns False (i.e., clear that this tool is not read-only) to prevent
parallel execution of its mutating operation and ensure callers treat it as a
writer rather than a read-only task.



ℹ️ Review info


📥 Commits

Reviewing files that changed from the base of the PR and between 4cb138f and 23cdf29.

📒 Files selected for processing (17)

autogpt_platform/backend/backend/copilot/sdk/tool_adapter.py
autogpt_platform/backend/backend/copilot/sdk/tool_adapter_test.py
autogpt_platform/backend/backend/copilot/tools/agent_browser.py
autogpt_platform/backend/backend/copilot/tools/agent_output.py
autogpt_platform/backend/backend/copilot/tools/base.py
autogpt_platform/backend/backend/copilot/tools/feature_requests.py
autogpt_platform/backend/backend/copilot/tools/find_agent.py
autogpt_platform/backend/backend/copilot/tools/find_block.py
autogpt_platform/backend/backend/copilot/tools/find_library_agent.py
autogpt_platform/backend/backend/copilot/tools/get_agent_building_guide.py
autogpt_platform/backend/backend/copilot/tools/get_doc_page.py
autogpt_platform/backend/backend/copilot/tools/get_mcp_guide.py
autogpt_platform/backend/backend/copilot/tools/manage_folders.py
autogpt_platform/backend/backend/copilot/tools/search_docs.py
autogpt_platform/backend/backend/copilot/tools/validate_agent.py
autogpt_platform/backend/backend/copilot/tools/web_fetch.py
autogpt_platform/backend/backend/copilot/tools/workspace_files.py

✅ Files skipped from review due to trivial changes (2)

autogpt_platform/backend/backend/copilot/tools/validate_agent.py
autogpt_platform/backend/backend/copilot/tools/web_fetch.py

🚧 Files skipped from review as they are similar to previous changes (1)

autogpt_platform/backend/backend/copilot/sdk/tool_adapter.py

📜 Review details

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (11)

GitHub Check: check API types
GitHub Check: end-to-end tests
GitHub Check: Analyze (typescript)
GitHub Check: Analyze (python)
GitHub Check: test (3.13)
GitHub Check: test (3.11)
GitHub Check: type-check (3.13)
GitHub Check: test (3.12)
GitHub Check: type-check (3.11)
GitHub Check: Check PR Status
GitHub Check: Seer Code Review

🧰 Additional context used

📓 Path-based instructions (4)

autogpt_platform/backend/**/*.py

📄 CodeRabbit inference engine (.github/copilot-instructions.md)

autogpt_platform/backend/**/*.py: Use Python 3.11 (required; managed by Poetry via pyproject.toml) for backend development
Always run 'poetry run format' (Black + isort) before linting in backend development
Always run 'poetry run lint' (ruff) after formatting in backend development

Refer to @backend/CLAUDE.md for backend-specific commands, architecture, and development tasks

autogpt_platform/backend/**/*.py: Import only at the top level; no local/inner imports except for lazy imports of heavy optional dependencies like openpyxl
Use absolute imports with from backend.module import ... for cross-package imports; single-dot relative imports (from .sibling import ...) are acceptable for sibling modules within the same package; avoid double-dot relative imports (from ..parent import ...)
Do not use duck typing with hasattr(), getattr(), or isinstance() for type dispatch; use typed interfaces, unions, or protocols instead
Use Pydantic models for structured data instead of dataclasses, namedtuples, or dicts
Do not use linter suppressors; no # type: ignore, # noqa, or # pyright: ignore comments — fix the underlying type/code issue instead
Use list comprehensions instead of manual loop-and-append patterns
Use early return guard clauses to avoid deep nesting
Use %s for deferred interpolation in debug log statements; use f-strings for readability in other log levels (e.g., logger.debug("Processing %s items", count), logger.info(f"Processing {count} items"))
Sanitize error paths using os.path.basename() in error messages to avoid leaking directory structure
Avoid TOCTOU (time-of-check-time-of-use) patterns; do not use check-then-act patterns for file access and credit charging operations
Use Redis pipelines with transaction=True for atomicity on multi-step Redis operations
Use max(0, value) guards for computed values that should never be negative
Keep files under ~300 lines; if a file grows beyond this, split by responsibility (extract h...

Files:

autogpt_platform/backend/backend/copilot/tools/manage_folders.py
autogpt_platform/backend/backend/copilot/tools/agent_output.py
autogpt_platform/backend/backend/copilot/tools/find_library_agent.py
autogpt_platform/backend/backend/copilot/tools/feature_requests.py
autogpt_platform/backend/backend/copilot/tools/search_docs.py
autogpt_platform/backend/backend/copilot/tools/get_doc_page.py
autogpt_platform/backend/backend/copilot/tools/find_block.py
autogpt_platform/backend/backend/copilot/tools/get_mcp_guide.py
autogpt_platform/backend/backend/copilot/tools/agent_browser.py
autogpt_platform/backend/backend/copilot/tools/get_agent_building_guide.py
autogpt_platform/backend/backend/copilot/tools/find_agent.py
autogpt_platform/backend/backend/copilot/tools/workspace_files.py
autogpt_platform/backend/backend/copilot/tools/base.py
autogpt_platform/backend/backend/copilot/sdk/tool_adapter_test.py

autogpt_platform/**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

Format Python code with poetry run format

Files:

autogpt_platform/backend/backend/copilot/tools/manage_folders.py
autogpt_platform/backend/backend/copilot/tools/agent_output.py
autogpt_platform/backend/backend/copilot/tools/find_library_agent.py
autogpt_platform/backend/backend/copilot/tools/feature_requests.py
autogpt_platform/backend/backend/copilot/tools/search_docs.py
autogpt_platform/backend/backend/copilot/tools/get_doc_page.py
autogpt_platform/backend/backend/copilot/tools/find_block.py
autogpt_platform/backend/backend/copilot/tools/get_mcp_guide.py
autogpt_platform/backend/backend/copilot/tools/agent_browser.py
autogpt_platform/backend/backend/copilot/tools/get_agent_building_guide.py
autogpt_platform/backend/backend/copilot/tools/find_agent.py
autogpt_platform/backend/backend/copilot/tools/workspace_files.py
autogpt_platform/backend/backend/copilot/tools/base.py
autogpt_platform/backend/backend/copilot/sdk/tool_adapter_test.py

autogpt_platform/backend/**/*test*.py

📄 CodeRabbit inference engine (AGENTS.md)

Run poetry run test for backend testing (runs pytest with docker based postgres + prisma)

Files:

autogpt_platform/backend/backend/copilot/sdk/tool_adapter_test.py

autogpt_platform/backend/**/*_test.py

📄 CodeRabbit inference engine (autogpt_platform/backend/CLAUDE.md)

autogpt_platform/backend/**/*_test.py: Use pytest with snapshot testing for API responses; test files should be colocated with source files using the *_test.py naming pattern
Mock at boundaries by mocking where the symbol is used, not where it is defined. After refactoring, update mock targets to match new module paths
Use AsyncMock from unittest.mock for mocking async functions

Files:

autogpt_platform/backend/backend/copilot/sdk/tool_adapter_test.py

🧠 Learnings (15)

📓 Common learnings

Learnt from: majdyz
Repo: Significant-Gravitas/AutoGPT PR: 12356
File: autogpt_platform/backend/backend/copilot/constants.py:9-12
Timestamp: 2026-03-10T08:39:22.025Z
Learning: In Significant-Gravitas/AutoGPT PR `#12356`, the `COPILOT_SYNTHETIC_ID_PREFIX = "copilot-"` check in `create_auto_approval_record` (human_review.py) is intentional and safe. The `graph_exec_id` passed to this function comes from server-side `PendingHumanReview` DB records (not from user input); the API only accepts `node_exec_id` from users. Synthetic `copilot-*` IDs are only ever created server-side in `run_block.py`. The prefix skip avoids a DB lookup for a `AgentGraphExecution` record that legitimately does not exist for CoPilot sessions, while `user_id` scoping is enforced at the auth layer and on the resulting auto-approval record.

Learnt from: majdyz
Repo: Significant-Gravitas/AutoGPT PR: 12440
File: autogpt_platform/backend/backend/copilot/workflow_import/converter.py:0-0
Timestamp: 2026-03-17T10:57:12.953Z
Learning: In Significant-Gravitas/AutoGPT PR `#12440`, `autogpt_platform/backend/backend/copilot/workflow_import/converter.py` was fully rewritten (commit 732960e2d) to no longer make direct LLM/OpenAI API calls. The converter now builds a structured text prompt for AutoPilot/CoPilot instead. There is no `response.choices` access or any direct LLM client usage in this file. Do not flag `response.choices` access or LLM client initialization patterns as issues in this file.

Learnt from: majdyz
Repo: Significant-Gravitas/AutoGPT PR: 12439
File: autogpt_platform/backend/backend/blocks/autogpt_copilot.py:0-0
Timestamp: 2026-03-16T17:00:02.827Z
Learning: In autogpt_platform/backend/backend/blocks/autogpt_copilot.py, the recursion guard uses two module-level ContextVars: `_copilot_recursion_depth` (tracks current nesting depth) and `_copilot_recursion_limit` (stores the chain-wide ceiling). On the first invocation, `_copilot_recursion_limit` is set to `max_recursion_depth`; nested calls use `min(inherited_limit, max_recursion_depth)`, so they can only lower the cap, never raise it. The entry/exit logic is extracted into module-level helper functions. This is the approved pattern for preventing runaway sub-agent recursion in AutogptCopilotBlock (PR `#12439`, commits 348e9f8e2 and 3b70f61b1).

Learnt from: majdyz
Repo: Significant-Gravitas/AutoGPT PR: 12213
File: autogpt_platform/frontend/src/app/api/openapi.json:9983-9995
Timestamp: 2026-02-27T15:59:00.370Z
Learning: Repo: Significant-Gravitas/AutoGPT PR: 12213 — OpenAPI/codegen
Learning: Ensuring a field is required in generated TS types needs two sides: (1) no default value on the Pydantic field, and (2) the OpenAPI model's "required" array must list it. For MCPToolInfo, making input_schema required in OpenAPI and removing Field(default_factory=dict) in the backend prevents optional typing drift.

Learnt from: majdyz
Repo: Significant-Gravitas/AutoGPT PR: 12356
File: autogpt_platform/backend/backend/copilot/tools/run_block.py:297-300
Timestamp: 2026-03-10T08:38:33.249Z
Learning: In autogpt_platform/backend/backend/copilot/tools/run_block.py, the auto-approval key for sensitive block HITL review uses graph_exec_id (copilot-session-{session_id}) + node_id (copilot-node-{block_id}). This is intentional: approving a block type within a CoPilot session auto-approves all future invocations of that same block type within the same session, mirroring how auto-approve works in normal graph execution. The user explicitly opts into this session-scoped behavior via an auto-approve toggle. Without the toggle (default), each individual invocation requires its own approval.

Learnt from: majdyz
Repo: Significant-Gravitas/AutoGPT PR: 12213
File: autogpt_platform/frontend/src/app/api/openapi.json:9983-9995
Timestamp: 2026-02-27T15:59:00.370Z
Learning: Repo: Significant-Gravitas/AutoGPT PR: 12213 — Backend/frontend OpenAPI codegen
Learning: For MCP schema models, required OpenAPI fields must have no defaults in Pydantic. Specifically, MCPToolInfo.input_schema must be required (no Field(default_factory=dict)) so openapi.json emits it in "required", ensuring generated TS types treat input_schema as non-optional.

Learnt from: Pwuts
Repo: Significant-Gravitas/AutoGPT PR: 12284
File: autogpt_platform/frontend/src/app/api/openapi.json:11897-11900
Timestamp: 2026-03-04T23:58:18.476Z
Learning: Repo: Significant-Gravitas/AutoGPT — PR `#12284`
Backend/frontend OpenAPI codegen convention: In backend/api/features/store/model.py, the StoreSubmission and StoreSubmissionAdminView models define submitted_at: datetime | None, changes_summary: str | None, and instructions: str | None with no default. This is intentional to produce “required but nullable” fields in OpenAPI (properties appear in required[] and use anyOf [type, null]). This matches Prisma’s submittedAt DateTime? and changesSummary String?. Do not flag this as a required/nullable mismatch.

Learnt from: majdyz
Repo: Significant-Gravitas/AutoGPT PR: 12356
File: autogpt_platform/backend/backend/copilot/tools/run_block.py:349-370
Timestamp: 2026-03-10T08:38:36.655Z
Learning: In the AutoGPT CoPilot HITL (Human-In-The-Loop) flow (`autogpt_platform/backend/backend/copilot/tools/run_block.py`), the review card presented to users sets `editable: false`, meaning reviewers cannot modify the input payload. Therefore, credentials resolved before `is_block_exec_need_review()` remain valid and do not need to be recomputed after the review step — the original `input_data` is unchanged through the review lifecycle.

Learnt from: majdyz
Repo: Significant-Gravitas/AutoGPT PR: 12426
File: autogpt_platform/backend/backend/copilot/sdk/service.py:0-0
Timestamp: 2026-03-15T16:52:15.463Z
Learning: In Significant-Gravitas/AutoGPT (copilot backend), GitHub tokens (GH_TOKEN / GITHUB_TOKEN) for the `gh` CLI are injected lazily per-command in `autogpt_platform/backend/backend/copilot/tools/bash_exec._execute_on_e2b()` by calling `integration_creds.get_integration_env_vars(user_id)`, not on the global SDK subprocess environment in `sdk/service.py`. This scopes credentials to individual E2B sandbox command invocations and prevents token leakage into tool output streams or uploaded transcripts.

Learnt from: majdyz
Repo: Significant-Gravitas/AutoGPT PR: 12445
File: autogpt_platform/backend/backend/copilot/sdk/service.py:1071-1072
Timestamp: 2026-03-17T06:48:26.471Z
Learning: In Significant-Gravitas/AutoGPT (autogpt_platform), the AI SDK enforces `z.strictObject({type, errorText})` on SSE `StreamError` responses, so additional fields like `retryable: bool` cannot be added to `StreamError` or serialized via `to_sse()`. Instead, retry signaling for transient Anthropic API errors is done via the `COPILOT_RETRYABLE_ERROR_PREFIX` constant prepended to persisted session messages (in `ChatMessage.content`). The frontend detects retryable errors by checking `markerType === "retryable_error"` from `parseSpecialMarkers()` — no SSE schema changes and no string matching on error text. This pattern was established in PR `#12445`, commit 64d82797b.

📚 Learning: 2026-02-26T17:02:22.448Z

Learnt from: Pwuts
Repo: Significant-Gravitas/AutoGPT PR: 12211
File: .pre-commit-config.yaml:160-179
Timestamp: 2026-02-26T17:02:22.448Z
Learning: Keep the pre-commit hook pattern broad for autogpt_platform/backend to ensure OpenAPI schema changes are captured. Do not narrow to backend/api/ alone, since the generated schema depends on Pydantic models across multiple directories (backend/data/, backend/blocks/, backend/copilot/, backend/integrations/, backend/util/). Narrowing could miss schema changes and cause frontend type desynchronization.

Applied to files:

autogpt_platform/backend/backend/copilot/tools/manage_folders.py
autogpt_platform/backend/backend/copilot/tools/agent_output.py
autogpt_platform/backend/backend/copilot/tools/find_library_agent.py
autogpt_platform/backend/backend/copilot/tools/feature_requests.py
autogpt_platform/backend/backend/copilot/tools/search_docs.py
autogpt_platform/backend/backend/copilot/tools/get_doc_page.py
autogpt_platform/backend/backend/copilot/tools/find_block.py
autogpt_platform/backend/backend/copilot/tools/get_mcp_guide.py
autogpt_platform/backend/backend/copilot/tools/agent_browser.py
autogpt_platform/backend/backend/copilot/tools/get_agent_building_guide.py
autogpt_platform/backend/backend/copilot/tools/find_agent.py
autogpt_platform/backend/backend/copilot/tools/workspace_files.py
autogpt_platform/backend/backend/copilot/tools/base.py
autogpt_platform/backend/backend/copilot/sdk/tool_adapter_test.py

📚 Learning: 2026-03-04T08:04:35.881Z

Learnt from: majdyz
Repo: Significant-Gravitas/AutoGPT PR: 12273
File: autogpt_platform/backend/backend/copilot/tools/workspace_files.py:216-220
Timestamp: 2026-03-04T08:04:35.881Z
Learning: In the AutoGPT Copilot backend, ensure that SVG images are not treated as vision image types by excluding 'image/svg+xml' from INLINEABLE_MIME_TYPES and MULTIMODAL_TYPES in tool_adapter.py; the Claude API supports PNG, JPEG, GIF, and WebP for vision. SVGs (XML text) should be handled via the text path instead, not the vision path.

Applied to files:

autogpt_platform/backend/backend/copilot/tools/manage_folders.py
autogpt_platform/backend/backend/copilot/tools/agent_output.py
autogpt_platform/backend/backend/copilot/tools/find_library_agent.py
autogpt_platform/backend/backend/copilot/tools/feature_requests.py
autogpt_platform/backend/backend/copilot/tools/search_docs.py
autogpt_platform/backend/backend/copilot/tools/get_doc_page.py
autogpt_platform/backend/backend/copilot/tools/find_block.py
autogpt_platform/backend/backend/copilot/tools/get_mcp_guide.py
autogpt_platform/backend/backend/copilot/tools/agent_browser.py
autogpt_platform/backend/backend/copilot/tools/get_agent_building_guide.py
autogpt_platform/backend/backend/copilot/tools/find_agent.py
autogpt_platform/backend/backend/copilot/tools/workspace_files.py
autogpt_platform/backend/backend/copilot/tools/base.py
autogpt_platform/backend/backend/copilot/sdk/tool_adapter_test.py

📚 Learning: 2026-03-04T12:19:39.243Z

Learnt from: majdyz
Repo: Significant-Gravitas/AutoGPT PR: 12279
File: autogpt_platform/backend/backend/copilot/tools/base.py:184-188
Timestamp: 2026-03-04T12:19:39.243Z
Learning: In autogpt_platform/backend/backend/copilot/tools/, ensure that anonymous users always pass user_id=None to tool execution methods. The anon_ prefix (e.g., anon_123) is used only for PostHog/analytics distinct_id and must not be used as an actual user_id. Use a simple truthiness check on user_id (e.g., if user_id: ... else: ... or a dedicated is_authenticated flag) to distinguish anonymous from authenticated users, and review all tool execution call sites within this directory to prevent accidentally forwarding an anon_ user_id to tools.

Applied to files:

autogpt_platform/backend/backend/copilot/tools/manage_folders.py
autogpt_platform/backend/backend/copilot/tools/agent_output.py
autogpt_platform/backend/backend/copilot/tools/find_library_agent.py
autogpt_platform/backend/backend/copilot/tools/feature_requests.py
autogpt_platform/backend/backend/copilot/tools/search_docs.py
autogpt_platform/backend/backend/copilot/tools/get_doc_page.py
autogpt_platform/backend/backend/copilot/tools/find_block.py
autogpt_platform/backend/backend/copilot/tools/get_mcp_guide.py
autogpt_platform/backend/backend/copilot/tools/agent_browser.py
autogpt_platform/backend/backend/copilot/tools/get_agent_building_guide.py
autogpt_platform/backend/backend/copilot/tools/find_agent.py
autogpt_platform/backend/backend/copilot/tools/workspace_files.py
autogpt_platform/backend/backend/copilot/tools/base.py

📚 Learning: 2026-03-31T14:22:26.566Z

Learnt from: majdyz
Repo: Significant-Gravitas/AutoGPT PR: 12622
File: autogpt_platform/backend/backend/copilot/tools/agent_search.py:223-236
Timestamp: 2026-03-31T14:22:26.566Z
Learning: In files under autogpt_platform/backend/backend/copilot/tools/, ensure agent graph enrichment uses the typed Pydantic model `backend.data.graph.Graph` for `AgentInfo.graph` (i.e., `Graph | None`), not `dict[str, Any]`. When enriching with graph data (e.g., `_enrich_agents_with_graph`), prefer calling `graph_db().get_graph(graph_id, version=None, user_id=user_id)` directly to retrieve the typed `Graph` object rather than routing through JSON conversions like `get_agent_as_json()` / `graph_to_json()`.

Applied to files:

autogpt_platform/backend/backend/copilot/tools/manage_folders.py
autogpt_platform/backend/backend/copilot/tools/agent_output.py
autogpt_platform/backend/backend/copilot/tools/find_library_agent.py
autogpt_platform/backend/backend/copilot/tools/feature_requests.py
autogpt_platform/backend/backend/copilot/tools/search_docs.py
autogpt_platform/backend/backend/copilot/tools/get_doc_page.py
autogpt_platform/backend/backend/copilot/tools/find_block.py
autogpt_platform/backend/backend/copilot/tools/get_mcp_guide.py
autogpt_platform/backend/backend/copilot/tools/agent_browser.py
autogpt_platform/backend/backend/copilot/tools/get_agent_building_guide.py
autogpt_platform/backend/backend/copilot/tools/find_agent.py
autogpt_platform/backend/backend/copilot/tools/workspace_files.py
autogpt_platform/backend/backend/copilot/tools/base.py

📚 Learning: 2026-03-05T15:42:08.207Z

Learnt from: ntindle
Repo: Significant-Gravitas/AutoGPT PR: 12297
File: .claude/skills/backend-check/SKILL.md:14-16
Timestamp: 2026-03-05T15:42:08.207Z
Learning: In Python files under autogpt_platform/backend (recursively), rely on poetry run format to perform formatting (Black + isort) and linting (ruff). Do not run poetry run lint as a separate step after poetry run format, since format already includes linting checks.

Applied to files:

autogpt_platform/backend/backend/copilot/tools/manage_folders.py
autogpt_platform/backend/backend/copilot/tools/agent_output.py
autogpt_platform/backend/backend/copilot/tools/find_library_agent.py
autogpt_platform/backend/backend/copilot/tools/feature_requests.py
autogpt_platform/backend/backend/copilot/tools/search_docs.py
autogpt_platform/backend/backend/copilot/tools/get_doc_page.py
autogpt_platform/backend/backend/copilot/tools/find_block.py
autogpt_platform/backend/backend/copilot/tools/get_mcp_guide.py
autogpt_platform/backend/backend/copilot/tools/agent_browser.py
autogpt_platform/backend/backend/copilot/tools/get_agent_building_guide.py
autogpt_platform/backend/backend/copilot/tools/find_agent.py
autogpt_platform/backend/backend/copilot/tools/workspace_files.py
autogpt_platform/backend/backend/copilot/tools/base.py
autogpt_platform/backend/backend/copilot/sdk/tool_adapter_test.py

📚 Learning: 2026-03-16T16:35:40.236Z

Learnt from: majdyz
Repo: Significant-Gravitas/AutoGPT PR: 12440
File: autogpt_platform/backend/backend/api/features/workflow_import.py:54-63
Timestamp: 2026-03-16T16:35:40.236Z
Learning: Avoid using the word 'competitor' in public-facing identifiers and text. Use neutral naming for API paths, model names, function names, and UI text. Examples: rename 'CompetitorFormat' to 'SourcePlatform', 'convert_competitor_workflow' to 'convert_workflow', '/competitor-workflow' to '/workflow'. Apply this guideline to files under autogpt_platform/backend and autogpt_platform/frontend.

Applied to files:

autogpt_platform/backend/backend/copilot/tools/manage_folders.py
autogpt_platform/backend/backend/copilot/tools/agent_output.py
autogpt_platform/backend/backend/copilot/tools/find_library_agent.py
autogpt_platform/backend/backend/copilot/tools/feature_requests.py
autogpt_platform/backend/backend/copilot/tools/search_docs.py
autogpt_platform/backend/backend/copilot/tools/get_doc_page.py
autogpt_platform/backend/backend/copilot/tools/find_block.py
autogpt_platform/backend/backend/copilot/tools/get_mcp_guide.py
autogpt_platform/backend/backend/copilot/tools/agent_browser.py
autogpt_platform/backend/backend/copilot/tools/get_agent_building_guide.py
autogpt_platform/backend/backend/copilot/tools/find_agent.py
autogpt_platform/backend/backend/copilot/tools/workspace_files.py
autogpt_platform/backend/backend/copilot/tools/base.py
autogpt_platform/backend/backend/copilot/sdk/tool_adapter_test.py

📚 Learning: 2026-03-31T15:37:38.626Z

Learnt from: majdyz
Repo: Significant-Gravitas/AutoGPT PR: 12623
File: autogpt_platform/backend/backend/copilot/tools/agent_generator/fixer.py:37-47
Timestamp: 2026-03-31T15:37:38.626Z
Learning: When validating/constructing Anthropic API model IDs in Significant-Gravitas/AutoGPT, allow the hyphen-separated Claude Opus 4.6 model ID `claude-opus-4-6` (it corresponds to `LlmModel.CLAUDE_4_6_OPUS` in `autogpt_platform/backend/backend/blocks/llm.py`). Do NOT require the dot-separated form in Anthropic contexts. Only OpenRouter routing variants should use the dot separator (e.g., `anthropic/claude-opus-4.6`); `claude-opus-4-6` should be treated as correct when passed to Anthropic, and flagged only if it’s used in the OpenRouter path where the dot form is expected.

Applied to files:

autogpt_platform/backend/backend/copilot/tools/manage_folders.py
autogpt_platform/backend/backend/copilot/tools/agent_output.py
autogpt_platform/backend/backend/copilot/tools/find_library_agent.py
autogpt_platform/backend/backend/copilot/tools/feature_requests.py
autogpt_platform/backend/backend/copilot/tools/search_docs.py
autogpt_platform/backend/backend/copilot/tools/get_doc_page.py
autogpt_platform/backend/backend/copilot/tools/find_block.py
autogpt_platform/backend/backend/copilot/tools/get_mcp_guide.py
autogpt_platform/backend/backend/copilot/tools/agent_browser.py
autogpt_platform/backend/backend/copilot/tools/get_agent_building_guide.py
autogpt_platform/backend/backend/copilot/tools/find_agent.py
autogpt_platform/backend/backend/copilot/tools/workspace_files.py
autogpt_platform/backend/backend/copilot/tools/base.py
autogpt_platform/backend/backend/copilot/sdk/tool_adapter_test.py

📚 Learning: 2026-02-27T15:59:00.370Z

Learnt from: majdyz
Repo: Significant-Gravitas/AutoGPT PR: 12213
File: autogpt_platform/frontend/src/app/api/openapi.json:9983-9995
Timestamp: 2026-02-27T15:59:00.370Z
Learning: Repo: Significant-Gravitas/AutoGPT PR: 12213 — OpenAPI/codegen
Learning: Ensuring a field is required in generated TS types needs two sides: (1) no default value on the Pydantic field, and (2) the OpenAPI model's "required" array must list it. For MCPToolInfo, making input_schema required in OpenAPI and removing Field(default_factory=dict) in the backend prevents optional typing drift.

Applied to files:

autogpt_platform/backend/backend/copilot/tools/get_mcp_guide.py
autogpt_platform/backend/backend/copilot/sdk/tool_adapter_test.py

📚 Learning: 2026-02-27T15:59:00.370Z

Learnt from: majdyz
Repo: Significant-Gravitas/AutoGPT PR: 12213
File: autogpt_platform/frontend/src/app/api/openapi.json:9983-9995
Timestamp: 2026-02-27T15:59:00.370Z
Learning: Repo: Significant-Gravitas/AutoGPT PR: 12213 — Backend/frontend OpenAPI codegen
Learning: For MCP schema models, required OpenAPI fields must have no defaults in Pydantic. Specifically, MCPToolInfo.input_schema must be required (no Field(default_factory=dict)) so openapi.json emits it in "required", ensuring generated TS types treat input_schema as non-optional.

Applied to files:

autogpt_platform/backend/backend/copilot/tools/get_mcp_guide.py

📚 Learning: 2026-03-17T10:57:12.953Z

Learnt from: majdyz
Repo: Significant-Gravitas/AutoGPT PR: 12440
File: autogpt_platform/backend/backend/copilot/workflow_import/converter.py:0-0
Timestamp: 2026-03-17T10:57:12.953Z
Learning: In Significant-Gravitas/AutoGPT PR `#12440`, `autogpt_platform/backend/backend/copilot/workflow_import/converter.py` was fully rewritten (commit 732960e2d) to no longer make direct LLM/OpenAI API calls. The converter now builds a structured text prompt for AutoPilot/CoPilot instead. There is no `response.choices` access or any direct LLM client usage in this file. Do not flag `response.choices` access or LLM client initialization patterns as issues in this file.

Applied to files:

autogpt_platform/backend/backend/copilot/sdk/tool_adapter_test.py

📚 Learning: 2026-03-25T06:59:27.340Z

Learnt from: CR
Repo: Significant-Gravitas/AutoGPT PR: 0
File: autogpt_platform/backend/CLAUDE.md:0-0
Timestamp: 2026-03-25T06:59:27.340Z
Learning: Applies to autogpt_platform/backend/**/*_test.py : Mock at boundaries by mocking where the symbol is used, not where it is defined. After refactoring, update mock targets to match new module paths

Applied to files:

autogpt_platform/backend/backend/copilot/sdk/tool_adapter_test.py

📚 Learning: 2026-02-04T16:49:42.490Z

Learnt from: CR
Repo: Significant-Gravitas/AutoGPT PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2026-02-04T16:49:42.490Z
Learning: Applies to autogpt_platform/backend/**/test/**/*.py : Use snapshot testing with '--snapshot-update' flag in backend tests when output changes; always review with 'git diff'

Applied to files:

autogpt_platform/backend/backend/copilot/sdk/tool_adapter_test.py

📚 Learning: 2026-03-25T06:59:27.340Z

Learnt from: CR
Repo: Significant-Gravitas/AutoGPT PR: 0
File: autogpt_platform/backend/CLAUDE.md:0-0
Timestamp: 2026-03-25T06:59:27.340Z
Learning: Applies to autogpt_platform/backend/**/*_test.py : Use `AsyncMock` from `unittest.mock` for mocking async functions

Applied to files:

autogpt_platform/backend/backend/copilot/sdk/tool_adapter_test.py

📚 Learning: 2026-03-19T15:10:53.815Z

Learnt from: majdyz
Repo: Significant-Gravitas/AutoGPT PR: 12483
File: autogpt_platform/backend/backend/copilot/tools/test_dry_run.py:298-303
Timestamp: 2026-03-19T15:10:53.815Z
Learning: In Python unittest.mock, the correct patch target depends on whether an import is eager (module-level) or lazy (inside a function/branch):
- **Module-level import** (`from foo.bar import baz` at top of file): patch where the name is used, e.g. `patch("mymodule.baz")`.
- **Lazy import** (`from foo.bar import baz` inside a function/branch, executed at call time): patch the source module, e.g. `patch("foo.bar.baz")`, because the fresh `from ... import` at call time will look up the (now-patched) name in the source module's dict.
This pattern appears in `autogpt_platform/backend/backend/copilot/tools/helpers.py` where `simulate_block` is lazily imported inside the `if dry_run:` block, making `patch("backend.executor.simulator.simulate_block")` the correct target in tests.

Applied to files:

autogpt_platform/backend/backend/copilot/sdk/tool_adapter_test.py
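The eager-versus-lazy patch-target rule from the learning above can be checked with a self-contained sketch. The modules `source_mod`, `eager_mod`, and `lazy_mod` are hypothetical and built in memory purely for illustration; they are not part of the AutoGPT codebase.

```python
import sys
import types
from unittest.mock import patch

# Hypothetical source module that defines the real function.
source_mod = types.ModuleType("source_mod")
exec("def helper():\n    return 'real'\n", source_mod.__dict__)
sys.modules["source_mod"] = source_mod

# Consumer with a module-level (eager) import: it keeps its own reference.
eager_mod = types.ModuleType("eager_mod")
exec(
    "from source_mod import helper\n"
    "def run():\n"
    "    return helper()\n",
    eager_mod.__dict__,
)
sys.modules["eager_mod"] = eager_mod

# Consumer with a lazy import: the name is looked up at call time.
lazy_mod = types.ModuleType("lazy_mod")
exec(
    "def run():\n"
    "    from source_mod import helper\n"
    "    return helper()\n",
    lazy_mod.__dict__,
)
sys.modules["lazy_mod"] = lazy_mod

# Eager import: patch the name where it is used.
with patch("eager_mod.helper", return_value="mocked"):
    eager_result = eager_mod.run()

# Lazy import: patch the source module, since the import runs at call time.
with patch("source_mod.helper", return_value="mocked"):
    lazy_result = lazy_mod.run()
    eager_unaffected = eager_mod.run()  # eager_mod still holds the original

print(eager_result, lazy_result, eager_unaffected)
```

Patching `source_mod.helper` leaves `eager_mod` untouched because its reference was bound at import time. That is why the two cases need different targets.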

🔇 Additional comments (13)

autogpt_platform/backend/backend/copilot/tools/base.py (1)

122-130: LGTM!

Clean implementation of the read_only property with a sensible default (False) and clear documentation explaining the MCP readOnlyHint annotation and parallel dispatch behavior. The opt-in pattern ensures new tools are safe by default.
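A minimal sketch of the opt-in pattern described here, under stated assumptions: `BaseTool` is simplified, `annotations_for` is a hypothetical helper, and a plain dict stands in for the MCP annotations object that the real adapter presumably builds. The tool names come from the PR description; the class bodies are illustrative only.

```python
from abc import ABC, abstractmethod
from typing import Optional


class BaseTool(ABC):
    """Simplified stand-in for the copilot tool base class."""

    @property
    @abstractmethod
    def name(self) -> str:
        """Tool name exposed to the SDK."""

    @property
    def read_only(self) -> bool:
        # Safe default: tools are assumed to have side effects.
        return False


class FindBlockTool(BaseTool):
    @property
    def name(self) -> str:
        return "find_block"

    @property
    def read_only(self) -> bool:
        return True  # opt in: a search with no side effects


class RunBlockTool(BaseTool):
    @property
    def name(self) -> str:
        return "run_block"  # inherits read_only == False


def annotations_for(tool: BaseTool) -> Optional[dict]:
    """Hypothetical helper: emit readOnlyHint only for read-only tools."""
    return {"readOnlyHint": True} if tool.read_only else None


for tool in (FindBlockTool(), RunBlockTool()):
    print(tool.name, annotations_for(tool))
```

Because the default is `False`, a new tool that forgets to declare anything gets no annotation and stays on the sequential path.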

autogpt_platform/backend/backend/copilot/tools/manage_folders.py (1)

198-200: LGTM!

ListFoldersTool correctly marked as read-only since it only performs list/read operations. The mutating tools (CreateFolderTool, UpdateFolderTool, MoveFolderTool, DeleteFolderTool, MoveAgentsToFolderTool) appropriately inherit the default False.

autogpt_platform/backend/backend/copilot/tools/search_docs.py (1)

56-58: LGTM!

Correctly marked as read-only — the tool only performs documentation searches with no side effects.

autogpt_platform/backend/backend/copilot/tools/get_doc_page.py (1)

44-46: LGTM!

Correctly marked as read-only — the tool only reads documentation file content with no side effects.

autogpt_platform/backend/backend/copilot/tools/feature_requests.py (1)

152-154: LGTM!

SearchFeatureRequestsTool correctly marked as read-only since it only queries Linear for existing issues. The CreateFeatureRequestTool appropriately inherits the default False as it performs mutations.

autogpt_platform/backend/backend/copilot/tools/agent_output.py (1)

163-165: LGTM!

Correctly marked as read-only — the tool only retrieves and views execution outputs without mutations. The wait_for_execution call is a polling/read operation, not a side effect.

autogpt_platform/backend/backend/copilot/tools/get_agent_building_guide.py (1)

55-57: LGTM!

Correctly marked as read-only — the tool only reads and returns the agent building guide content. The internal module-level caching is an implementation detail, not an external side effect.

autogpt_platform/backend/backend/copilot/tools/find_block.py (1)

81-83: LGTM!

Correctly marked as read-only — the tool only searches for and retrieves block metadata with no side effects.

autogpt_platform/backend/backend/copilot/tools/find_agent.py (1)

36-38: Looks good: FindAgentTool read-only classification is appropriate.

This aligns with the non-mutating behavior of the tool and fits the new parallel dispatch model.

autogpt_platform/backend/backend/copilot/tools/get_mcp_guide.py (1)

51-53: GetMCPGuideTool read-only flag is correct.

Good fit for SDK readOnlyHint-based dispatch.

autogpt_platform/backend/backend/copilot/tools/workspace_files.py (1)

445-447: ListWorkspaceFilesTool read-only flag looks correct.

This tool is query-only, so read_only=True is a good fit.

autogpt_platform/backend/backend/copilot/tools/find_library_agent.py (1)

39-41: Read-only classification for FindLibraryAgentTool looks good.

Consistent with this tool’s query-only behavior.

autogpt_platform/backend/backend/copilot/sdk/tool_adapter_test.py (1)

279-333: Nice coverage update for direct-execution semantics.

These tests clearly validate the post–pre-launch behavior and duplicate-execution guard.

majdyz · 2026-03-31T17:34:53Z

E2E Test Report

Date: 2026-03-31 | Branch: fix/copilot-duplicate-block-execution | SDK: claude_agent_sdk 0.1.45

| # | Scenario | Result |
|---|----------|--------|
| 1 | CoPilot basic chat (no regression) | PASS |
| 2 | CoPilot tool calls work (find_block x2) | PASS |
| 3 | No duplicate tool execution in logs | PASS |
| 4 | readOnlyHint parallel dispatch working | PASS |

Key Findings

Pre-launch mechanism fully removed: 0 log matches for pre_launch, arg mismatch, cancel_pending, or Preparing block.

readOnlyHint parallel dispatch confirmed: Two find_block calls dispatched 4ms apart (parallel), both completed at ~6s. Without readOnlyHint, these would run sequentially (~12s).

No duplicate execution: Each PostToolUse hook fired exactly once per tool call (2 calls, 2 hooks).

No regressions: CoPilot chat, tool calls, session persistence, and E2B sandbox all function correctly.

Full report: test-results/PR-12632-fix-backend-copilot-prevent-duplicate-block-execution/test-report.md

…dOnlyHint Annotate all MCP tools (including side-effect tools like run_block, bash_exec) with readOnlyHint=True so the SDK CLI dispatches all concurrent tool calls in parallel. E2E verified: 3 bash_exec(sleep 3) calls completed in ~3.3s total (vs ~9s sequential), 3 find_block calls dispatched within 5ms.

majdyz · 2026-03-31T19:28:06Z

E2E Test Report — Parallel Dispatch (all tools readOnlyHint=True)

Test: 3x `find_block` parallel dispatch

Tool inputs (dispatched within 5ms):
  19:23:16.384  find_block("send email")
  19:23:16.386  find_block("slack message")     +2ms
  19:23:16.389  find_block("http request")      +5ms

Tool outputs (completed near-simultaneously):
  19:23:22.348  output: 3,578 chars
  19:23:22.350  output: 3,753 chars              +2ms
  19:23:22.351  output: 3,855 chars              +3ms

Result: 3 tools in ~6s (vs ~18s sequential) — 3x speedup

Test: 3x `bash_exec(sleep 3)` parallel dispatch

Tool inputs (dispatched within 5ms):
  19:25:09.628  bash_exec
  19:25:09.631  bash_exec                        +3ms
  19:25:09.633  bash_exec                        +5ms

Tool completions:
  19:25:12.942  PostToolUse                      ~3.3s total
  19:25:12.999  PostToolUse                      +57ms
  19:25:13.002  PostToolUse                      +60ms

Result: 3x sleep 3 completed in 3.3s (vs 9s sequential) — 2.7x speedup, proving bash_exec (side-effect tool) also runs in parallel with readOnlyHint=True.
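
The speedup figures above follow from simple arithmetic: three 3-second sleeps take about 9s back to back, and 9 / 3.3 ≈ 2.7 when they overlap. The sketch below illustrates that effect with plain `asyncio`; it is not the SDK CLI's dispatch code, and `fake_tool`, `run_sequential`, `run_parallel`, and `measure` are hypothetical names using short delays so the example runs quickly.

```python
import asyncio
import time


async def fake_tool(delay: float) -> float:
    # Stand-in for a tool call such as bash_exec(sleep 3).
    await asyncio.sleep(delay)
    return delay


async def run_sequential(delays: list[float]) -> float:
    # Await each call before starting the next one.
    start = time.perf_counter()
    for delay in delays:
        await fake_tool(delay)
    return time.perf_counter() - start


async def run_parallel(delays: list[float]) -> float:
    # Start all calls at once and wait for the slowest.
    start = time.perf_counter()
    await asyncio.gather(*(fake_tool(delay) for delay in delays))
    return time.perf_counter() - start


def measure(delay: float = 0.05, count: int = 3) -> tuple[float, float]:
    delays = [delay] * count
    sequential = asyncio.run(run_sequential(delays))
    parallel = asyncio.run(run_parallel(delays))
    return sequential, parallel
```

With equal delays, the sequential time grows with the number of calls while the parallel time stays close to a single delay, which matches the shape of the measurements in the report.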

Negative checks

pre_launch|arg mismatch|cancel_pending: 0 matches in all logs
Preparing block duplicates: 0
No duplicate tool execution of any kind

coderabbitai

Actionable comments posted: 3

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@autogpt_platform/backend/backend/copilot/sdk/tool_adapter.py`:
- Around line 537-542: The code currently marks every E2B file tool as read-only
by always passing _READONLY_ANNOTATION when creating the decorated tool; change
this to consult the _READ_ONLY_E2B_TOOLS set so that writeable tools (e.g.,
write_file, edit_file) are not annotated as read-only. Locate the tool creation
call that uses tool(name, desc, schema,
annotations=_READONLY_ANNOTATION)(_truncating(handler, name)) and modify it to
conditionally include _READONLY_ANNOTATION only when name is present in
_READ_ONLY_E2B_TOOLS (otherwise omit the annotations argument or pass None),
keeping the use of _truncating(handler, name) the same. Ensure the change
references the same symbols: tool, _truncating, handler, name,
_READ_ONLY_E2B_TOOLS, and _READONLY_ANNOTATION so write/edit tools can run
non-parallelizable operations safely.
- Around line 83-86: The _READ_ONLY_E2B_TOOLS frozenset is declared but never
used; update the E2B tool registration code that currently applies
_READONLY_ANNOTATION to all E2B tools unconditionally so it instead checks
membership in _READ_ONLY_E2B_TOOLS and only adds _READONLY_ANNOTATION for tools
whose names are in that set (or, if the intent is to mark all E2B tools
read-only, remove the unused _READ_ONLY_E2B_TOOLS constant). Specifically,
modify the E2B registration block that applies _READONLY_ANNOTATION to consult
_READ_ONLY_E2B_TOOLS when deciding to annotate each tool (referencing
_READ_ONLY_E2B_TOOLS and _READONLY_ANNOTATION to locate the change).
- Around line 450-453: The code currently applies _READONLY_ANNOTATION to every
entry from TOOL_REGISTRY; update the logic that builds the tool annotations (the
loop over TOOL_REGISTRY that adds _READONLY_ANNOTATION at/near where base_tool
is referenced) to only add that annotation when the tool's class property
BaseTool.read_only (accessed via base_tool.read_only) is truthy; in other words,
check base_tool.read_only before applying _READONLY_ANNOTATION so only tools
whose subclass overrides read_only=True (e.g., feature_requests, find_agent,
web_fetch, etc.) get readOnlyHint=True.
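
For illustration only, the conditional annotation these comments ask for could look roughly like the sketch below. The `Annotations` dataclass is a hypothetical stand-in for `mcp.types.ToolAnnotations`, and the tool classes, `annotation_for`, and `build_annotations` are invented for the example. Note that a later commit in this thread takes the opposite route and applies the annotation to every tool unconditionally.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Annotations:
    # Hypothetical stand-in for mcp.types.ToolAnnotations.
    readOnlyHint: bool = False


class BaseTool:
    name = "base"

    @property
    def read_only(self) -> bool:
        # Safe default: tools are assumed to have side effects.
        return False


class FindBlockTool(BaseTool):
    name = "find_block"

    @property
    def read_only(self) -> bool:
        return True


class RunBlockTool(BaseTool):
    name = "run_block"  # inherits read_only == False


READONLY = Annotations(readOnlyHint=True)


def annotation_for(tool: BaseTool) -> Optional[Annotations]:
    # Only tools that opt in get the read-only annotation.
    return READONLY if tool.read_only else None


def build_annotations(
    registry: dict[str, BaseTool],
) -> dict[str, Optional[Annotations]]:
    return {name: annotation_for(tool) for name, tool in registry.items()}
```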

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: d4584f49-d0d8-458c-8952-1ab04ecaf796

📥 Commits

Reviewing files that changed from the base of the PR and between 23cdf29 and 8c45502.

📒 Files selected for processing (1)

autogpt_platform/backend/backend/copilot/sdk/tool_adapter.py

📜 Review details

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (13)

GitHub Check: test (3.11)
GitHub Check: end-to-end tests
GitHub Check: get-changed-parts
GitHub Check: setup
GitHub Check: type-check (3.11)
GitHub Check: test (3.13)
GitHub Check: type-check (3.12)
GitHub Check: type-check (3.13)
GitHub Check: lint
GitHub Check: Analyze (python)
GitHub Check: Analyze (typescript)
GitHub Check: check-overlaps
GitHub Check: Seer Code Review

🧰 Additional context used

📓 Path-based instructions (2)

autogpt_platform/backend/**/*.py

📄 CodeRabbit inference engine (.github/copilot-instructions.md)

autogpt_platform/backend/**/*.py: Use Python 3.11 (required; managed by Poetry via pyproject.toml) for backend development
Always run 'poetry run format' (Black + isort) before linting in backend development
Always run 'poetry run lint' (ruff) after formatting in backend development

Refer to @backend/CLAUDE.md for backend-specific commands, architecture, and development tasks

autogpt_platform/backend/**/*.py: Import only at the top level; no local/inner imports except for lazy imports of heavy optional dependencies like openpyxl
Use absolute imports with from backend.module import ... for cross-package imports; single-dot relative imports (from .sibling import ...) are acceptable for sibling modules within the same package; avoid double-dot relative imports (from ..parent import ...)
Do not use duck typing with hasattr(), getattr(), or isinstance() for type dispatch; use typed interfaces, unions, or protocols instead
Use Pydantic models for structured data instead of dataclasses, namedtuples, or dicts
Do not use linter suppressors; no # type: ignore, # noqa, or # pyright: ignore comments — fix the underlying type/code issue instead
Use list comprehensions instead of manual loop-and-append patterns
Use early return guard clauses to avoid deep nesting
Use %s for deferred interpolation in debug log statements; use f-strings for readability in other log levels (e.g., logger.debug("Processing %s items", count), logger.info(f"Processing {count} items"))
Sanitize error paths using os.path.basename() in error messages to avoid leaking directory structure
Avoid TOCTOU (time-of-check-time-of-use) patterns; do not use check-then-act patterns for file access and credit charging operations
Use Redis pipelines with transaction=True for atomicity on multi-step Redis operations
Use max(0, value) guards for computed values that should never be negative
Keep files under ~300 lines; if a file grows beyond this, split by responsibility (extract h...

Files:

autogpt_platform/backend/backend/copilot/sdk/tool_adapter.py

autogpt_platform/**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

Format Python code with poetry run format

Files:

autogpt_platform/backend/backend/copilot/sdk/tool_adapter.py

🧠 Learnings (8)

📓 Common learnings

Learnt from: majdyz
Repo: Significant-Gravitas/AutoGPT PR: 12356
File: autogpt_platform/backend/backend/copilot/constants.py:9-12
Timestamp: 2026-03-10T08:39:22.025Z
Learning: In Significant-Gravitas/AutoGPT PR `#12356`, the `COPILOT_SYNTHETIC_ID_PREFIX = "copilot-"` check in `create_auto_approval_record` (human_review.py) is intentional and safe. The `graph_exec_id` passed to this function comes from server-side `PendingHumanReview` DB records (not from user input); the API only accepts `node_exec_id` from users. Synthetic `copilot-*` IDs are only ever created server-side in `run_block.py`. The prefix skip avoids a DB lookup for a `AgentGraphExecution` record that legitimately does not exist for CoPilot sessions, while `user_id` scoping is enforced at the auth layer and on the resulting auto-approval record.

Learnt from: majdyz
Repo: Significant-Gravitas/AutoGPT PR: 12440
File: autogpt_platform/backend/backend/copilot/workflow_import/converter.py:0-0
Timestamp: 2026-03-17T10:57:12.953Z
Learning: In Significant-Gravitas/AutoGPT PR `#12440`, `autogpt_platform/backend/backend/copilot/workflow_import/converter.py` was fully rewritten (commit 732960e2d) to no longer make direct LLM/OpenAI API calls. The converter now builds a structured text prompt for AutoPilot/CoPilot instead. There is no `response.choices` access or any direct LLM client usage in this file. Do not flag `response.choices` access or LLM client initialization patterns as issues in this file.

Learnt from: majdyz
Repo: Significant-Gravitas/AutoGPT PR: 12439
File: autogpt_platform/backend/backend/blocks/autogpt_copilot.py:0-0
Timestamp: 2026-03-16T17:00:02.827Z
Learning: In autogpt_platform/backend/backend/blocks/autogpt_copilot.py, the recursion guard uses two module-level ContextVars: `_copilot_recursion_depth` (tracks current nesting depth) and `_copilot_recursion_limit` (stores the chain-wide ceiling). On the first invocation, `_copilot_recursion_limit` is set to `max_recursion_depth`; nested calls use `min(inherited_limit, max_recursion_depth)`, so they can only lower the cap, never raise it. The entry/exit logic is extracted into module-level helper functions. This is the approved pattern for preventing runaway sub-agent recursion in AutogptCopilotBlock (PR `#12439`, commits 348e9f8e2 and 3b70f61b1).

Learnt from: majdyz
Repo: Significant-Gravitas/AutoGPT PR: 12356
File: autogpt_platform/backend/backend/copilot/tools/run_block.py:297-300
Timestamp: 2026-03-10T08:38:33.249Z
Learning: In autogpt_platform/backend/backend/copilot/tools/run_block.py, the auto-approval key for sensitive block HITL review uses graph_exec_id (copilot-session-{session_id}) + node_id (copilot-node-{block_id}). This is intentional: approving a block type within a CoPilot session auto-approves all future invocations of that same block type within the same session, mirroring how auto-approve works in normal graph execution. The user explicitly opts into this session-scoped behavior via an auto-approve toggle. Without the toggle (default), each individual invocation requires its own approval.

Learnt from: majdyz
Repo: Significant-Gravitas/AutoGPT PR: 12213
File: autogpt_platform/frontend/src/app/api/openapi.json:9983-9995
Timestamp: 2026-02-27T15:59:00.370Z
Learning: Repo: Significant-Gravitas/AutoGPT PR: 12213 — OpenAPI/codegen
Learning: Ensuring a field is required in generated TS types needs two sides: (1) no default value on the Pydantic field, and (2) the OpenAPI model's "required" array must list it. For MCPToolInfo, making input_schema required in OpenAPI and removing Field(default_factory=dict) in the backend prevents optional typing drift.

Learnt from: majdyz
Repo: Significant-Gravitas/AutoGPT PR: 12356
File: autogpt_platform/backend/backend/copilot/tools/run_block.py:349-370
Timestamp: 2026-03-10T08:38:36.655Z
Learning: In the AutoGPT CoPilot HITL (Human-In-The-Loop) flow (`autogpt_platform/backend/backend/copilot/tools/run_block.py`), the review card presented to users sets `editable: false`, meaning reviewers cannot modify the input payload. Therefore, credentials resolved before `is_block_exec_need_review()` remain valid and do not need to be recomputed after the review step — the original `input_data` is unchanged through the review lifecycle.

Learnt from: Pwuts
Repo: Significant-Gravitas/AutoGPT PR: 12284
File: autogpt_platform/frontend/src/app/api/openapi.json:11897-11900
Timestamp: 2026-03-04T23:58:18.476Z
Learning: Repo: Significant-Gravitas/AutoGPT — PR `#12284`
Backend/frontend OpenAPI codegen convention: In backend/api/features/store/model.py, the StoreSubmission and StoreSubmissionAdminView models define submitted_at: datetime | None, changes_summary: str | None, and instructions: str | None with no default. This is intentional to produce “required but nullable” fields in OpenAPI (properties appear in required[] and use anyOf [type, null]). This matches Prisma’s submittedAt DateTime? and changesSummary String?. Do not flag this as a required/nullable mismatch.

Learnt from: Abhi1992002
Repo: Significant-Gravitas/AutoGPT PR: 12417
File: autogpt_platform/backend/backend/blocks/agent_mail/threads.py:80-102
Timestamp: 2026-03-16T16:30:20.657Z
Learning: In autogpt_platform/backend/backend/blocks/agent_mail/ blocks (and across the codebase), wrapping synchronous AgentMail SDK calls with `await asyncio.to_thread()` is NOT required. The block executor runs node execution in dedicated threads via `asyncio.run_coroutine_threadsafe` (manager.py lines ~745-752, ~1079), and the existing codebase pattern does not use `asyncio.to_thread` for SDK calls inside async `run()` methods.

📚 Learning: 2026-03-17T10:57:12.953Z

Learnt from: majdyz
Repo: Significant-Gravitas/AutoGPT PR: 12440
File: autogpt_platform/backend/backend/copilot/workflow_import/converter.py:0-0
Timestamp: 2026-03-17T10:57:12.953Z
Learning: In Significant-Gravitas/AutoGPT PR `#12440`, `autogpt_platform/backend/backend/copilot/workflow_import/converter.py` was fully rewritten (commit 732960e2d) to no longer make direct LLM/OpenAI API calls. The converter now builds a structured text prompt for AutoPilot/CoPilot instead. There is no `response.choices` access or any direct LLM client usage in this file. Do not flag `response.choices` access or LLM client initialization patterns as issues in this file.

Applied to files:

autogpt_platform/backend/backend/copilot/sdk/tool_adapter.py

📚 Learning: 2026-03-04T12:19:43.066Z

Learnt from: majdyz
Repo: Significant-Gravitas/AutoGPT PR: 12279
File: autogpt_platform/backend/backend/copilot/tools/base.py:184-188
Timestamp: 2026-03-04T12:19:43.066Z
Learning: In the AutoGPT Copilot backend (autogpt_platform/backend/backend/copilot/tools/), anonymous users always have user_id=None when passed to tool execution methods. The "anon_" prefix (e.g., "anon_123") is only used for PostHog/analytics tracking distinct_id and is never used as an actual user_id passed to tools. A simple truthiness check (`and user_id`) is sufficient to distinguish anonymous from authenticated users.

Applied to files:

autogpt_platform/backend/backend/copilot/sdk/tool_adapter.py

📚 Learning: 2026-02-26T17:02:22.448Z

Learnt from: Pwuts
Repo: Significant-Gravitas/AutoGPT PR: 12211
File: .pre-commit-config.yaml:160-179
Timestamp: 2026-02-26T17:02:22.448Z
Learning: Keep the pre-commit hook pattern broad for autogpt_platform/backend to ensure OpenAPI schema changes are captured. Do not narrow to backend/api/ alone, since the generated schema depends on Pydantic models across multiple directories (backend/data/, backend/blocks/, backend/copilot/, backend/integrations/, backend/util/). Narrowing could miss schema changes and cause frontend type desynchronization.

Applied to files:

autogpt_platform/backend/backend/copilot/sdk/tool_adapter.py

📚 Learning: 2026-03-04T08:04:35.881Z

Learnt from: majdyz
Repo: Significant-Gravitas/AutoGPT PR: 12273
File: autogpt_platform/backend/backend/copilot/tools/workspace_files.py:216-220
Timestamp: 2026-03-04T08:04:35.881Z
Learning: In the AutoGPT Copilot backend, ensure that SVG images are not treated as vision image types by excluding 'image/svg+xml' from INLINEABLE_MIME_TYPES and MULTIMODAL_TYPES in tool_adapter.py; the Claude API supports PNG, JPEG, GIF, and WebP for vision. SVGs (XML text) should be handled via the text path instead, not the vision path.

Applied to files:

autogpt_platform/backend/backend/copilot/sdk/tool_adapter.py

📚 Learning: 2026-03-05T15:42:08.207Z

Learnt from: ntindle
Repo: Significant-Gravitas/AutoGPT PR: 12297
File: .claude/skills/backend-check/SKILL.md:14-16
Timestamp: 2026-03-05T15:42:08.207Z
Learning: In Python files under autogpt_platform/backend (recursively), rely on poetry run format to perform formatting (Black + isort) and linting (ruff). Do not run poetry run lint as a separate step after poetry run format, since format already includes linting checks.

Applied to files:

autogpt_platform/backend/backend/copilot/sdk/tool_adapter.py

📚 Learning: 2026-03-16T16:35:40.236Z

Learnt from: majdyz
Repo: Significant-Gravitas/AutoGPT PR: 12440
File: autogpt_platform/backend/backend/api/features/workflow_import.py:54-63
Timestamp: 2026-03-16T16:35:40.236Z
Learning: Avoid using the word 'competitor' in public-facing identifiers and text. Use neutral naming for API paths, model names, function names, and UI text. Examples: rename 'CompetitorFormat' to 'SourcePlatform', 'convert_competitor_workflow' to 'convert_workflow', '/competitor-workflow' to '/workflow'. Apply this guideline to files under autogpt_platform/backend and autogpt_platform/frontend.

Applied to files:

autogpt_platform/backend/backend/copilot/sdk/tool_adapter.py

📚 Learning: 2026-03-31T15:37:38.626Z

Learnt from: majdyz
Repo: Significant-Gravitas/AutoGPT PR: 12623
File: autogpt_platform/backend/backend/copilot/tools/agent_generator/fixer.py:37-47
Timestamp: 2026-03-31T15:37:38.626Z
Learning: When validating/constructing Anthropic API model IDs in Significant-Gravitas/AutoGPT, allow the hyphen-separated Claude Opus 4.6 model ID `claude-opus-4-6` (it corresponds to `LlmModel.CLAUDE_4_6_OPUS` in `autogpt_platform/backend/backend/blocks/llm.py`). Do NOT require the dot-separated form in Anthropic contexts. Only OpenRouter routing variants should use the dot separator (e.g., `anthropic/claude-opus-4.6`); `claude-opus-4-6` should be treated as correct when passed to Anthropic, and flagged only if it’s used in the OpenRouter path where the dot form is expected.

Applied to files:

autogpt_platform/backend/backend/copilot/sdk/tool_adapter.py

🔇 Additional comments (3)

autogpt_platform/backend/backend/copilot/sdk/tool_adapter.py (3)

17-17: LGTM — new import for ToolAnnotations.

The import from mcp.types is appropriate for the new readOnlyHint annotation functionality.

444-445: LGTM — readonly annotation constant.

Extracting ToolAnnotations(readOnlyHint=True) as a module-level constant (_READONLY_ANNOTATION) avoids repeated allocations and clearly documents intent.

545-551: LGTM — Read tool marked as read-only.

The Read tool is correctly annotated with readOnlyHint=True since reading files has no side effects.

majdyz · 2026-03-31T19:39:17Z

/review

autogpt-pr-reviewer · 2026-03-31T19:39:27Z

Queued a review for PR #12632 at 8c45502.

…d _READ_ONLY_E2B_TOOLS Address review: since all tools now get readOnlyHint=True unconditionally, the BaseTool.read_only property and 15 tool class overrides were dead code. Remove them along with _READ_ONLY_E2B_TOOLS. Fix docstring to match.

majdyz · 2026-03-31T20:33:13Z

🤖 🔵 Nit (Round 2): Comment in service.py line 1296 says "parallel execution of read-only tools" but all tools now have readOnlyHint. Should say "parallel execution of all tools".

majdyz · 2026-03-31T20:34:01Z

🤖 🔵 Nit (Round 3): _READONLY_ANNOTATION is misleading now that it's applied to ALL tools (including write/side-effect tools). Rename to _PARALLEL_ANNOTATION to reflect its actual purpose: enabling parallel dispatch.

…NNOTATION Address review nits: rename misleading constant (all tools get it, not just read-only), fix service.py comment to say "all tools".

github-actions · 2026-03-31T21:20:57Z

This pull request has conflicts with the base branch, please resolve those so we can evaluate the pull request.

Keep readOnlyHint test, add SDK_DISALLOWED_TOOLS tests from dev.

github-actions · 2026-04-01T04:17:02Z

Conflicts have been resolved! 🎉 A maintainer will review the pull request shortly.

majdyz · 2026-04-01T04:21:19Z

/review

autogpt-pr-reviewer · 2026-04-01T04:21:36Z

Queued a review for PR #12632 at cc4cb29.

Reproduce the three production bugs that the pre-launch speculative execution mechanism caused, verifying the current direct-execution handler is free of them:

Bug 1 (SECRT-2204): Duplicate execution on arg mismatch
- test_single_execution_even_with_different_arg_representations
- test_concurrent_calls_each_execute_once

Bug 2: FIFO desync when security hook denies a tool
- test_skipped_call_does_not_affect_subsequent_calls
- test_handler_has_no_shared_queue_state

Bug 3: Cancel race condition (task completes before cancel)
- test_no_speculative_execution_before_handler_called
- test_failed_execution_does_not_leave_orphaned_tasks
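
To make Bug 2 concrete, here is a toy model of the FIFO desync, assuming a single shared queue of pre-launched results. It is not the removed production code; `prelaunch_dispatch` and `direct_dispatch` are hypothetical names, and the result strings are placeholders.

```python
from collections import deque


def prelaunch_dispatch(calls: list[str], denied: set[str]) -> dict[str, str]:
    """Toy pre-launch flow: every call is started speculatively and its
    result queued; each real dispatch pops the head of the queue."""
    queue: deque[str] = deque(f"result-for-{call}" for call in calls)
    delivered: dict[str, str] = {}
    for call in calls:
        if call in denied:
            # The hook denied this call, so no dispatch happens,
            # but its pre-launched entry stays at the head of the queue.
            continue
        delivered[call] = queue.popleft()
    return delivered


def direct_dispatch(calls: list[str], denied: set[str]) -> dict[str, str]:
    """Direct execution: each dispatched call computes its own result."""
    return {call: f"result-for-{call}" for call in calls if call not in denied}
```

With calls `["a", "b", "c"]` and `"a"` denied, the pre-launch model hands `b` the result that was computed for `a`, and every later call is shifted by one. The direct model has no shared queue, so a skipped call cannot affect the others.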

Each bug has two tests: one that reproduces the old buggy pre-launch behavior (xfail, proving the bug exists) and one that verifies the current clean handler is free of it (pass).

- Bug 1 (SECRT-2204): duplicate execution on arg mismatch
- Bug 2: FIFO desync when tool denied by security hook
- Bug 3: cancel race (task completes before cancel arrives)
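
The cancel race in Bug 3 can be sketched in its simplest form with plain `asyncio`: once a task has finished, `Task.cancel()` returns `False` and the side effect has already happened. The names `create_issue`, `cancel_too_late`, and `demo` are hypothetical, and appending to a list stands in for an irreversible HTTP call.

```python
import asyncio


async def create_issue(log: list[str]) -> str:
    # Stand-in for an irreversible side effect such as an HTTP POST.
    log.append("issue created")
    return "ISSUE-1"


async def cancel_too_late() -> tuple[bool, list[str]]:
    log: list[str] = []
    task = asyncio.create_task(create_issue(log))
    # Yield to the event loop once so the task runs to completion.
    await asyncio.sleep(0)
    # cancel() returns False when the task is already done.
    cancelled = task.cancel()
    return cancelled, log


def demo() -> tuple[bool, list[str]]:
    return asyncio.run(cancel_too_late())
```

The production case is subtler, since a cancel can also land while a request is in flight after the remote service has acted on it, but the outcome is the same: cancellation cannot undo work that has already been done.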

autogpt-pr-reviewer · 2026-04-01T04:57:14Z

⚠️ Code review could not be completed

The review could not start because a setup step failed (e.g., dependency installation, repo clone). This is usually a repository configuration issue — check that your lock files are up to date and CI passes.

If this persists, please contact support with job ID b0b09272-85dc-4076-9b0d-c3ece66962eb.

Details: git clone failed (exit_code=1): Cloning into '/home/user/repo'... Updating files: 84% (4220/5021) ... 96% (4821/5021) (output truncated)

majdyz · 2026-04-01T05:11:06Z

/review

autogpt-pr-reviewer · 2026-04-01T05:11:24Z

Queued a review for PR #12632 at 0e2ae0d.

…e annotation test Add docstring note explaining why side-effect tools use readOnlyHint=True (deliberate override to avoid pre-launch duplicate-execution bug). Replace trivial ToolAnnotations construction test with assertion against the actual _PARALLEL_ANNOTATION constant.

autogpt-pr-reviewer

4/8 done (security, architect, performance, discussion). Testing in progress. Quality, product, UI-reviewer still queued. Waiting 3 more minutes.

majdyz

Test Report (Round 2) — Local Verification

Branch: fix/copilot-duplicate-block-execution | Worktree: AutoGPT3

Unit Tests

poetry run pytest backend/copilot/sdk/tool_adapter_test.py -x -q
30 passed, 3 xfailed in 20.56s

All 30 tests pass. The 3 xfail tests are regression tests that reproduce the three old bugs inline and confirm they fail as expected.

Code Review Checklist

| Check | Result |
|-------|--------|
| Pre-launch infrastructure fully removed from production code | ✅ |
| `readOnlyHint=True` annotation applied to all tools (TOOL_REGISTRY, E2B, Read) | ✅ |
| No dangling references to removed functions (`pre_launch_tool_call`, `cancel_pending_tool_tasks`, `_tool_task_queues`) | ✅ |
| All 4 `cancel_pending_tool_tasks()` calls removed from `service.py` | ✅ |
| `service.py` imports clean (no removed symbols) | ✅ |
| Python import verification passes | ✅ |
| `tool_handler()` executes exactly once per call (no speculative execution) | ✅ |
| Regression tests prove all 3 bugs (arg mismatch, FIFO desync, cancel race) | ✅ |

Verdict

PASS — Clean removal of pre-launch mechanism (-578 lines), replaced with SDK-native parallel dispatch via readOnlyHint annotations (+105 lines). Comprehensive regression tests included.

Abhi1992002

LGTM

Code-wise looks good
Tested locally as well - tools are running in parallel perfectly

majdyz requested a review from a team as a code owner March 31, 2026 16:37

majdyz requested review from 0ubbe and ntindle and removed request for a team March 31, 2026 16:37

github-project-automation Bot added this to AutoGPT development kanban Mar 31, 2026

github-project-automation Bot moved this to 🆕 Needs initial review in AutoGPT development kanban Mar 31, 2026

github-actions Bot added platform/backend AutoGPT Platform - Back end size/l labels Mar 31, 2026

majdyz commented Mar 31, 2026

Comment thread autogpt_platform/backend/backend/copilot/sdk/tool_adapter.py Outdated

majdyz commented Mar 31, 2026

Comment thread autogpt_platform/backend/backend/copilot/sdk/tool_adapter.py Outdated

coderabbitai Bot reviewed Mar 31, 2026

github-actions Bot added the size/xl label Mar 31, 2026

majdyz commented Mar 31, 2026

Comment thread autogpt_platform/backend/backend/copilot/sdk/tool_adapter.py Outdated

coderabbitai Bot reviewed Mar 31, 2026

Comment thread autogpt_platform/backend/backend/copilot/tools/agent_browser.py Outdated

Comment thread autogpt_platform/backend/backend/copilot/tools/workspace_files.py Outdated

github-actions Bot mentioned this pull request Mar 31, 2026

feat(platform): add copilot artifact preview panel #12629

Merged

10 tasks

sentry Bot reviewed Mar 31, 2026

Comment thread autogpt_platform/backend/backend/copilot/sdk/tool_adapter.py

coderabbitai Bot reviewed Mar 31, 2026

Comment thread autogpt_platform/backend/backend/copilot/sdk/tool_adapter.py Outdated

Comment thread autogpt_platform/backend/backend/copilot/sdk/tool_adapter.py Outdated

Comment thread autogpt_platform/backend/backend/copilot/sdk/tool_adapter.py

refactor(backend/copilot): rename _READONLY_ANNOTATION to _PARALLEL_ANNOTATION

c88ca88

Address review nits: rename misleading constant (all tools get it, not just read-only), fix service.py comment to say "all tools".

github-actions Bot added the conflicts Automatically applied to PRs with merge conflicts label Mar 31, 2026

fix: resolve merge conflict in tool_adapter_test.py

cc4cb29

Keep readOnlyHint test, add SDK_DISALLOWED_TOOLS tests from dev.

github-actions Bot removed the conflicts Automatically applied to PRs with merge conflicts label Apr 1, 2026

majdyz added 2 commits April 1, 2026 06:32

autogpt-pr-reviewer Bot reviewed Apr 1, 2026

majdyz commented Apr 1, 2026

Abhi1992002 self-requested a review April 1, 2026 13:12

Abhi1992002 approved these changes Apr 1, 2026

github-project-automation Bot moved this from 🆕 Needs initial review to 👍🏼 Mergeable in AutoGPT development kanban Apr 1, 2026

majdyz added this pull request to the merge queue Apr 1, 2026

Merged via the queue into dev with commit 8aae775 Apr 1, 2026
24 checks passed

majdyz deleted the fix/copilot-duplicate-block-execution branch April 1, 2026 13:56

github-project-automation Bot moved this from 👍🏼 Mergeable to ✅ Done in AutoGPT development kanban Apr 1, 2026

majdyz mentioned this pull request Apr 3, 2026

fix(copilot): prevent duplicate side effects from double-submit and stale-cache race #12660

Merged

4 tasks

coderabbitai Bot mentioned this pull request Apr 17, 2026

refactor(backend/copilot): unified queue-backed copilot turns + async sub-AutoPilot + guide-read gate #12841

Merged

8 tasks
