fix(backend/copilot): prevent duplicate block execution from pre-launch arg mismatch #12632

Merged
majdyz merged 11 commits into dev from fix/copilot-duplicate-block-execution on Apr 1, 2026

@majdyz
Contributor

@majdyz majdyz commented Mar 31, 2026

Why

CoPilot sessions are duplicating Linear tickets and GitHub PRs. Investigation of 5 production sessions (March 31st) found that 3/5 created duplicate Linear issues — each with consecutive IDs at the exact same timestamp, but only one visible in Langfuse traces.

Production gcloud logs confirm: 279 arg mismatch warnings per day, 37 duplicate block execution pairs, and all LinearCreateIssueBlock failures in pairs.

Related: SECRT-2204

What

Replace the speculative pre-launch mechanism with the SDK's native parallel dispatch via readOnlyHint tool annotations. Remove ~580 lines of pre-launch infrastructure code.

How

Root cause

The pre-launch mechanism had three compounding bugs:

  1. Arg mismatch: The SDK CLI normalises args between the AssistantMessage (used for pre-launch) and the MCP tools/call dispatch, causing frequent mismatches (279/day in prod)
  2. FIFO desync on denial: Security hooks can deny tool calls, causing the CLI to skip the MCP dispatch — but the pre-launched task stays in the FIFO queue, misaligning all subsequent matches
  3. Cancel race: task.cancel() is best-effort in asyncio — if the HTTP call to Linear/GitHub already completed, the side effect is irreversible
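
The cancel race (item 3) can be reproduced in a few lines. This is a minimal sketch, not the removed pre-launch code: `create_ticket` and the `created` list are hypothetical stand-ins for a block that writes to Linear or GitHub, and the sleep durations are arbitrary.

```python
import asyncio

# Stands in for tickets that already exist on the remote service (hypothetical).
created: list[str] = []


async def create_ticket(title: str) -> str:
    """Hypothetical side-effecting call: the remote write lands first."""
    created.append(title)      # the side effect is committed here
    await asyncio.sleep(0.05)  # e.g. waiting for the rest of the response
    return f"created {title}"


async def main() -> bool:
    task = asyncio.create_task(create_ticket("Fix login bug"))
    await asyncio.sleep(0.01)  # let the task run past its side effect
    task.cancel()              # cancellation is only a request
    try:
        await task
    except asyncio.CancelledError:
        pass
    return task.cancelled()


was_cancelled = asyncio.run(main())
# The task reports as cancelled, yet the ticket it created still exists.
print(was_cancelled, created)
```

Cancelling the task stops the coroutine at its next await point, but it cannot undo work that was already sent, which is why a cancelled pre-launched task could still leave a ticket behind.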

Fix

  • Removed pre_launch_tool_call(), cancel_pending_tool_tasks(), _tool_task_queues ContextVar, all FIFO queue logic, and all 4 cancel_pending_tool_tasks() calls in service.py
  • Added readOnlyHint=True annotations on 15+ read-only tools (find_block, search_docs, list_workspace_files, etc.) — the SDK CLI natively dispatches these in parallel (ref: anthropics/claude-code#14353)
  • Side-effect tools (run_block, bash_exec, create_agent, etc.) have no annotation → CLI runs them sequentially → no duplicate execution risk

Net change: -578 lines, +105 lines
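
As a rough illustration of the dispatch behaviour the fix relies on, the toy scheduler below runs consecutive read-only calls concurrently and everything else one at a time. It is a sketch under stated assumptions: `ToolCall`, `dispatch`, and `make` are hypothetical names, and this is not the SDK CLI's actual scheduler.

```python
import asyncio
from dataclasses import dataclass
from typing import Awaitable, Callable


@dataclass(frozen=True)
class ToolCall:
    """Hypothetical description of one tool call."""
    name: str
    read_only: bool
    run: Callable[[], Awaitable[str]]


async def dispatch(calls: list[ToolCall]) -> list[str]:
    """Run consecutive read-only calls concurrently, all others sequentially."""
    results: list[str] = []
    batch: list[ToolCall] = []

    async def flush() -> None:
        # Run the pending read-only batch concurrently, preserving result order.
        if batch:
            results.extend(await asyncio.gather(*(c.run() for c in batch)))
            batch.clear()

    for call in calls:
        if call.read_only:
            batch.append(call)
        else:
            await flush()                     # finish pending reads first
            results.append(await call.run())  # side-effect tools run alone
    await flush()
    return results


log: list[str] = []


def make(name: str, read_only: bool) -> ToolCall:
    async def run() -> str:
        log.append(f"start {name}")
        await asyncio.sleep(0.01)
        log.append(f"end {name}")
        return name

    return ToolCall(name, read_only, run)


calls = [make("find_block", True), make("search_docs", True), make("run_block", False)]
results = asyncio.run(dispatch(calls))
print(results)
print(log)
```

In this model the two read-only calls start before either finishes, while `run_block` starts only after both have completed, so a side-effect tool is never executed speculatively or twice.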

…ch arg mismatch

The pre-launch mechanism speculatively starts tool execution when an
AssistantMessage arrives, then matches results to SDK MCP dispatches
via FIFO queue. A strict arg equality check discarded pre-launched
results when the SDK CLI normalised args (e.g. injecting schema
defaults), causing duplicate execution of blocks with side effects
like LinearCreateIssueBlock and GithubCreatePullRequestBlock.

Trust FIFO ordering instead — the SDK dispatches MCP tool calls in the
same order as ToolUseBlocks in the AssistantMessage. Log at debug level
when args differ for observability.
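
The failure mode of the strict equality check described in this commit message can be sketched as follows. The `inject_defaults` helper, the `dry_run` default, and the argument values are assumptions made up for illustration; the source only says that the CLI normalises args, for example by injecting schema defaults.

```python
def inject_defaults(args: dict, schema_defaults: dict) -> dict:
    """Hypothetical normaliser: fill in missing keys from schema defaults."""
    return {**schema_defaults, **args}


executions: list[dict] = []


def execute(args: dict) -> None:
    """Stand-in for running a block with side effects."""
    executions.append(args)


# Args as they appear in the AssistantMessage (used for the pre-launch).
pre_launch_args = {"block_id": "linear-create-issue", "input_data": {"title": "Bug"}}
# Args as they arrive at dispatch time, after normalisation (assumed default).
dispatched_args = inject_defaults(pre_launch_args, {"dry_run": False})

execute(pre_launch_args)                # speculative pre-launch
strict_match = pre_launch_args == dispatched_args
if not strict_match:
    execute(dispatched_args)            # pre-launched result discarded, block runs again

print(strict_match, len(executions))
```

Both calls target the same block with the same user-supplied input, but the extra default key makes the dicts unequal, so the block runs twice.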
@majdyz majdyz requested a review from a team as a code owner March 31, 2026 16:37
@majdyz majdyz requested review from 0ubbe and ntindle and removed request for a team March 31, 2026 16:37
@github-project-automation github-project-automation Bot moved this to 🆕 Needs initial review in AutoGPT development kanban Mar 31, 2026
@github-actions github-actions Bot added platform/backend AutoGPT Platform - Back end size/l labels Mar 31, 2026
Comment thread autogpt_platform/backend/backend/copilot/sdk/tool_adapter.py Outdated
@coderabbitai
Contributor

coderabbitai Bot commented Mar 31, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Walkthrough

Removed speculative per-session tool pre-launch/queueing; the tool handler now always invokes the synchronous execution path. Added read-only annotations (ToolAnnotations(readOnlyHint=True) / BaseTool.read_only) for selected tools and the SDK Read tool; the service no longer cancels previously pre-launched tasks and no longer pre-launches per tool.

Changes

| Cohort / File(s) | Summary |
|---|---|
| Tool adapter logic: autogpt_platform/backend/backend/copilot/sdk/tool_adapter.py | Removed ContextVar _tool_task_queues, set_execution_context, pre_launch_tool_call, cancel_pending_tool_tasks. create_tool_handler().tool_handler no longer dequeues/validates pre-launched tasks and always calls _execute_tool_sync(...). Added _READONLY_ANNOTATION and _READ_ONLY_E2B_TOOLS and register readOnly annotations for applicable tools (including SDK Read and E2B tools). |
| Service coordination: autogpt_platform/backend/backend/copilot/sdk/service.py | Removed imports and calls to cancel_pending_tool_tasks(); stopped per-tool pre-launching on incoming AssistantMessage. Retained is_tool_only computation for flush/stash-wait logic; updated comments to reflect delegation of parallel-read behavior to SDK annotations. |
| Tool base and tools: autogpt_platform/backend/backend/copilot/tools/base.py, .../tools/* | Added BaseTool.read_only property (defaults to False). Many tools (browser screenshot, agent output, search/docs, find_, get_, web_fetch, workspace readers, etc.) now expose read_only properties returning True to mark them non-mutating. |
| Tests: autogpt_platform/backend/backend/copilot/sdk/tool_adapter_test.py | Removed tests covering pre-launch/parallel-dispatch and related helpers; narrowed tests to direct execution semantics (single run, exception → MCP error, missing session). Added assertions verifying read-only classification and _READ_ONLY_E2B_TOOLS contents. |

Sequence Diagram(s)

sequenceDiagram
    participant Client
    participant Service
    participant ToolAdapter as CopilotAdapter
    participant Executor as ToolExecutor

    Client->>Service: send assistant message / tool request
    Service->>ToolAdapter: route to tool handler
    ToolAdapter->>ToolAdapter: determine tool and annotations (readOnlyHint)
    ToolAdapter->>Executor: call _execute_tool_sync(...)
    Executor-->>ToolAdapter: return result / raise exception
    ToolAdapter-->>Service: return MCP result or _mcp_error
    Service-->>Client: deliver response

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

Suggested reviewers

  • 0ubbe
  • ntindle
  • Pwuts
  • Swiftyos

Poem

🐰 I shelved the hopeful pre-launch race,
Tasks now meet their moment, face to face,
Read-only tools wear a gentle sign,
Ready for parallel, but by design,
I hop along, content with steadier pace.

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 35.71% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
|---|---|---|
| Title check | ✅ Passed | The title accurately describes the main change: removing the pre-launch mechanism to prevent duplicate block execution caused by arg mismatch, which aligns with the core purpose of the PR. |
| Description check | ✅ Passed | The description clearly explains the why (duplicate production issues), what (replace pre-launch with readOnlyHint annotations), and how (root causes and fixes), directly corresponding to the changeset. |

Comment thread autogpt_platform/backend/backend/copilot/sdk/tool_adapter.py Outdated
@majdyz
Contributor Author

majdyz commented Mar 31, 2026

🤖 🔵 Nit: The pre_launch_tool_call docstring (lines 282-288) still references the (task, args) tuple pattern and ordering mismatch detection. Should be updated to reflect the simpler task-only queue — the handler no longer checks args ordering.

Address self-review: remove the dead arg comparison and simplify the
queue from (task, args) tuples to plain tasks.  The handler trusts
FIFO ordering unconditionally — no args are stored or compared.
Update docstring and type alias accordingly.
Contributor

@coderabbitai coderabbitai Bot left a comment

🧹 Nitpick comments (1)
autogpt_platform/backend/backend/copilot/sdk/tool_adapter_test.py (1)

522-544: Make this test prove which args actually executed.

With a constant mock result, this still passes if the handler skips the queued task and performs one direct call with "normalised" args. Make the return value depend on block_id, or assert mock_tool.execute.await_args.kwargs["block_id"] == "original", so the test really locks in the pre-launched path.

💡 One way to pin the pre-launched call
-        mock_tool = _make_mock_tool("run_block", output="pre-launched-result")
+        async def execute_with_block_id(*_args, **kwargs):
+            block_id = kwargs["block_id"]
+            return StreamToolOutputAvailable(
+                toolCallId="test-id",
+                output=f"result-for-{block_id}",
+                toolName="run_block",
+                success=True,
+            )
+
+        mock_tool = _make_mock_tool("run_block")
+        mock_tool.execute = AsyncMock(side_effect=execute_with_block_id)
@@
-        assert "pre-launched-result" in result["content"][0]["text"]
+        assert "result-for-original" in result["content"][0]["text"]
+        assert mock_tool.execute.await_args.kwargs["block_id"] == "original"
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@autogpt_platform/backend/backend/copilot/sdk/tool_adapter_test.py` around
lines 522 - 544, Update the test_arg_mismatch_uses_prelaunched_result test so it
proves the pre-launched args were actually used: make the mock_tool's return
depend on the incoming block_id (so "original" returns "pre-launched-result" and
"normalised" would return something else) or add an assertion that inspects the
executed call on mock_tool (e.g., check
mock_tool.execute.await_args.kwargs["block_id"] == "original") after running
handler; keep references to the existing helpers (pre_launch_tool_call and
create_tool_handler) and ensure the final assertions still verify only one
execution occurred (mock_tool.execute.await_count == 1) and the observed result
matches the "original" block_id outcome.
ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 58649ac0-1d79-41f1-8bcd-c95ab83b513b

📥 Commits

Reviewing files that changed from the base of the PR and between 57b17dc and 959c539.

📒 Files selected for processing (2)
  • autogpt_platform/backend/backend/copilot/sdk/tool_adapter.py
  • autogpt_platform/backend/backend/copilot/sdk/tool_adapter_test.py
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (10)
  • GitHub Check: check API types
  • GitHub Check: Seer Code Review
  • GitHub Check: type-check (3.13)
  • GitHub Check: test (3.11)
  • GitHub Check: test (3.13)
  • GitHub Check: test (3.12)
  • GitHub Check: conflicts
  • GitHub Check: end-to-end tests
  • GitHub Check: Analyze (python)
  • GitHub Check: Check PR Status
🧰 Additional context used
📓 Path-based instructions (4)
autogpt_platform/backend/**/*.py

📄 CodeRabbit inference engine (.github/copilot-instructions.md)

autogpt_platform/backend/**/*.py: Use Python 3.11 (required; managed by Poetry via pyproject.toml) for backend development
Always run 'poetry run format' (Black + isort) before linting in backend development
Always run 'poetry run lint' (ruff) after formatting in backend development

Refer to @backend/CLAUDE.md for backend-specific commands, architecture, and development tasks

autogpt_platform/backend/**/*.py: Import only at the top level; no local/inner imports except for lazy imports of heavy optional dependencies like openpyxl
Use absolute imports with from backend.module import ... for cross-package imports; single-dot relative imports (from .sibling import ...) are acceptable for sibling modules within the same package; avoid double-dot relative imports (from ..parent import ...)
Do not use duck typing with hasattr(), getattr(), or isinstance() for type dispatch; use typed interfaces, unions, or protocols instead
Use Pydantic models for structured data instead of dataclasses, namedtuples, or dicts
Do not use linter suppressors; no # type: ignore, # noqa, or # pyright: ignore comments — fix the underlying type/code issue instead
Use list comprehensions instead of manual loop-and-append patterns
Use early return guard clauses to avoid deep nesting
Use %s for deferred interpolation in debug log statements; use f-strings for readability in other log levels (e.g., logger.debug("Processing %s items", count), logger.info(f"Processing {count} items"))
Sanitize error paths using os.path.basename() in error messages to avoid leaking directory structure
Avoid TOCTOU (time-of-check-time-of-use) patterns; do not use check-then-act patterns for file access and credit charging operations
Use Redis pipelines with transaction=True for atomicity on multi-step Redis operations
Use max(0, value) guards for computed values that should never be negative
Keep files under ~300 lines; if a file grows beyond this, split by responsibility (extract h...

Files:

  • autogpt_platform/backend/backend/copilot/sdk/tool_adapter.py
  • autogpt_platform/backend/backend/copilot/sdk/tool_adapter_test.py
autogpt_platform/**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

Format Python code with poetry run format

Files:

  • autogpt_platform/backend/backend/copilot/sdk/tool_adapter.py
  • autogpt_platform/backend/backend/copilot/sdk/tool_adapter_test.py
autogpt_platform/backend/**/*test*.py

📄 CodeRabbit inference engine (AGENTS.md)

Run poetry run test for backend testing (runs pytest with docker based postgres + prisma)

Files:

  • autogpt_platform/backend/backend/copilot/sdk/tool_adapter_test.py
autogpt_platform/backend/**/*_test.py

📄 CodeRabbit inference engine (autogpt_platform/backend/CLAUDE.md)

autogpt_platform/backend/**/*_test.py: Use pytest with snapshot testing for API responses; test files should be colocated with source files using the *_test.py naming pattern
Mock at boundaries by mocking where the symbol is used, not where it is defined. After refactoring, update mock targets to match new module paths
Use AsyncMock from unittest.mock for mocking async functions

Files:

  • autogpt_platform/backend/backend/copilot/sdk/tool_adapter_test.py
🧠 Learnings (8)
📓 Common learnings
Learnt from: majdyz
Repo: Significant-Gravitas/AutoGPT PR: 12356
File: autogpt_platform/backend/backend/copilot/constants.py:9-12
Timestamp: 2026-03-10T08:39:22.025Z
Learning: In Significant-Gravitas/AutoGPT PR `#12356`, the `COPILOT_SYNTHETIC_ID_PREFIX = "copilot-"` check in `create_auto_approval_record` (human_review.py) is intentional and safe. The `graph_exec_id` passed to this function comes from server-side `PendingHumanReview` DB records (not from user input); the API only accepts `node_exec_id` from users. Synthetic `copilot-*` IDs are only ever created server-side in `run_block.py`. The prefix skip avoids a DB lookup for a `AgentGraphExecution` record that legitimately does not exist for CoPilot sessions, while `user_id` scoping is enforced at the auth layer and on the resulting auto-approval record.
Learnt from: majdyz
Repo: Significant-Gravitas/AutoGPT PR: 12439
File: autogpt_platform/backend/backend/blocks/autogpt_copilot.py:0-0
Timestamp: 2026-03-16T17:00:02.827Z
Learning: In autogpt_platform/backend/backend/blocks/autogpt_copilot.py, the recursion guard uses two module-level ContextVars: `_copilot_recursion_depth` (tracks current nesting depth) and `_copilot_recursion_limit` (stores the chain-wide ceiling). On the first invocation, `_copilot_recursion_limit` is set to `max_recursion_depth`; nested calls use `min(inherited_limit, max_recursion_depth)`, so they can only lower the cap, never raise it. The entry/exit logic is extracted into module-level helper functions. This is the approved pattern for preventing runaway sub-agent recursion in AutogptCopilotBlock (PR `#12439`, commits 348e9f8e2 and 3b70f61b1).
Learnt from: majdyz
Repo: Significant-Gravitas/AutoGPT PR: 12440
File: autogpt_platform/backend/backend/copilot/workflow_import/converter.py:0-0
Timestamp: 2026-03-17T10:57:12.953Z
Learning: In Significant-Gravitas/AutoGPT PR `#12440`, `autogpt_platform/backend/backend/copilot/workflow_import/converter.py` was fully rewritten (commit 732960e2d) to no longer make direct LLM/OpenAI API calls. The converter now builds a structured text prompt for AutoPilot/CoPilot instead. There is no `response.choices` access or any direct LLM client usage in this file. Do not flag `response.choices` access or LLM client initialization patterns as issues in this file.
Learnt from: majdyz
Repo: Significant-Gravitas/AutoGPT PR: 12356
File: autogpt_platform/backend/backend/copilot/tools/run_block.py:297-300
Timestamp: 2026-03-10T08:38:33.249Z
Learning: In autogpt_platform/backend/backend/copilot/tools/run_block.py, the auto-approval key for sensitive block HITL review uses graph_exec_id (copilot-session-{session_id}) + node_id (copilot-node-{block_id}). This is intentional: approving a block type within a CoPilot session auto-approves all future invocations of that same block type within the same session, mirroring how auto-approve works in normal graph execution. The user explicitly opts into this session-scoped behavior via an auto-approve toggle. Without the toggle (default), each individual invocation requires its own approval.
Learnt from: majdyz
Repo: Significant-Gravitas/AutoGPT PR: 12213
File: autogpt_platform/frontend/src/app/api/openapi.json:9983-9995
Timestamp: 2026-02-27T15:59:00.370Z
Learning: Repo: Significant-Gravitas/AutoGPT PR: 12213 — OpenAPI/codegen
Learning: Ensuring a field is required in generated TS types needs two sides: (1) no default value on the Pydantic field, and (2) the OpenAPI model's "required" array must list it. For MCPToolInfo, making input_schema required in OpenAPI and removing Field(default_factory=dict) in the backend prevents optional typing drift.
Learnt from: Pwuts
Repo: Significant-Gravitas/AutoGPT PR: 12284
File: autogpt_platform/frontend/src/app/api/openapi.json:11897-11900
Timestamp: 2026-03-04T23:58:18.476Z
Learning: Repo: Significant-Gravitas/AutoGPT — PR `#12284`
Backend/frontend OpenAPI codegen convention: In backend/api/features/store/model.py, the StoreSubmission and StoreSubmissionAdminView models define submitted_at: datetime | None, changes_summary: str | None, and instructions: str | None with no default. This is intentional to produce “required but nullable” fields in OpenAPI (properties appear in required[] and use anyOf [type, null]). This matches Prisma’s submittedAt DateTime? and changesSummary String?. Do not flag this as a required/nullable mismatch.
Learnt from: majdyz
Repo: Significant-Gravitas/AutoGPT PR: 12213
File: autogpt_platform/frontend/src/app/api/openapi.json:9983-9995
Timestamp: 2026-02-27T15:59:00.370Z
Learning: Repo: Significant-Gravitas/AutoGPT PR: 12213 — Backend/frontend OpenAPI codegen
Learning: For MCP schema models, required OpenAPI fields must have no defaults in Pydantic. Specifically, MCPToolInfo.input_schema must be required (no Field(default_factory=dict)) so openapi.json emits it in "required", ensuring generated TS types treat input_schema as non-optional.
Learnt from: majdyz
Repo: Significant-Gravitas/AutoGPT PR: 12356
File: autogpt_platform/backend/backend/copilot/tools/run_block.py:349-370
Timestamp: 2026-03-10T08:38:36.655Z
Learning: In the AutoGPT CoPilot HITL (Human-In-The-Loop) flow (`autogpt_platform/backend/backend/copilot/tools/run_block.py`), the review card presented to users sets `editable: false`, meaning reviewers cannot modify the input payload. Therefore, credentials resolved before `is_block_exec_need_review()` remain valid and do not need to be recomputed after the review step — the original `input_data` is unchanged through the review lifecycle.
📚 Learning: 2026-03-17T10:57:12.953Z
Learnt from: majdyz
Repo: Significant-Gravitas/AutoGPT PR: 12440
File: autogpt_platform/backend/backend/copilot/workflow_import/converter.py:0-0
Timestamp: 2026-03-17T10:57:12.953Z
Learning: In Significant-Gravitas/AutoGPT PR `#12440`, `autogpt_platform/backend/backend/copilot/workflow_import/converter.py` was fully rewritten (commit 732960e2d) to no longer make direct LLM/OpenAI API calls. The converter now builds a structured text prompt for AutoPilot/CoPilot instead. There is no `response.choices` access or any direct LLM client usage in this file. Do not flag `response.choices` access or LLM client initialization patterns as issues in this file.

Applied to files:

  • autogpt_platform/backend/backend/copilot/sdk/tool_adapter.py
  • autogpt_platform/backend/backend/copilot/sdk/tool_adapter_test.py
📚 Learning: 2026-02-26T17:02:22.448Z
Learnt from: Pwuts
Repo: Significant-Gravitas/AutoGPT PR: 12211
File: .pre-commit-config.yaml:160-179
Timestamp: 2026-02-26T17:02:22.448Z
Learning: Keep the pre-commit hook pattern broad for autogpt_platform/backend to ensure OpenAPI schema changes are captured. Do not narrow to backend/api/ alone, since the generated schema depends on Pydantic models across multiple directories (backend/data/, backend/blocks/, backend/copilot/, backend/integrations/, backend/util/). Narrowing could miss schema changes and cause frontend type desynchronization.

Applied to files:

  • autogpt_platform/backend/backend/copilot/sdk/tool_adapter.py
  • autogpt_platform/backend/backend/copilot/sdk/tool_adapter_test.py
📚 Learning: 2026-03-04T08:04:35.881Z
Learnt from: majdyz
Repo: Significant-Gravitas/AutoGPT PR: 12273
File: autogpt_platform/backend/backend/copilot/tools/workspace_files.py:216-220
Timestamp: 2026-03-04T08:04:35.881Z
Learning: In the AutoGPT Copilot backend, ensure that SVG images are not treated as vision image types by excluding 'image/svg+xml' from INLINEABLE_MIME_TYPES and MULTIMODAL_TYPES in tool_adapter.py; the Claude API supports PNG, JPEG, GIF, and WebP for vision. SVGs (XML text) should be handled via the text path instead, not the vision path.

Applied to files:

  • autogpt_platform/backend/backend/copilot/sdk/tool_adapter.py
  • autogpt_platform/backend/backend/copilot/sdk/tool_adapter_test.py
📚 Learning: 2026-03-05T15:42:08.207Z
Learnt from: ntindle
Repo: Significant-Gravitas/AutoGPT PR: 12297
File: .claude/skills/backend-check/SKILL.md:14-16
Timestamp: 2026-03-05T15:42:08.207Z
Learning: In Python files under autogpt_platform/backend (recursively), rely on poetry run format to perform formatting (Black + isort) and linting (ruff). Do not run poetry run lint as a separate step after poetry run format, since format already includes linting checks.

Applied to files:

  • autogpt_platform/backend/backend/copilot/sdk/tool_adapter.py
  • autogpt_platform/backend/backend/copilot/sdk/tool_adapter_test.py
📚 Learning: 2026-03-16T16:35:40.236Z
Learnt from: majdyz
Repo: Significant-Gravitas/AutoGPT PR: 12440
File: autogpt_platform/backend/backend/api/features/workflow_import.py:54-63
Timestamp: 2026-03-16T16:35:40.236Z
Learning: Avoid using the word 'competitor' in public-facing identifiers and text. Use neutral naming for API paths, model names, function names, and UI text. Examples: rename 'CompetitorFormat' to 'SourcePlatform', 'convert_competitor_workflow' to 'convert_workflow', '/competitor-workflow' to '/workflow'. Apply this guideline to files under autogpt_platform/backend and autogpt_platform/frontend.

Applied to files:

  • autogpt_platform/backend/backend/copilot/sdk/tool_adapter.py
  • autogpt_platform/backend/backend/copilot/sdk/tool_adapter_test.py
📚 Learning: 2026-03-31T15:37:38.626Z
Learnt from: majdyz
Repo: Significant-Gravitas/AutoGPT PR: 12623
File: autogpt_platform/backend/backend/copilot/tools/agent_generator/fixer.py:37-47
Timestamp: 2026-03-31T15:37:38.626Z
Learning: When validating/constructing Anthropic API model IDs in Significant-Gravitas/AutoGPT, allow the hyphen-separated Claude Opus 4.6 model ID `claude-opus-4-6` (it corresponds to `LlmModel.CLAUDE_4_6_OPUS` in `autogpt_platform/backend/backend/blocks/llm.py`). Do NOT require the dot-separated form in Anthropic contexts. Only OpenRouter routing variants should use the dot separator (e.g., `anthropic/claude-opus-4.6`); `claude-opus-4-6` should be treated as correct when passed to Anthropic, and flagged only if it’s used in the OpenRouter path where the dot form is expected.

Applied to files:

  • autogpt_platform/backend/backend/copilot/sdk/tool_adapter.py
  • autogpt_platform/backend/backend/copilot/sdk/tool_adapter_test.py
📚 Learning: 2026-03-25T06:59:27.340Z
Learnt from: CR
Repo: Significant-Gravitas/AutoGPT PR: 0
File: autogpt_platform/backend/CLAUDE.md:0-0
Timestamp: 2026-03-25T06:59:27.340Z
Learning: Applies to autogpt_platform/backend/**/*_test.py : Mock at boundaries by mocking where the symbol is used, not where it is defined. After refactoring, update mock targets to match new module paths

Applied to files:

  • autogpt_platform/backend/backend/copilot/sdk/tool_adapter_test.py
🔇 Additional comments (1)
autogpt_platform/backend/backend/copilot/sdk/tool_adapter.py (1)

467-488: Drop the debug-only arg deep-compare.

Now that mismatches no longer change control flow, launch_args != args only keeps the old cost of retaining and deep-comparing the whole payload. Logging that a queued task was consumed is enough here, and it would let _TaskQueueItem stop carrying large run_block arg dicts around.

…annotations

Remove the speculative pre-launch mechanism that caused duplicate block
execution (e.g. duplicate Linear tickets, GitHub PRs) due to:
- Arg mismatch between AssistantMessage and SDK MCP dispatch
- Broken FIFO assumption when security hooks deny a tool
- Race condition where task.cancel() arrives after API call completes

Instead, use the SDK's native parallel dispatch: annotate read-only
tools with ToolAnnotations(readOnlyHint=True) so the CLI dispatches
them concurrently.  Side-effect tools (run_block, bash_exec, etc.)
run sequentially — correct and safe.

Removed:
- pre_launch_tool_call(), cancel_pending_tool_tasks()
- _tool_task_queues ContextVar and all queue logic
- Pre-launch calls in service.py streaming loop
- All pre-launch tests (replaced with annotation tests)

Added:
- _READ_ONLY_TOOLS / _READ_ONLY_E2B_TOOLS sets
- ToolAnnotations(readOnlyHint=True) on 15+ read-only tools
- Tests for annotation classification
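
The broken FIFO assumption listed in this commit message can be shown with a simplified model. This is a sketch under assumptions: it uses a single queue and labelled calls (`run_block#1`, etc.) that are hypothetical, whereas the removed code kept its queues in the `_tool_task_queues` ContextVar, whose exact shape is not shown here.

```python
from collections import deque


def simulate(tool_uses: list[str], denied: set[str]) -> dict[str, str]:
    """Return which pre-launched result each dispatched call would receive."""
    # Every tool use in the AssistantMessage is pre-launched, in order.
    queue = deque(f"result-of-{call}" for call in tool_uses)
    received: dict[str, str] = {}
    for call in tool_uses:
        if call in denied:
            # A hook denied the call: no dispatch happens, but the
            # pre-launched entry for it is never popped from the queue.
            continue
        received[call] = queue.popleft()
    return received


outcome = simulate(
    ["run_block#1", "run_block#2", "run_block#3"],
    denied={"run_block#1"},
)
print(outcome)
```

After one denial, every later call is matched with the result of the call before it, and the last pre-launched task is left in the queue with its side effect already under way.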
Comment thread autogpt_platform/backend/backend/copilot/sdk/tool_adapter.py Outdated
@majdyz
Contributor Author

majdyz commented Mar 31, 2026

🤖 🔵 Nit (Round 2): The _READ_ONLY_E2B_TOOLS set remains hardcoded since E2B file tools don't use BaseTool. Add a comment explaining this asymmetry so future maintainers don't wonder why E2B tools aren't derived like registry tools.

@majdyz
Contributor Author

majdyz commented Mar 31, 2026

🤖 🔵 Nit (Round 3): The create_copilot_mcp_server docstring should mention that readOnlyHint is now derived from BaseTool.read_only (not a hardcoded set), so developers know to set the property on new tools rather than editing a separate list.

…y property

Address review: instead of a hardcoded _READ_ONLY_TOOLS frozenset,
add a read_only property to BaseTool (default False) and override it
to True on 15 read-only tool classes.  create_copilot_mcp_server
now derives annotations from base_tool.read_only so new tools only
need to set the property, not edit a separate list.
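
The property-based approach described in this commit message might look roughly like the sketch below. It is illustrative only: the `ToolAnnotations` dataclass is a dependency-free stand-in for the MCP type of the same name, and `FindBlockTool`, `RunBlockTool`, and `annotations_for` are hypothetical names rather than the repository's actual classes.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ToolAnnotations:
    """Stand-in for the MCP ToolAnnotations type (assumption, not the real class)."""
    readOnlyHint: Optional[bool] = None


class BaseTool:
    name: str = "base"

    @property
    def read_only(self) -> bool:
        """Tools are treated as side-effecting unless they opt in."""
        return False


class FindBlockTool(BaseTool):
    name = "find_block"

    @property
    def read_only(self) -> bool:
        return True  # only reads data, so it is safe to run in parallel


class RunBlockTool(BaseTool):
    name = "run_block"  # inherits read_only = False


def annotations_for(tool: BaseTool) -> Optional[ToolAnnotations]:
    """Derive the annotation from the tool itself instead of a separate list."""
    return ToolAnnotations(readOnlyHint=True) if tool.read_only else None


registry = {tool.name: tool for tool in (FindBlockTool(), RunBlockTool())}
annotations = {name: annotations_for(tool) for name, tool in registry.items()}
print(annotations)
```

Defaulting to False means a new tool that forgets to declare itself is treated as side-effecting, which errs on the side of sequential execution.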
Contributor

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 2

🧹 Nitpick comments (1)
autogpt_platform/backend/backend/copilot/sdk/tool_adapter_test.py (1)

358-376: Add regression assertions for write-capable tools in read-only classification tests.

Given the new readOnlyHint path, this suite should also assert that write-capable tools (notably read_workspace_file with save_to_path, and browser_screenshot) are not treated as read-only.

Possible test extension
     def test_side_effect_tools_are_not_read_only(self):
         """Key side-effect tools should have read_only=False."""
         from backend.copilot.tools import TOOL_REGISTRY

-        for name in ["run_block", "bash_exec", "create_agent", "run_agent"]:
+        for name in [
+            "run_block",
+            "bash_exec",
+            "create_agent",
+            "run_agent",
+            "read_workspace_file",
+            "browser_screenshot",
+        ]:
             if name in TOOL_REGISTRY:
                 assert not TOOL_REGISTRY[
                     name
                 ].read_only, f"{name} should not be read_only"
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@autogpt_platform/backend/backend/copilot/sdk/tool_adapter_test.py` around
lines 358 - 376, Add assertions to the read-only classification test to ensure
write-capable tools are not misclassified: update
test_read_only_e2b_tools_classification (or the surrounding test block
referencing _READ_ONLY_E2B_TOOLS and TOOL_REGISTRY) to assert that
"read_workspace_file" (when used with save_to_path capability) and
"browser_screenshot" are NOT in _READ_ONLY_E2B_TOOLS and, if present in
TOOL_REGISTRY, that their ToolAnnotations/readOnlyHint or tool.read_only is
False; use the existing symbols _READ_ONLY_E2B_TOOLS, TOOL_REGISTRY,
ToolAnnotations and the tool names "read_workspace_file" and
"browser_screenshot" to locate and add these negative assertions.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@autogpt_platform/backend/backend/copilot/tools/agent_browser.py`:
- Around line 768-770: The BrowserScreenshotTool is marked read_only via the
read_only property returning True while it performs a side-effect (writes a file
using WriteWorkspaceFileTool); update the read_only property on the
BrowserScreenshotTool class to return False (or remove the read_only override so
it inherits a non-read-only default) so the tool is not classified as read-only,
and verify any callers/registrations relying on read-only behavior are adjusted
accordingly; locate the read_only property in BrowserScreenshotTool and change
its return value and/or documentation to reflect that it mutates workspace state
via WriteWorkspaceFileTool.

In `@autogpt_platform/backend/backend/copilot/tools/workspace_files.py`:
- Around line 566-568: The ReadWorkspaceFileTool is incorrectly marked read-only
which allows its mutating method _save_to_path (invoked via save_to_path) to run
in parallel; change the read_only property on the ReadWorkspaceFileTool class so
it returns False (i.e., clear that this tool is not read-only) to prevent
parallel execution of its mutating operation and ensure callers treat it as a
writer rather than a read-only task.


ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 247a39ef-eb95-486b-9603-01d0bac1af58

📥 Commits

Reviewing files that changed from the base of the PR and between 4cb138f and 23cdf29.

📒 Files selected for processing (17)
  • autogpt_platform/backend/backend/copilot/sdk/tool_adapter.py
  • autogpt_platform/backend/backend/copilot/sdk/tool_adapter_test.py
  • autogpt_platform/backend/backend/copilot/tools/agent_browser.py
  • autogpt_platform/backend/backend/copilot/tools/agent_output.py
  • autogpt_platform/backend/backend/copilot/tools/base.py
  • autogpt_platform/backend/backend/copilot/tools/feature_requests.py
  • autogpt_platform/backend/backend/copilot/tools/find_agent.py
  • autogpt_platform/backend/backend/copilot/tools/find_block.py
  • autogpt_platform/backend/backend/copilot/tools/find_library_agent.py
  • autogpt_platform/backend/backend/copilot/tools/get_agent_building_guide.py
  • autogpt_platform/backend/backend/copilot/tools/get_doc_page.py
  • autogpt_platform/backend/backend/copilot/tools/get_mcp_guide.py
  • autogpt_platform/backend/backend/copilot/tools/manage_folders.py
  • autogpt_platform/backend/backend/copilot/tools/search_docs.py
  • autogpt_platform/backend/backend/copilot/tools/validate_agent.py
  • autogpt_platform/backend/backend/copilot/tools/web_fetch.py
  • autogpt_platform/backend/backend/copilot/tools/workspace_files.py
✅ Files skipped from review due to trivial changes (2)
  • autogpt_platform/backend/backend/copilot/tools/validate_agent.py
  • autogpt_platform/backend/backend/copilot/tools/web_fetch.py
🚧 Files skipped from review as they are similar to previous changes (1)
  • autogpt_platform/backend/backend/copilot/sdk/tool_adapter.py
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (11)
  • GitHub Check: check API types
  • GitHub Check: end-to-end tests
  • GitHub Check: Analyze (typescript)
  • GitHub Check: Analyze (python)
  • GitHub Check: test (3.13)
  • GitHub Check: test (3.11)
  • GitHub Check: type-check (3.13)
  • GitHub Check: test (3.12)
  • GitHub Check: type-check (3.11)
  • GitHub Check: Check PR Status
  • GitHub Check: Seer Code Review
🧰 Additional context used
📓 Path-based instructions (4)
autogpt_platform/backend/**/*.py

📄 CodeRabbit inference engine (.github/copilot-instructions.md)

autogpt_platform/backend/**/*.py: Use Python 3.11 (required; managed by Poetry via pyproject.toml) for backend development
Always run 'poetry run format' (Black + isort) before linting in backend development
Always run 'poetry run lint' (ruff) after formatting in backend development

Refer to @backend/CLAUDE.md for backend-specific commands, architecture, and development tasks

autogpt_platform/backend/**/*.py: Import only at the top level; no local/inner imports except for lazy imports of heavy optional dependencies like openpyxl
Use absolute imports with from backend.module import ... for cross-package imports; single-dot relative imports (from .sibling import ...) are acceptable for sibling modules within the same package; avoid double-dot relative imports (from ..parent import ...)
Do not use duck typing with hasattr(), getattr(), or isinstance() for type dispatch; use typed interfaces, unions, or protocols instead
Use Pydantic models for structured data instead of dataclasses, namedtuples, or dicts
Do not use linter suppressors; no # type: ignore, # noqa, or # pyright: ignore comments — fix the underlying type/code issue instead
Use list comprehensions instead of manual loop-and-append patterns
Use early return guard clauses to avoid deep nesting
Use %s for deferred interpolation in debug log statements; use f-strings for readability in other log levels (e.g., logger.debug("Processing %s items", count), logger.info(f"Processing {count} items"))
Sanitize error paths using os.path.basename() in error messages to avoid leaking directory structure
Avoid TOCTOU (time-of-check-time-of-use) patterns; do not use check-then-act patterns for file access and credit charging operations
Use Redis pipelines with transaction=True for atomicity on multi-step Redis operations
Use max(0, value) guards for computed values that should never be negative
Keep files under ~300 lines; if a file grows beyond this, split by responsibility (extract h...

Files:

  • autogpt_platform/backend/backend/copilot/tools/manage_folders.py
  • autogpt_platform/backend/backend/copilot/tools/agent_output.py
  • autogpt_platform/backend/backend/copilot/tools/find_library_agent.py
  • autogpt_platform/backend/backend/copilot/tools/feature_requests.py
  • autogpt_platform/backend/backend/copilot/tools/search_docs.py
  • autogpt_platform/backend/backend/copilot/tools/get_doc_page.py
  • autogpt_platform/backend/backend/copilot/tools/find_block.py
  • autogpt_platform/backend/backend/copilot/tools/get_mcp_guide.py
  • autogpt_platform/backend/backend/copilot/tools/agent_browser.py
  • autogpt_platform/backend/backend/copilot/tools/get_agent_building_guide.py
  • autogpt_platform/backend/backend/copilot/tools/find_agent.py
  • autogpt_platform/backend/backend/copilot/tools/workspace_files.py
  • autogpt_platform/backend/backend/copilot/tools/base.py
  • autogpt_platform/backend/backend/copilot/sdk/tool_adapter_test.py
autogpt_platform/**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

Format Python code with poetry run format

Files:

  • autogpt_platform/backend/backend/copilot/tools/manage_folders.py
  • autogpt_platform/backend/backend/copilot/tools/agent_output.py
  • autogpt_platform/backend/backend/copilot/tools/find_library_agent.py
  • autogpt_platform/backend/backend/copilot/tools/feature_requests.py
  • autogpt_platform/backend/backend/copilot/tools/search_docs.py
  • autogpt_platform/backend/backend/copilot/tools/get_doc_page.py
  • autogpt_platform/backend/backend/copilot/tools/find_block.py
  • autogpt_platform/backend/backend/copilot/tools/get_mcp_guide.py
  • autogpt_platform/backend/backend/copilot/tools/agent_browser.py
  • autogpt_platform/backend/backend/copilot/tools/get_agent_building_guide.py
  • autogpt_platform/backend/backend/copilot/tools/find_agent.py
  • autogpt_platform/backend/backend/copilot/tools/workspace_files.py
  • autogpt_platform/backend/backend/copilot/tools/base.py
  • autogpt_platform/backend/backend/copilot/sdk/tool_adapter_test.py
autogpt_platform/backend/**/*test*.py

📄 CodeRabbit inference engine (AGENTS.md)

Run poetry run test for backend testing (runs pytest with docker based postgres + prisma)

Files:

  • autogpt_platform/backend/backend/copilot/sdk/tool_adapter_test.py
autogpt_platform/backend/**/*_test.py

📄 CodeRabbit inference engine (autogpt_platform/backend/CLAUDE.md)

autogpt_platform/backend/**/*_test.py: Use pytest with snapshot testing for API responses; test files should be colocated with source files using the *_test.py naming pattern
Mock at boundaries by mocking where the symbol is used, not where it is defined. After refactoring, update mock targets to match new module paths
Use AsyncMock from unittest.mock for mocking async functions

Files:

  • autogpt_platform/backend/backend/copilot/sdk/tool_adapter_test.py
🧠 Learnings (15)
📓 Common learnings
Learnt from: majdyz
Repo: Significant-Gravitas/AutoGPT PR: 12356
File: autogpt_platform/backend/backend/copilot/constants.py:9-12
Timestamp: 2026-03-10T08:39:22.025Z
Learning: In Significant-Gravitas/AutoGPT PR `#12356`, the `COPILOT_SYNTHETIC_ID_PREFIX = "copilot-"` check in `create_auto_approval_record` (human_review.py) is intentional and safe. The `graph_exec_id` passed to this function comes from server-side `PendingHumanReview` DB records (not from user input); the API only accepts `node_exec_id` from users. Synthetic `copilot-*` IDs are only ever created server-side in `run_block.py`. The prefix skip avoids a DB lookup for an `AgentGraphExecution` record that legitimately does not exist for CoPilot sessions, while `user_id` scoping is enforced at the auth layer and on the resulting auto-approval record.
Learnt from: majdyz
Repo: Significant-Gravitas/AutoGPT PR: 12440
File: autogpt_platform/backend/backend/copilot/workflow_import/converter.py:0-0
Timestamp: 2026-03-17T10:57:12.953Z
Learning: In Significant-Gravitas/AutoGPT PR `#12440`, `autogpt_platform/backend/backend/copilot/workflow_import/converter.py` was fully rewritten (commit 732960e2d) to no longer make direct LLM/OpenAI API calls. The converter now builds a structured text prompt for AutoPilot/CoPilot instead. There is no `response.choices` access or any direct LLM client usage in this file. Do not flag `response.choices` access or LLM client initialization patterns as issues in this file.
Learnt from: majdyz
Repo: Significant-Gravitas/AutoGPT PR: 12439
File: autogpt_platform/backend/backend/blocks/autogpt_copilot.py:0-0
Timestamp: 2026-03-16T17:00:02.827Z
Learning: In autogpt_platform/backend/backend/blocks/autogpt_copilot.py, the recursion guard uses two module-level ContextVars: `_copilot_recursion_depth` (tracks current nesting depth) and `_copilot_recursion_limit` (stores the chain-wide ceiling). On the first invocation, `_copilot_recursion_limit` is set to `max_recursion_depth`; nested calls use `min(inherited_limit, max_recursion_depth)`, so they can only lower the cap, never raise it. The entry/exit logic is extracted into module-level helper functions. This is the approved pattern for preventing runaway sub-agent recursion in AutogptCopilotBlock (PR `#12439`, commits 348e9f8e2 and 3b70f61b1).
Learnt from: majdyz
Repo: Significant-Gravitas/AutoGPT PR: 12213
File: autogpt_platform/frontend/src/app/api/openapi.json:9983-9995
Timestamp: 2026-02-27T15:59:00.370Z
Learning: Repo: Significant-Gravitas/AutoGPT PR: 12213 — OpenAPI/codegen
Learning: Ensuring a field is required in generated TS types needs two sides: (1) no default value on the Pydantic field, and (2) the OpenAPI model's "required" array must list it. For MCPToolInfo, making input_schema required in OpenAPI and removing Field(default_factory=dict) in the backend prevents optional typing drift.
Learnt from: majdyz
Repo: Significant-Gravitas/AutoGPT PR: 12356
File: autogpt_platform/backend/backend/copilot/tools/run_block.py:297-300
Timestamp: 2026-03-10T08:38:33.249Z
Learning: In autogpt_platform/backend/backend/copilot/tools/run_block.py, the auto-approval key for sensitive block HITL review uses graph_exec_id (copilot-session-{session_id}) + node_id (copilot-node-{block_id}). This is intentional: approving a block type within a CoPilot session auto-approves all future invocations of that same block type within the same session, mirroring how auto-approve works in normal graph execution. The user explicitly opts into this session-scoped behavior via an auto-approve toggle. Without the toggle (default), each individual invocation requires its own approval.
Learnt from: majdyz
Repo: Significant-Gravitas/AutoGPT PR: 12213
File: autogpt_platform/frontend/src/app/api/openapi.json:9983-9995
Timestamp: 2026-02-27T15:59:00.370Z
Learning: Repo: Significant-Gravitas/AutoGPT PR: 12213 — Backend/frontend OpenAPI codegen
Learning: For MCP schema models, required OpenAPI fields must have no defaults in Pydantic. Specifically, MCPToolInfo.input_schema must be required (no Field(default_factory=dict)) so openapi.json emits it in "required", ensuring generated TS types treat input_schema as non-optional.
Learnt from: Pwuts
Repo: Significant-Gravitas/AutoGPT PR: 12284
File: autogpt_platform/frontend/src/app/api/openapi.json:11897-11900
Timestamp: 2026-03-04T23:58:18.476Z
Learning: Repo: Significant-Gravitas/AutoGPT — PR `#12284`
Backend/frontend OpenAPI codegen convention: In backend/api/features/store/model.py, the StoreSubmission and StoreSubmissionAdminView models define submitted_at: datetime | None, changes_summary: str | None, and instructions: str | None with no default. This is intentional to produce “required but nullable” fields in OpenAPI (properties appear in required[] and use anyOf [type, null]). This matches Prisma’s submittedAt DateTime? and changesSummary String?. Do not flag this as a required/nullable mismatch.
Learnt from: majdyz
Repo: Significant-Gravitas/AutoGPT PR: 12356
File: autogpt_platform/backend/backend/copilot/tools/run_block.py:349-370
Timestamp: 2026-03-10T08:38:36.655Z
Learning: In the AutoGPT CoPilot HITL (Human-In-The-Loop) flow (`autogpt_platform/backend/backend/copilot/tools/run_block.py`), the review card presented to users sets `editable: false`, meaning reviewers cannot modify the input payload. Therefore, credentials resolved before `is_block_exec_need_review()` remain valid and do not need to be recomputed after the review step — the original `input_data` is unchanged through the review lifecycle.
Learnt from: majdyz
Repo: Significant-Gravitas/AutoGPT PR: 12426
File: autogpt_platform/backend/backend/copilot/sdk/service.py:0-0
Timestamp: 2026-03-15T16:52:15.463Z
Learning: In Significant-Gravitas/AutoGPT (copilot backend), GitHub tokens (GH_TOKEN / GITHUB_TOKEN) for the `gh` CLI are injected lazily per-command in `autogpt_platform/backend/backend/copilot/tools/bash_exec._execute_on_e2b()` by calling `integration_creds.get_integration_env_vars(user_id)`, not on the global SDK subprocess environment in `sdk/service.py`. This scopes credentials to individual E2B sandbox command invocations and prevents token leakage into tool output streams or uploaded transcripts.
Learnt from: majdyz
Repo: Significant-Gravitas/AutoGPT PR: 12445
File: autogpt_platform/backend/backend/copilot/sdk/service.py:1071-1072
Timestamp: 2026-03-17T06:48:26.471Z
Learning: In Significant-Gravitas/AutoGPT (autogpt_platform), the AI SDK enforces `z.strictObject({type, errorText})` on SSE `StreamError` responses, so additional fields like `retryable: bool` cannot be added to `StreamError` or serialized via `to_sse()`. Instead, retry signaling for transient Anthropic API errors is done via the `COPILOT_RETRYABLE_ERROR_PREFIX` constant prepended to persisted session messages (in `ChatMessage.content`). The frontend detects retryable errors by checking `markerType === "retryable_error"` from `parseSpecialMarkers()` — no SSE schema changes and no string matching on error text. This pattern was established in PR `#12445`, commit 64d82797b.
📚 Learning: 2026-02-26T17:02:22.448Z
Learnt from: Pwuts
Repo: Significant-Gravitas/AutoGPT PR: 12211
File: .pre-commit-config.yaml:160-179
Timestamp: 2026-02-26T17:02:22.448Z
Learning: Keep the pre-commit hook pattern broad for autogpt_platform/backend to ensure OpenAPI schema changes are captured. Do not narrow to backend/api/ alone, since the generated schema depends on Pydantic models across multiple directories (backend/data/, backend/blocks/, backend/copilot/, backend/integrations/, backend/util/). Narrowing could miss schema changes and cause frontend type desynchronization.

Applied to files:

  • autogpt_platform/backend/backend/copilot/tools/manage_folders.py
  • autogpt_platform/backend/backend/copilot/tools/agent_output.py
  • autogpt_platform/backend/backend/copilot/tools/find_library_agent.py
  • autogpt_platform/backend/backend/copilot/tools/feature_requests.py
  • autogpt_platform/backend/backend/copilot/tools/search_docs.py
  • autogpt_platform/backend/backend/copilot/tools/get_doc_page.py
  • autogpt_platform/backend/backend/copilot/tools/find_block.py
  • autogpt_platform/backend/backend/copilot/tools/get_mcp_guide.py
  • autogpt_platform/backend/backend/copilot/tools/agent_browser.py
  • autogpt_platform/backend/backend/copilot/tools/get_agent_building_guide.py
  • autogpt_platform/backend/backend/copilot/tools/find_agent.py
  • autogpt_platform/backend/backend/copilot/tools/workspace_files.py
  • autogpt_platform/backend/backend/copilot/tools/base.py
  • autogpt_platform/backend/backend/copilot/sdk/tool_adapter_test.py
📚 Learning: 2026-03-04T08:04:35.881Z
Learnt from: majdyz
Repo: Significant-Gravitas/AutoGPT PR: 12273
File: autogpt_platform/backend/backend/copilot/tools/workspace_files.py:216-220
Timestamp: 2026-03-04T08:04:35.881Z
Learning: In the AutoGPT Copilot backend, ensure that SVG images are not treated as vision image types by excluding 'image/svg+xml' from INLINEABLE_MIME_TYPES and MULTIMODAL_TYPES in tool_adapter.py; the Claude API supports PNG, JPEG, GIF, and WebP for vision. SVGs (XML text) should be handled via the text path instead, not the vision path.

Applied to files:

  • autogpt_platform/backend/backend/copilot/tools/manage_folders.py
  • autogpt_platform/backend/backend/copilot/tools/agent_output.py
  • autogpt_platform/backend/backend/copilot/tools/find_library_agent.py
  • autogpt_platform/backend/backend/copilot/tools/feature_requests.py
  • autogpt_platform/backend/backend/copilot/tools/search_docs.py
  • autogpt_platform/backend/backend/copilot/tools/get_doc_page.py
  • autogpt_platform/backend/backend/copilot/tools/find_block.py
  • autogpt_platform/backend/backend/copilot/tools/get_mcp_guide.py
  • autogpt_platform/backend/backend/copilot/tools/agent_browser.py
  • autogpt_platform/backend/backend/copilot/tools/get_agent_building_guide.py
  • autogpt_platform/backend/backend/copilot/tools/find_agent.py
  • autogpt_platform/backend/backend/copilot/tools/workspace_files.py
  • autogpt_platform/backend/backend/copilot/tools/base.py
  • autogpt_platform/backend/backend/copilot/sdk/tool_adapter_test.py
📚 Learning: 2026-03-04T12:19:39.243Z
Learnt from: majdyz
Repo: Significant-Gravitas/AutoGPT PR: 12279
File: autogpt_platform/backend/backend/copilot/tools/base.py:184-188
Timestamp: 2026-03-04T12:19:39.243Z
Learning: In autogpt_platform/backend/backend/copilot/tools/, ensure that anonymous users always pass user_id=None to tool execution methods. The anon_ prefix (e.g., anon_123) is used only for PostHog/analytics distinct_id and must not be used as an actual user_id. Use a simple truthiness check on user_id (e.g., if user_id: ... else: ... or a dedicated is_authenticated flag) to distinguish anonymous from authenticated users, and review all tool execution call sites within this directory to prevent accidentally forwarding an anon_ user_id to tools.

Applied to files:

  • autogpt_platform/backend/backend/copilot/tools/manage_folders.py
  • autogpt_platform/backend/backend/copilot/tools/agent_output.py
  • autogpt_platform/backend/backend/copilot/tools/find_library_agent.py
  • autogpt_platform/backend/backend/copilot/tools/feature_requests.py
  • autogpt_platform/backend/backend/copilot/tools/search_docs.py
  • autogpt_platform/backend/backend/copilot/tools/get_doc_page.py
  • autogpt_platform/backend/backend/copilot/tools/find_block.py
  • autogpt_platform/backend/backend/copilot/tools/get_mcp_guide.py
  • autogpt_platform/backend/backend/copilot/tools/agent_browser.py
  • autogpt_platform/backend/backend/copilot/tools/get_agent_building_guide.py
  • autogpt_platform/backend/backend/copilot/tools/find_agent.py
  • autogpt_platform/backend/backend/copilot/tools/workspace_files.py
  • autogpt_platform/backend/backend/copilot/tools/base.py
📚 Learning: 2026-03-31T14:22:26.566Z
Learnt from: majdyz
Repo: Significant-Gravitas/AutoGPT PR: 12622
File: autogpt_platform/backend/backend/copilot/tools/agent_search.py:223-236
Timestamp: 2026-03-31T14:22:26.566Z
Learning: In files under autogpt_platform/backend/backend/copilot/tools/, ensure agent graph enrichment uses the typed Pydantic model `backend.data.graph.Graph` for `AgentInfo.graph` (i.e., `Graph | None`), not `dict[str, Any]`. When enriching with graph data (e.g., `_enrich_agents_with_graph`), prefer calling `graph_db().get_graph(graph_id, version=None, user_id=user_id)` directly to retrieve the typed `Graph` object rather than routing through JSON conversions like `get_agent_as_json()` / `graph_to_json()`.

Applied to files:

  • autogpt_platform/backend/backend/copilot/tools/manage_folders.py
  • autogpt_platform/backend/backend/copilot/tools/agent_output.py
  • autogpt_platform/backend/backend/copilot/tools/find_library_agent.py
  • autogpt_platform/backend/backend/copilot/tools/feature_requests.py
  • autogpt_platform/backend/backend/copilot/tools/search_docs.py
  • autogpt_platform/backend/backend/copilot/tools/get_doc_page.py
  • autogpt_platform/backend/backend/copilot/tools/find_block.py
  • autogpt_platform/backend/backend/copilot/tools/get_mcp_guide.py
  • autogpt_platform/backend/backend/copilot/tools/agent_browser.py
  • autogpt_platform/backend/backend/copilot/tools/get_agent_building_guide.py
  • autogpt_platform/backend/backend/copilot/tools/find_agent.py
  • autogpt_platform/backend/backend/copilot/tools/workspace_files.py
  • autogpt_platform/backend/backend/copilot/tools/base.py
📚 Learning: 2026-03-05T15:42:08.207Z
Learnt from: ntindle
Repo: Significant-Gravitas/AutoGPT PR: 12297
File: .claude/skills/backend-check/SKILL.md:14-16
Timestamp: 2026-03-05T15:42:08.207Z
Learning: In Python files under autogpt_platform/backend (recursively), rely on poetry run format to perform formatting (Black + isort) and linting (ruff). Do not run poetry run lint as a separate step after poetry run format, since format already includes linting checks.

Applied to files:

  • autogpt_platform/backend/backend/copilot/tools/manage_folders.py
  • autogpt_platform/backend/backend/copilot/tools/agent_output.py
  • autogpt_platform/backend/backend/copilot/tools/find_library_agent.py
  • autogpt_platform/backend/backend/copilot/tools/feature_requests.py
  • autogpt_platform/backend/backend/copilot/tools/search_docs.py
  • autogpt_platform/backend/backend/copilot/tools/get_doc_page.py
  • autogpt_platform/backend/backend/copilot/tools/find_block.py
  • autogpt_platform/backend/backend/copilot/tools/get_mcp_guide.py
  • autogpt_platform/backend/backend/copilot/tools/agent_browser.py
  • autogpt_platform/backend/backend/copilot/tools/get_agent_building_guide.py
  • autogpt_platform/backend/backend/copilot/tools/find_agent.py
  • autogpt_platform/backend/backend/copilot/tools/workspace_files.py
  • autogpt_platform/backend/backend/copilot/tools/base.py
  • autogpt_platform/backend/backend/copilot/sdk/tool_adapter_test.py
📚 Learning: 2026-03-16T16:35:40.236Z
Learnt from: majdyz
Repo: Significant-Gravitas/AutoGPT PR: 12440
File: autogpt_platform/backend/backend/api/features/workflow_import.py:54-63
Timestamp: 2026-03-16T16:35:40.236Z
Learning: Avoid using the word 'competitor' in public-facing identifiers and text. Use neutral naming for API paths, model names, function names, and UI text. Examples: rename 'CompetitorFormat' to 'SourcePlatform', 'convert_competitor_workflow' to 'convert_workflow', '/competitor-workflow' to '/workflow'. Apply this guideline to files under autogpt_platform/backend and autogpt_platform/frontend.

Applied to files:

  • autogpt_platform/backend/backend/copilot/tools/manage_folders.py
  • autogpt_platform/backend/backend/copilot/tools/agent_output.py
  • autogpt_platform/backend/backend/copilot/tools/find_library_agent.py
  • autogpt_platform/backend/backend/copilot/tools/feature_requests.py
  • autogpt_platform/backend/backend/copilot/tools/search_docs.py
  • autogpt_platform/backend/backend/copilot/tools/get_doc_page.py
  • autogpt_platform/backend/backend/copilot/tools/find_block.py
  • autogpt_platform/backend/backend/copilot/tools/get_mcp_guide.py
  • autogpt_platform/backend/backend/copilot/tools/agent_browser.py
  • autogpt_platform/backend/backend/copilot/tools/get_agent_building_guide.py
  • autogpt_platform/backend/backend/copilot/tools/find_agent.py
  • autogpt_platform/backend/backend/copilot/tools/workspace_files.py
  • autogpt_platform/backend/backend/copilot/tools/base.py
  • autogpt_platform/backend/backend/copilot/sdk/tool_adapter_test.py
📚 Learning: 2026-03-31T15:37:38.626Z
Learnt from: majdyz
Repo: Significant-Gravitas/AutoGPT PR: 12623
File: autogpt_platform/backend/backend/copilot/tools/agent_generator/fixer.py:37-47
Timestamp: 2026-03-31T15:37:38.626Z
Learning: When validating/constructing Anthropic API model IDs in Significant-Gravitas/AutoGPT, allow the hyphen-separated Claude Opus 4.6 model ID `claude-opus-4-6` (it corresponds to `LlmModel.CLAUDE_4_6_OPUS` in `autogpt_platform/backend/backend/blocks/llm.py`). Do NOT require the dot-separated form in Anthropic contexts. Only OpenRouter routing variants should use the dot separator (e.g., `anthropic/claude-opus-4.6`); `claude-opus-4-6` should be treated as correct when passed to Anthropic, and flagged only if it’s used in the OpenRouter path where the dot form is expected.

Applied to files:

  • autogpt_platform/backend/backend/copilot/tools/manage_folders.py
  • autogpt_platform/backend/backend/copilot/tools/agent_output.py
  • autogpt_platform/backend/backend/copilot/tools/find_library_agent.py
  • autogpt_platform/backend/backend/copilot/tools/feature_requests.py
  • autogpt_platform/backend/backend/copilot/tools/search_docs.py
  • autogpt_platform/backend/backend/copilot/tools/get_doc_page.py
  • autogpt_platform/backend/backend/copilot/tools/find_block.py
  • autogpt_platform/backend/backend/copilot/tools/get_mcp_guide.py
  • autogpt_platform/backend/backend/copilot/tools/agent_browser.py
  • autogpt_platform/backend/backend/copilot/tools/get_agent_building_guide.py
  • autogpt_platform/backend/backend/copilot/tools/find_agent.py
  • autogpt_platform/backend/backend/copilot/tools/workspace_files.py
  • autogpt_platform/backend/backend/copilot/tools/base.py
  • autogpt_platform/backend/backend/copilot/sdk/tool_adapter_test.py
📚 Learning: 2026-02-27T15:59:00.370Z
Learnt from: majdyz
Repo: Significant-Gravitas/AutoGPT PR: 12213
File: autogpt_platform/frontend/src/app/api/openapi.json:9983-9995
Timestamp: 2026-02-27T15:59:00.370Z
Learning: Repo: Significant-Gravitas/AutoGPT PR: 12213 — OpenAPI/codegen
Learning: Ensuring a field is required in generated TS types needs two sides: (1) no default value on the Pydantic field, and (2) the OpenAPI model's "required" array must list it. For MCPToolInfo, making input_schema required in OpenAPI and removing Field(default_factory=dict) in the backend prevents optional typing drift.

Applied to files:

  • autogpt_platform/backend/backend/copilot/tools/get_mcp_guide.py
  • autogpt_platform/backend/backend/copilot/sdk/tool_adapter_test.py
📚 Learning: 2026-02-27T15:59:00.370Z
Learnt from: majdyz
Repo: Significant-Gravitas/AutoGPT PR: 12213
File: autogpt_platform/frontend/src/app/api/openapi.json:9983-9995
Timestamp: 2026-02-27T15:59:00.370Z
Learning: Repo: Significant-Gravitas/AutoGPT PR: 12213 — Backend/frontend OpenAPI codegen
Learning: For MCP schema models, required OpenAPI fields must have no defaults in Pydantic. Specifically, MCPToolInfo.input_schema must be required (no Field(default_factory=dict)) so openapi.json emits it in "required", ensuring generated TS types treat input_schema as non-optional.

Applied to files:

  • autogpt_platform/backend/backend/copilot/tools/get_mcp_guide.py
📚 Learning: 2026-03-17T10:57:12.953Z
Learnt from: majdyz
Repo: Significant-Gravitas/AutoGPT PR: 12440
File: autogpt_platform/backend/backend/copilot/workflow_import/converter.py:0-0
Timestamp: 2026-03-17T10:57:12.953Z
Learning: In Significant-Gravitas/AutoGPT PR `#12440`, `autogpt_platform/backend/backend/copilot/workflow_import/converter.py` was fully rewritten (commit 732960e2d) to no longer make direct LLM/OpenAI API calls. The converter now builds a structured text prompt for AutoPilot/CoPilot instead. There is no `response.choices` access or any direct LLM client usage in this file. Do not flag `response.choices` access or LLM client initialization patterns as issues in this file.

Applied to files:

  • autogpt_platform/backend/backend/copilot/sdk/tool_adapter_test.py
📚 Learning: 2026-03-25T06:59:27.340Z
Learnt from: CR
Repo: Significant-Gravitas/AutoGPT PR: 0
File: autogpt_platform/backend/CLAUDE.md:0-0
Timestamp: 2026-03-25T06:59:27.340Z
Learning: Applies to autogpt_platform/backend/**/*_test.py : Mock at boundaries by mocking where the symbol is used, not where it is defined. After refactoring, update mock targets to match new module paths

Applied to files:

  • autogpt_platform/backend/backend/copilot/sdk/tool_adapter_test.py
📚 Learning: 2026-02-04T16:49:42.490Z
Learnt from: CR
Repo: Significant-Gravitas/AutoGPT PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2026-02-04T16:49:42.490Z
Learning: Applies to autogpt_platform/backend/**/test/**/*.py : Use snapshot testing with '--snapshot-update' flag in backend tests when output changes; always review with 'git diff'

Applied to files:

  • autogpt_platform/backend/backend/copilot/sdk/tool_adapter_test.py
📚 Learning: 2026-03-25T06:59:27.340Z
Learnt from: CR
Repo: Significant-Gravitas/AutoGPT PR: 0
File: autogpt_platform/backend/CLAUDE.md:0-0
Timestamp: 2026-03-25T06:59:27.340Z
Learning: Applies to autogpt_platform/backend/**/*_test.py : Use `AsyncMock` from `unittest.mock` for mocking async functions

Applied to files:

  • autogpt_platform/backend/backend/copilot/sdk/tool_adapter_test.py
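For readers unfamiliar with the `AsyncMock` guideline above, here is a minimal sketch of what it looks like in practice. The coroutine `fetch_title`, the `client.get` method, and the URL are hypothetical and do not come from this PR; only `AsyncMock`, `MagicMock`, and `assert_awaited_once_with` are real `unittest.mock` APIs.

```python
import asyncio
from unittest.mock import AsyncMock, MagicMock


async def fetch_title(client, url: str) -> str:
    """Hypothetical coroutine under test; `client.get` is assumed to be async."""
    response = await client.get(url)
    return response["title"].strip()


def run_example() -> str:
    # Mock only the async boundary: awaiting client.get(...) yields return_value.
    client = MagicMock()
    client.get = AsyncMock(return_value={"title": "  Docs  "})

    result = asyncio.run(fetch_title(client, "https://example.com"))

    # AsyncMock records awaits, so the call can be verified afterwards.
    client.get.assert_awaited_once_with("https://example.com")
    return result
```

A plain `MagicMock` would return a non-awaitable object from `client.get(...)`, so `await` would raise a `TypeError`; that is the failure the guideline is meant to avoid.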
📚 Learning: 2026-03-19T15:10:53.815Z
Learnt from: majdyz
Repo: Significant-Gravitas/AutoGPT PR: 12483
File: autogpt_platform/backend/backend/copilot/tools/test_dry_run.py:298-303
Timestamp: 2026-03-19T15:10:53.815Z
Learning: In Python unittest.mock, the correct patch target depends on whether an import is eager (module-level) or lazy (inside a function/branch):
- **Module-level import** (`from foo.bar import baz` at top of file): patch where the name is used, e.g. `patch("mymodule.baz")`.
- **Lazy import** (`from foo.bar import baz` inside a function/branch, executed at call time): patch the source module, e.g. `patch("foo.bar.baz")`, because the fresh `from ... import` at call time will look up the (now-patched) name in the source module's dict.
This pattern appears in `autogpt_platform/backend/backend/copilot/tools/helpers.py` where `simulate_block` is lazily imported inside the `if dry_run:` block, making `patch("backend.executor.simulator.simulate_block")` the correct target in tests.

Applied to files:

  • autogpt_platform/backend/backend/copilot/sdk/tool_adapter_test.py
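The eager-versus-lazy patch-target rule in the learning above can be sketched with two throwaway in-memory modules. The module names `demo_source` and `demo_consumer` are hypothetical, and the eager import is imitated by copying the name onto the consumer module, which is what a module-level `from demo_source import simulate` would do.

```python
import sys
import types
from unittest.mock import patch

# Source module that defines the real function.
source = types.ModuleType("demo_source")
source.simulate = lambda: "real"
sys.modules["demo_source"] = source

# Consumer module: binding the name here imitates an eager, module-level
# `from demo_source import simulate`.
consumer = types.ModuleType("demo_consumer")
consumer.simulate = source.simulate
sys.modules["demo_consumer"] = consumer


def run_eager() -> str:
    # Looks the name up on the consumer module, as module-level code would.
    return consumer.simulate()


def run_lazy() -> str:
    # Lazy import: resolved against the source module at call time.
    from demo_source import simulate

    return simulate()


with patch("demo_source.simulate", return_value="patched"):
    lazy_under_source_patch = run_lazy()    # sees the patch
    eager_under_source_patch = run_eager()  # still bound to the original

with patch("demo_consumer.simulate", return_value="patched"):
    eager_under_consumer_patch = run_eager()  # sees the patch
```

Patching the source module only affects lookups that happen after the patch is applied, which is why the lazy import is caught and the already-bound eager name is not.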
🔇 Additional comments (13)
autogpt_platform/backend/backend/copilot/tools/base.py (1)

122-130: LGTM!

Clean implementation of the read_only property with a sensible default (False) and clear documentation explaining the MCP readOnlyHint annotation and parallel dispatch behavior. The opt-in pattern ensures new tools are safe by default.
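
The opt-in pattern praised here can be sketched as follows. This is an illustration only: the class bodies are hypothetical stand-ins and do not reproduce the actual code in `base.py` or the tool modules.

```python
class BaseTool:
    """Hypothetical stand-in for the tool base class."""

    @property
    def read_only(self) -> bool:
        # Safe default: a tool is assumed to have side effects unless
        # its subclass explicitly opts in.
        return False


class SearchDocsTool(BaseTool):
    """Query-only tool: opts in to the read-only classification."""

    @property
    def read_only(self) -> bool:
        return True


class CreateFolderTool(BaseTool):
    """Mutating tool: inherits the default and stays non-read-only."""
```

With this shape, forgetting to classify a new tool leaves it on the conservative side (`read_only` is `False`).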

autogpt_platform/backend/backend/copilot/tools/manage_folders.py (1)

198-200: LGTM!

ListFoldersTool correctly marked as read-only since it only performs list/read operations. The mutating tools (CreateFolderTool, UpdateFolderTool, MoveFolderTool, DeleteFolderTool, MoveAgentsToFolderTool) appropriately inherit the default False.

autogpt_platform/backend/backend/copilot/tools/search_docs.py (1)

56-58: LGTM!

Correctly marked as read-only — the tool only performs documentation searches with no side effects.

autogpt_platform/backend/backend/copilot/tools/get_doc_page.py (1)

44-46: LGTM!

Correctly marked as read-only — the tool only reads documentation file content with no side effects.

autogpt_platform/backend/backend/copilot/tools/feature_requests.py (1)

152-154: LGTM!

SearchFeatureRequestsTool correctly marked as read-only since it only queries Linear for existing issues. The CreateFeatureRequestTool appropriately inherits the default False as it performs mutations.

autogpt_platform/backend/backend/copilot/tools/agent_output.py (1)

163-165: LGTM!

Correctly marked as read-only — the tool only retrieves and views execution outputs without mutations. The wait_for_execution call is a polling/read operation, not a side effect.

autogpt_platform/backend/backend/copilot/tools/get_agent_building_guide.py (1)

55-57: LGTM!

Correctly marked as read-only — the tool only reads and returns the agent building guide content. The internal module-level caching is an implementation detail, not an external side effect.

autogpt_platform/backend/backend/copilot/tools/find_block.py (1)

81-83: LGTM!

Correctly marked as read-only — the tool only searches for and retrieves block metadata with no side effects.

autogpt_platform/backend/backend/copilot/tools/find_agent.py (1)

36-38: Looks good: FindAgentTool read-only classification is appropriate.

This aligns with the non-mutating behavior of the tool and fits the new parallel dispatch model.

autogpt_platform/backend/backend/copilot/tools/get_mcp_guide.py (1)

51-53: GetMCPGuideTool read-only flag is correct.

Good fit for SDK readOnlyHint-based dispatch.

autogpt_platform/backend/backend/copilot/tools/workspace_files.py (1)

445-447: ListWorkspaceFilesTool read-only flag looks correct.

This tool is query-only, so read_only=True is a good fit.

autogpt_platform/backend/backend/copilot/tools/find_library_agent.py (1)

39-41: Read-only classification for FindLibraryAgentTool looks good.

Consistent with this tool’s query-only behavior.

autogpt_platform/backend/backend/copilot/sdk/tool_adapter_test.py (1)

279-333: Nice coverage update for direct-execution semantics.

These tests clearly validate the post–pre-launch behavior and duplicate-execution guard.

Comment thread autogpt_platform/backend/backend/copilot/tools/agent_browser.py Outdated
Comment thread autogpt_platform/backend/backend/copilot/tools/workspace_files.py Outdated
@majdyz
Contributor Author

majdyz commented Mar 31, 2026

E2E Test Report

Date: 2026-03-31 | Branch: fix/copilot-duplicate-block-execution | SDK: claude_agent_sdk 0.1.45

| # | Scenario | Result |
|---|----------|--------|
| 1 | CoPilot basic chat (no regression) | PASS |
| 2 | CoPilot tool calls work (find_block x2) | PASS |
| 3 | No duplicate tool execution in logs | PASS |
| 4 | readOnlyHint parallel dispatch working | PASS |

Key Findings

Pre-launch mechanism fully removed: 0 log matches for pre_launch, arg mismatch, cancel_pending, or Preparing block.

readOnlyHint parallel dispatch confirmed: Two find_block calls dispatched 4ms apart (parallel), both completed at ~6s. Without readOnlyHint, these would run sequentially (~12s).

No duplicate execution: Each PostToolUse hook fired exactly once per tool call (2 calls, 2 hooks).

No regressions: CoPilot chat, tool calls, session persistence, and E2B sandbox all function correctly.

Full report: test-results/PR-12632-fix-backend-copilot-prevent-duplicate-block-execution/test-report.md

…dOnlyHint

Annotate all MCP tools (including side-effect tools like run_block,
bash_exec) with readOnlyHint=True so the SDK CLI dispatches all
concurrent tool calls in parallel.

E2E verified: 3 bash_exec(sleep 3) calls completed in ~3.3s total
(vs ~9s sequential), 3 find_block calls dispatched within 5ms.
@majdyz
Contributor Author

majdyz commented Mar 31, 2026

E2E Test Report — Parallel Dispatch (all tools readOnlyHint=True)

Test: 3x find_block parallel dispatch

Tool inputs (dispatched within 5ms):
  19:23:16.384  find_block("send email")
  19:23:16.386  find_block("slack message")     +2ms
  19:23:16.389  find_block("http request")      +5ms

Tool outputs (completed near-simultaneously):
  19:23:22.348  output: 3,578 chars
  19:23:22.350  output: 3,753 chars              +2ms
  19:23:22.351  output: 3,855 chars              +3ms

Result: 3 tools in ~6s (vs ~18s sequential) — 3x speedup
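
The arithmetic behind this result can be checked directly from the logged timestamps. The sketch below is an illustration: `elapsed_seconds` is a hypothetical helper, and the sequential estimate assumes each call would take as long as the first one did.

```python
from datetime import datetime


def elapsed_seconds(start: str, end: str) -> float:
    """Seconds between two HH:MM:SS.mmm timestamps on the same day."""
    fmt = "%H:%M:%S.%f"
    delta = datetime.strptime(end, fmt) - datetime.strptime(start, fmt)
    return delta.total_seconds()


# First dispatch to first completion: duration of a single call.
per_call = elapsed_seconds("19:23:16.384", "19:23:22.348")

# First dispatch to last completion: wall time for all three calls.
wall = elapsed_seconds("19:23:16.384", "19:23:22.351")

# Assumption: run back to back, three calls would take about 3 * per_call.
sequential_estimate = 3 * per_call
speedup = sequential_estimate / wall
```

The computed speedup comes out at roughly 3x, in line with the figure reported above.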

Test: 3x bash_exec(sleep 3) parallel dispatch

Tool inputs (dispatched within 5ms):
  19:25:09.628  bash_exec
  19:25:09.631  bash_exec                        +3ms
  19:25:09.633  bash_exec                        +5ms

Tool completions:
  19:25:12.942  PostToolUse                      ~3.3s total
  19:25:12.999  PostToolUse                      +57ms
  19:25:13.002  PostToolUse                      +60ms

Result: 3x sleep 3 completed in 3.3s (vs 9s sequential) — 2.7x speedup, proving bash_exec (side-effect tool) also runs in parallel with readOnlyHint=True.
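
The effect measured above (three waits overlapping instead of adding up) can be reproduced in miniature with `asyncio`. This sketch only illustrates concurrent versus sequential waiting inside one process. It does not model how the SDK CLI schedules MCP tool calls, and `fake_tool` is a hypothetical stand-in for a tool handler.

```python
import asyncio
import time


async def fake_tool(delay: float) -> float:
    """Hypothetical tool that just waits, like `sleep 3` in the test above."""
    await asyncio.sleep(delay)
    return delay


async def run_parallel(delays):
    start = time.perf_counter()
    # All coroutines are scheduled together, so their waits overlap.
    await asyncio.gather(*(fake_tool(d) for d in delays))
    return time.perf_counter() - start


async def run_sequential(delays):
    start = time.perf_counter()
    # Each call finishes before the next one starts.
    for d in delays:
        await fake_tool(d)
    return time.perf_counter() - start


delays = [0.2, 0.2, 0.2]
parallel = asyncio.run(run_parallel(delays))
sequential = asyncio.run(run_sequential(delays))
```

The parallel run takes about as long as the slowest single call, while the sequential run takes about the sum of all three.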

Negative checks

  • pre_launch|arg mismatch|cancel_pending: 0 matches in all logs
  • Preparing block duplicates: 0
  • No duplicate tool execution of any kind

Comment thread autogpt_platform/backend/backend/copilot/sdk/tool_adapter.py
Contributor

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 3

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@autogpt_platform/backend/backend/copilot/sdk/tool_adapter.py`:
- Around line 537-542: The code currently marks every E2B file tool as read-only
by always passing _READONLY_ANNOTATION when creating the decorated tool; change
this to consult the _READ_ONLY_E2B_TOOLS set so that writeable tools (e.g.,
write_file, edit_file) are not annotated as read-only. Locate the tool creation
call that uses tool(name, desc, schema,
annotations=_READONLY_ANNOTATION)(_truncating(handler, name)) and modify it to
conditionally include _READONLY_ANNOTATION only when name is present in
_READ_ONLY_E2B_TOOLS (otherwise omit the annotations argument or pass None),
keeping the use of _truncating(handler, name) the same. Ensure the change
references the same symbols: tool, _truncating, handler, name,
_READ_ONLY_E2B_TOOLS, and _READONLY_ANNOTATION so write/edit tools can run
non-parallelizable operations safely.
- Around line 83-86: The _READ_ONLY_E2B_TOOLS frozenset is declared but never
used; update the E2B tool registration code that currently applies
_READONLY_ANNOTATION to all E2B tools unconditionally so it instead checks
membership in _READ_ONLY_E2B_TOOLS and only adds _READONLY_ANNOTATION for tools
whose names are in that set (or, if the intent is to mark all E2B tools
read-only, remove the unused _READ_ONLY_E2B_TOOLS constant). Specifically,
modify the E2B registration block that applies _READONLY_ANNOTATION to consult
_READ_ONLY_E2B_TOOLS when deciding to annotate each tool (referencing
_READ_ONLY_E2B_TOOLS and _READONLY_ANNOTATION to locate the change).
- Around line 450-453: The code currently applies _READONLY_ANNOTATION to every
entry from TOOL_REGISTRY; update the logic that builds the tool annotations (the
loop over TOOL_REGISTRY that adds _READONLY_ANNOTATION at/near where base_tool
is referenced) to only add that annotation when the tool's class property
BaseTool.read_only (accessed via base_tool.read_only) is truthy; in other words,
check base_tool.read_only before applying _READONLY_ANNOTATION so only tools
whose subclass overrides read_only=True (e.g., feature_requests, find_agent,
web_fetch, etc.) get readOnlyHint=True.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: d4584f49-d0d8-458c-8952-1ab04ecaf796

📥 Commits

Reviewing files that changed from the base of the PR and between 23cdf29 and 8c45502.

📒 Files selected for processing (1)
  • autogpt_platform/backend/backend/copilot/sdk/tool_adapter.py
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (13)
  • GitHub Check: test (3.11)
  • GitHub Check: end-to-end tests
  • GitHub Check: get-changed-parts
  • GitHub Check: setup
  • GitHub Check: type-check (3.11)
  • GitHub Check: test (3.13)
  • GitHub Check: type-check (3.12)
  • GitHub Check: type-check (3.13)
  • GitHub Check: lint
  • GitHub Check: Analyze (python)
  • GitHub Check: Analyze (typescript)
  • GitHub Check: check-overlaps
  • GitHub Check: Seer Code Review
🧰 Additional context used
📓 Path-based instructions (2)
autogpt_platform/backend/**/*.py

📄 CodeRabbit inference engine (.github/copilot-instructions.md)

autogpt_platform/backend/**/*.py: Use Python 3.11 (required; managed by Poetry via pyproject.toml) for backend development
Always run 'poetry run format' (Black + isort) before linting in backend development
Always run 'poetry run lint' (ruff) after formatting in backend development

Refer to @backend/CLAUDE.md for backend-specific commands, architecture, and development tasks

autogpt_platform/backend/**/*.py: Import only at the top level; no local/inner imports except for lazy imports of heavy optional dependencies like openpyxl
Use absolute imports with from backend.module import ... for cross-package imports; single-dot relative imports (from .sibling import ...) are acceptable for sibling modules within the same package; avoid double-dot relative imports (from ..parent import ...)
Do not use duck typing with hasattr(), getattr(), or isinstance() for type dispatch; use typed interfaces, unions, or protocols instead
Use Pydantic models for structured data instead of dataclasses, namedtuples, or dicts
Do not use linter suppressors; no # type: ignore, # noqa, or # pyright: ignore comments — fix the underlying type/code issue instead
Use list comprehensions instead of manual loop-and-append patterns
Use early return guard clauses to avoid deep nesting
Use %s for deferred interpolation in debug log statements; use f-strings for readability in other log levels (e.g., logger.debug("Processing %s items", count), logger.info(f"Processing {count} items"))
Sanitize error paths using os.path.basename() in error messages to avoid leaking directory structure
Avoid TOCTOU (time-of-check-time-of-use) patterns; do not use check-then-act patterns for file access and credit charging operations
Use Redis pipelines with transaction=True for atomicity on multi-step Redis operations
Use max(0, value) guards for computed values that should never be negative
Keep files under ~300 lines; if a file grows beyond this, split by responsibility (extract h...

Files:

  • autogpt_platform/backend/backend/copilot/sdk/tool_adapter.py
autogpt_platform/**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

Format Python code with poetry run format

Files:

  • autogpt_platform/backend/backend/copilot/sdk/tool_adapter.py
🧠 Learnings (8)
📓 Common learnings
Learnt from: majdyz
Repo: Significant-Gravitas/AutoGPT PR: 12356
File: autogpt_platform/backend/backend/copilot/constants.py:9-12
Timestamp: 2026-03-10T08:39:22.025Z
Learning: In Significant-Gravitas/AutoGPT PR `#12356`, the `COPILOT_SYNTHETIC_ID_PREFIX = "copilot-"` check in `create_auto_approval_record` (human_review.py) is intentional and safe. The `graph_exec_id` passed to this function comes from server-side `PendingHumanReview` DB records (not from user input); the API only accepts `node_exec_id` from users. Synthetic `copilot-*` IDs are only ever created server-side in `run_block.py`. The prefix skip avoids a DB lookup for a `AgentGraphExecution` record that legitimately does not exist for CoPilot sessions, while `user_id` scoping is enforced at the auth layer and on the resulting auto-approval record.
Learnt from: majdyz
Repo: Significant-Gravitas/AutoGPT PR: 12440
File: autogpt_platform/backend/backend/copilot/workflow_import/converter.py:0-0
Timestamp: 2026-03-17T10:57:12.953Z
Learning: In Significant-Gravitas/AutoGPT PR `#12440`, `autogpt_platform/backend/backend/copilot/workflow_import/converter.py` was fully rewritten (commit 732960e2d) to no longer make direct LLM/OpenAI API calls. The converter now builds a structured text prompt for AutoPilot/CoPilot instead. There is no `response.choices` access or any direct LLM client usage in this file. Do not flag `response.choices` access or LLM client initialization patterns as issues in this file.
Learnt from: majdyz
Repo: Significant-Gravitas/AutoGPT PR: 12439
File: autogpt_platform/backend/backend/blocks/autogpt_copilot.py:0-0
Timestamp: 2026-03-16T17:00:02.827Z
Learning: In autogpt_platform/backend/backend/blocks/autogpt_copilot.py, the recursion guard uses two module-level ContextVars: `_copilot_recursion_depth` (tracks current nesting depth) and `_copilot_recursion_limit` (stores the chain-wide ceiling). On the first invocation, `_copilot_recursion_limit` is set to `max_recursion_depth`; nested calls use `min(inherited_limit, max_recursion_depth)`, so they can only lower the cap, never raise it. The entry/exit logic is extracted into module-level helper functions. This is the approved pattern for preventing runaway sub-agent recursion in AutogptCopilotBlock (PR `#12439`, commits 348e9f8e2 and 3b70f61b1).
Learnt from: majdyz
Repo: Significant-Gravitas/AutoGPT PR: 12356
File: autogpt_platform/backend/backend/copilot/tools/run_block.py:297-300
Timestamp: 2026-03-10T08:38:33.249Z
Learning: In autogpt_platform/backend/backend/copilot/tools/run_block.py, the auto-approval key for sensitive block HITL review uses graph_exec_id (copilot-session-{session_id}) + node_id (copilot-node-{block_id}). This is intentional: approving a block type within a CoPilot session auto-approves all future invocations of that same block type within the same session, mirroring how auto-approve works in normal graph execution. The user explicitly opts into this session-scoped behavior via an auto-approve toggle. Without the toggle (default), each individual invocation requires its own approval.
Learnt from: majdyz
Repo: Significant-Gravitas/AutoGPT PR: 12213
File: autogpt_platform/frontend/src/app/api/openapi.json:9983-9995
Timestamp: 2026-02-27T15:59:00.370Z
Learning: Repo: Significant-Gravitas/AutoGPT PR: 12213 — OpenAPI/codegen
Learning: Ensuring a field is required in generated TS types needs two sides: (1) no default value on the Pydantic field, and (2) the OpenAPI model's "required" array must list it. For MCPToolInfo, making input_schema required in OpenAPI and removing Field(default_factory=dict) in the backend prevents optional typing drift.
Learnt from: majdyz
Repo: Significant-Gravitas/AutoGPT PR: 12356
File: autogpt_platform/backend/backend/copilot/tools/run_block.py:349-370
Timestamp: 2026-03-10T08:38:36.655Z
Learning: In the AutoGPT CoPilot HITL (Human-In-The-Loop) flow (`autogpt_platform/backend/backend/copilot/tools/run_block.py`), the review card presented to users sets `editable: false`, meaning reviewers cannot modify the input payload. Therefore, credentials resolved before `is_block_exec_need_review()` remain valid and do not need to be recomputed after the review step — the original `input_data` is unchanged through the review lifecycle.
Learnt from: Pwuts
Repo: Significant-Gravitas/AutoGPT PR: 12284
File: autogpt_platform/frontend/src/app/api/openapi.json:11897-11900
Timestamp: 2026-03-04T23:58:18.476Z
Learning: Repo: Significant-Gravitas/AutoGPT — PR `#12284`
Backend/frontend OpenAPI codegen convention: In backend/api/features/store/model.py, the StoreSubmission and StoreSubmissionAdminView models define submitted_at: datetime | None, changes_summary: str | None, and instructions: str | None with no default. This is intentional to produce “required but nullable” fields in OpenAPI (properties appear in required[] and use anyOf [type, null]). This matches Prisma’s submittedAt DateTime? and changesSummary String?. Do not flag this as a required/nullable mismatch.
Learnt from: Abhi1992002
Repo: Significant-Gravitas/AutoGPT PR: 12417
File: autogpt_platform/backend/backend/blocks/agent_mail/threads.py:80-102
Timestamp: 2026-03-16T16:30:20.657Z
Learning: In autogpt_platform/backend/backend/blocks/agent_mail/ blocks (and across the codebase), wrapping synchronous AgentMail SDK calls with `await asyncio.to_thread()` is NOT required. The block executor runs node execution in dedicated threads via `asyncio.run_coroutine_threadsafe` (manager.py lines ~745-752, ~1079), and the existing codebase pattern does not use `asyncio.to_thread` for SDK calls inside async `run()` methods.
📚 Learning: 2026-03-17T10:57:12.953Z
Learnt from: majdyz
Repo: Significant-Gravitas/AutoGPT PR: 12440
File: autogpt_platform/backend/backend/copilot/workflow_import/converter.py:0-0
Timestamp: 2026-03-17T10:57:12.953Z
Learning: In Significant-Gravitas/AutoGPT PR `#12440`, `autogpt_platform/backend/backend/copilot/workflow_import/converter.py` was fully rewritten (commit 732960e2d) to no longer make direct LLM/OpenAI API calls. The converter now builds a structured text prompt for AutoPilot/CoPilot instead. There is no `response.choices` access or any direct LLM client usage in this file. Do not flag `response.choices` access or LLM client initialization patterns as issues in this file.

Applied to files:

  • autogpt_platform/backend/backend/copilot/sdk/tool_adapter.py
📚 Learning: 2026-03-04T12:19:43.066Z
Learnt from: majdyz
Repo: Significant-Gravitas/AutoGPT PR: 12279
File: autogpt_platform/backend/backend/copilot/tools/base.py:184-188
Timestamp: 2026-03-04T12:19:43.066Z
Learning: In the AutoGPT Copilot backend (autogpt_platform/backend/backend/copilot/tools/), anonymous users always have user_id=None when passed to tool execution methods. The "anon_" prefix (e.g., "anon_123") is only used for PostHog/analytics tracking distinct_id and is never used as an actual user_id passed to tools. A simple truthiness check (`and user_id`) is sufficient to distinguish anonymous from authenticated users.

Applied to files:

  • autogpt_platform/backend/backend/copilot/sdk/tool_adapter.py
📚 Learning: 2026-02-26T17:02:22.448Z
Learnt from: Pwuts
Repo: Significant-Gravitas/AutoGPT PR: 12211
File: .pre-commit-config.yaml:160-179
Timestamp: 2026-02-26T17:02:22.448Z
Learning: Keep the pre-commit hook pattern broad for autogpt_platform/backend to ensure OpenAPI schema changes are captured. Do not narrow to backend/api/ alone, since the generated schema depends on Pydantic models across multiple directories (backend/data/, backend/blocks/, backend/copilot/, backend/integrations/, backend/util/). Narrowing could miss schema changes and cause frontend type desynchronization.

Applied to files:

  • autogpt_platform/backend/backend/copilot/sdk/tool_adapter.py
📚 Learning: 2026-03-04T08:04:35.881Z
Learnt from: majdyz
Repo: Significant-Gravitas/AutoGPT PR: 12273
File: autogpt_platform/backend/backend/copilot/tools/workspace_files.py:216-220
Timestamp: 2026-03-04T08:04:35.881Z
Learning: In the AutoGPT Copilot backend, ensure that SVG images are not treated as vision image types by excluding 'image/svg+xml' from INLINEABLE_MIME_TYPES and MULTIMODAL_TYPES in tool_adapter.py; the Claude API supports PNG, JPEG, GIF, and WebP for vision. SVGs (XML text) should be handled via the text path instead, not the vision path.

Applied to files:

  • autogpt_platform/backend/backend/copilot/sdk/tool_adapter.py
📚 Learning: 2026-03-05T15:42:08.207Z
Learnt from: ntindle
Repo: Significant-Gravitas/AutoGPT PR: 12297
File: .claude/skills/backend-check/SKILL.md:14-16
Timestamp: 2026-03-05T15:42:08.207Z
Learning: In Python files under autogpt_platform/backend (recursively), rely on poetry run format to perform formatting (Black + isort) and linting (ruff). Do not run poetry run lint as a separate step after poetry run format, since format already includes linting checks.

Applied to files:

  • autogpt_platform/backend/backend/copilot/sdk/tool_adapter.py
📚 Learning: 2026-03-16T16:35:40.236Z
Learnt from: majdyz
Repo: Significant-Gravitas/AutoGPT PR: 12440
File: autogpt_platform/backend/backend/api/features/workflow_import.py:54-63
Timestamp: 2026-03-16T16:35:40.236Z
Learning: Avoid using the word 'competitor' in public-facing identifiers and text. Use neutral naming for API paths, model names, function names, and UI text. Examples: rename 'CompetitorFormat' to 'SourcePlatform', 'convert_competitor_workflow' to 'convert_workflow', '/competitor-workflow' to '/workflow'. Apply this guideline to files under autogpt_platform/backend and autogpt_platform/frontend.

Applied to files:

  • autogpt_platform/backend/backend/copilot/sdk/tool_adapter.py
📚 Learning: 2026-03-31T15:37:38.626Z
Learnt from: majdyz
Repo: Significant-Gravitas/AutoGPT PR: 12623
File: autogpt_platform/backend/backend/copilot/tools/agent_generator/fixer.py:37-47
Timestamp: 2026-03-31T15:37:38.626Z
Learning: When validating/constructing Anthropic API model IDs in Significant-Gravitas/AutoGPT, allow the hyphen-separated Claude Opus 4.6 model ID `claude-opus-4-6` (it corresponds to `LlmModel.CLAUDE_4_6_OPUS` in `autogpt_platform/backend/backend/blocks/llm.py`). Do NOT require the dot-separated form in Anthropic contexts. Only OpenRouter routing variants should use the dot separator (e.g., `anthropic/claude-opus-4.6`); `claude-opus-4-6` should be treated as correct when passed to Anthropic, and flagged only if it’s used in the OpenRouter path where the dot form is expected.

Applied to files:

  • autogpt_platform/backend/backend/copilot/sdk/tool_adapter.py
🔇 Additional comments (3)
autogpt_platform/backend/backend/copilot/sdk/tool_adapter.py (3)

17-17: LGTM — new import for ToolAnnotations.

The import from mcp.types is appropriate for the new readOnlyHint annotation functionality.


444-445: LGTM — readonly annotation constant.

Extracting ToolAnnotations(readOnlyHint=True) as a module-level constant (_READONLY_ANNOTATION) avoids repeated allocations and clearly documents intent.


545-551: LGTM — Read tool marked as read-only.

The Read tool is correctly annotated with readOnlyHint=True since reading files has no side effects.

Comment thread autogpt_platform/backend/backend/copilot/sdk/tool_adapter.py Outdated
Comment thread autogpt_platform/backend/backend/copilot/sdk/tool_adapter.py Outdated
Comment thread autogpt_platform/backend/backend/copilot/sdk/tool_adapter.py
@majdyz
Contributor Author

majdyz commented Mar 31, 2026

/review

@autogpt-pr-reviewer

Queued a review for PR #12632 at 8c45502.

…d _READ_ONLY_E2B_TOOLS

Address review: since all tools now get readOnlyHint=True unconditionally,
the BaseTool.read_only property and 15 tool class overrides were dead code.
Remove them along with _READ_ONLY_E2B_TOOLS. Fix docstring to match.
@majdyz
Contributor Author

majdyz commented Mar 31, 2026

🤖 🔵 Nit (Round 2): Comment in service.py line 1296 says "parallel execution of read-only tools" but all tools now have readOnlyHint. Should say "parallel execution of all tools".

@majdyz
Contributor Author

majdyz commented Mar 31, 2026

🤖 🔵 Nit (Round 3): _READONLY_ANNOTATION is misleading now that it's applied to ALL tools (including write/side-effect tools). Rename to _PARALLEL_ANNOTATION to reflect its actual purpose: enabling parallel dispatch.

…NNOTATION

Address review nits: rename misleading constant (all tools get it,
not just read-only), fix service.py comment to say "all tools".
@github-actions github-actions Bot added the conflicts label (automatically applied to PRs with merge conflicts) Mar 31, 2026
@github-actions
Contributor

This pull request has conflicts with the base branch, please resolve those so we can evaluate the pull request.


Keep readOnlyHint test, add SDK_DISALLOWED_TOOLS tests from dev.
@github-actions github-actions Bot removed the conflicts label (automatically applied to PRs with merge conflicts) Apr 1, 2026
@github-actions
Contributor

github-actions Bot commented Apr 1, 2026

Conflicts have been resolved! 🎉 A maintainer will review the pull request shortly.

@majdyz
Contributor Author

majdyz commented Apr 1, 2026

/review

@autogpt-pr-reviewer

Queued a review for PR #12632 at cc4cb29.

majdyz added 2 commits April 1, 2026 06:32
Reproduce the three production bugs that the pre-launch speculative
execution mechanism caused, verifying the current direct-execution
handler is free of them:

Bug 1 (SECRT-2204): Duplicate execution on arg mismatch
  - test_single_execution_even_with_different_arg_representations
  - test_concurrent_calls_each_execute_once

Bug 2: FIFO desync when security hook denies a tool
  - test_skipped_call_does_not_affect_subsequent_calls
  - test_handler_has_no_shared_queue_state

Bug 3: Cancel race condition (task completes before cancel)
  - test_no_speculative_execution_before_handler_called
  - test_failed_execution_does_not_leave_orphaned_tasks

Each bug has two tests: one that reproduces the old buggy pre-launch
behavior (xfail — proves the bug exists) and one that verifies the
current clean handler is free of it (pass).

Bug 1 (SECRT-2204) — duplicate execution on arg mismatch
Bug 2 — FIFO desync when tool denied by security hook
Bug 3 — cancel race (task completes before cancel arrives)
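
To make Bug 1 concrete, here is a minimal sketch of how a speculative pre-launch can execute a side-effecting call twice when the dispatched arguments differ from the pre-launched ones. Everything in it is a hypothetical reconstruction for illustration: `PreLaunchDispatcher`, `create_issue`, and the `"1"` versus `1` normalisation are assumptions, not the removed production code or the tests added in this commit.

```python
import asyncio
import json

calls = []  # records every execution of the side-effecting call


async def create_issue(args):
    """Hypothetical side effect, e.g. creating a ticket."""
    calls.append(args)
    return {"created": args["title"]}


class PreLaunchDispatcher:
    """Hypothetical speculative executor keyed by serialised args."""

    def __init__(self):
        self._pending = {}

    def pre_launch(self, args):
        key = json.dumps(args, sort_keys=True)
        self._pending[key] = asyncio.create_task(create_issue(args))

    async def dispatch(self, args):
        key = json.dumps(args, sort_keys=True)
        task = self._pending.pop(key, None)
        if task is None:
            # Arg mismatch: no matching pre-launched task, so run again.
            return await create_issue(args)
        return await task


async def buggy_flow():
    calls.clear()
    dispatcher = PreLaunchDispatcher()
    dispatcher.pre_launch({"title": "Bug", "priority": "1"})
    # Assumed normalisation between pre-launch and dispatch: "1" becomes 1.
    await dispatcher.dispatch({"title": "Bug", "priority": 1})
    await asyncio.sleep(0)  # let the orphaned pre-launched task run
    return len(calls)


async def direct_flow():
    calls.clear()
    await create_issue({"title": "Bug", "priority": 1})
    return len(calls)


buggy_count = asyncio.run(buggy_flow())
direct_count = asyncio.run(direct_flow())
```

In this sketch the speculative path runs the side effect twice, while direct execution runs it once, which is the property the passing tests above assert for the current handler.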
@autogpt-pr-reviewer

⚠️ Code review could not be completed

The review could not start because a setup step failed (e.g., dependency installation, repo clone). This is usually a repository configuration issue — check that your lock files are up to date and CI passes.

If this persists, please contact support with job ID b0b09272-85dc-4076-9b0d-c3ece66962eb.

Details: git clone failed (exit_code=1): Cloning into '/home/user/repo'... Updating files: 84% (4220/5021) Updating files: 85% (4268/5021) Updating files: 86% (4319/5021) Updating files: 87% (4369/5021) Updating files: 88% (4419/5021) Updating files: 89% (4469/5021) Updating files: 90% (4519/5021) Updating files: 91% (4570/5021) Updating files: 92% (4620/5021) Updating files: 93% (4670/5021) Updating files: 94% (4720/5021) Updating files: 95% (4770/5021) Updating files: 96% (4821/5021) Updat

@majdyz
Contributor Author

majdyz commented Apr 1, 2026

/review

@autogpt-pr-reviewer

Queued a review for PR #12632 at 0e2ae0d.

…e annotation test

Add docstring note explaining why side-effect tools use readOnlyHint=True
(deliberate override to avoid pre-launch duplicate-execution bug). Replace
trivial ToolAnnotations construction test with assertion against the actual
_PARALLEL_ANNOTATION constant.

@autogpt-pr-reviewer autogpt-pr-reviewer Bot left a comment

4/8 done (security, architect, performance, discussion). Testing in progress. Quality, product, UI-reviewer still queued. Waiting 3 more minutes.

Contributor Author

@majdyz majdyz left a comment

Test Report (Round 2) — Local Verification

Branch: fix/copilot-duplicate-block-execution | Worktree: AutoGPT3

Unit Tests

poetry run pytest backend/copilot/sdk/tool_adapter_test.py -x -q
30 passed, 3 xfailed in 20.56s

All 30 tests pass. The 3 xfail tests are regression tests that reproduce the three old bugs inline and confirm they fail as expected.

Code Review Checklist

Check Result
Pre-launch infrastructure fully removed from production code
readOnlyHint=True annotation applied to all tools (TOOL_REGISTRY, E2B, Read)
No dangling references to removed functions (pre_launch_tool_call, cancel_pending_tool_tasks, _tool_task_queues)
All 4 cancel_pending_tool_tasks() calls removed from service.py
service.py imports clean (no removed symbols)
Python import verification passes
tool_handler() executes exactly once per call (no speculative execution)
Regression tests prove all 3 bugs (arg mismatch, FIFO desync, cancel race)

Verdict

PASS — Clean removal of pre-launch mechanism (-578 lines), replaced with SDK-native parallel dispatch via readOnlyHint annotations (+105 lines). Comprehensive regression tests included.

@Abhi1992002 Abhi1992002 self-requested a review April 1, 2026 13:12
Contributor

@Abhi1992002 Abhi1992002 left a comment

LGTM

  • Code-wise, looks good
  • Tested locally as well - tools are running in parallel perfectly

@github-project-automation github-project-automation Bot moved this from 🆕 Needs initial review to 👍🏼 Mergeable in AutoGPT development kanban Apr 1, 2026
@majdyz majdyz added this pull request to the merge queue Apr 1, 2026
Merged via the queue into dev with commit 8aae775 Apr 1, 2026
24 checks passed
@majdyz majdyz deleted the fix/copilot-duplicate-block-execution branch April 1, 2026 13:56
@github-project-automation github-project-automation Bot moved this from 👍🏼 Mergeable to ✅ Done in AutoGPT development kanban Apr 1, 2026

Labels

platform/backend AutoGPT Platform - Back end size/l size/xl

Projects

Status: ✅ Done

2 participants