Fail-closed high-risk tool execution when confirmation policy is missing #4627
davidahmann
started this conversation in
Ideas
Replies: 0 comments
Problem observed
High-risk tools can still execute when confirmation policy configuration is missing or permissive, especially when tool registration is dynamic (function tools and MCP tool wrappers). The practical effect is that a configuration omission can silently broaden execution authority. For operators, that means unsafe actions may run in contexts where explicit approval should have been a hard requirement.
Why it matters operationally
Tool safety boundaries are a core contract in multi-agent systems because tool calls are where external side effects occur. If high-risk tools run without explicit confirmation policy, incident response and audit trails lose reliability. This is a repeated friction point in production rollout reviews: teams need deterministic, fail-closed behavior so missing policy is treated as an error, not a permissive default.
Minimal repro
Fix approach
The change adds is_high_risk signaling to FunctionTool and MCP tools, propagates it through MCP toolset creation, and enforces a fail-closed guard before tool execution. If a high-risk tool resolves to a non-confirmed policy, execution returns a deterministic error and does not proceed. The patch intentionally keeps scope narrow: no broad lifecycle changes, only explicit gating at tool execution contracts.
Validation evidence
uv run pyink --check --diff ... passed for all changed files.
require_confirmation paths passed.
Open follow-up question for maintainers
Should we standardize a dedicated high-risk error type/code for downstream programmatic handling in agent orchestration logs?
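To make the question concrete, here is a minimal sketch of what a fail-closed guard with a dedicated error type could look like. It is not the patch itself: the Tool dataclass, execute_tool, HighRiskConfirmationError, and the HIGH_RISK_CONFIRMATION_REQUIRED code are hypothetical stand-ins, and only the is_high_risk and require_confirmation names come from the description above. The sketch raises an exception, whereas the actual change may return an error value instead.

```python
from dataclasses import dataclass
from typing import Any, Callable, Optional


class HighRiskConfirmationError(Exception):
    """Hypothetical dedicated error for a blocked high-risk tool call."""

    # Stable code that downstream log processors could match on (assumed name).
    code = "HIGH_RISK_CONFIRMATION_REQUIRED"

    def __init__(self, tool_name: str) -> None:
        super().__init__(
            f"{self.code}: tool '{tool_name}' is high-risk and has no "
            "confirmation policy"
        )
        self.tool_name = tool_name


@dataclass
class Tool:
    """Stand-in for a registered tool; not the real FunctionTool class."""

    name: str
    func: Callable[..., Any]
    is_high_risk: bool = False
    # None models a missing policy; False models a permissive one.
    require_confirmation: Optional[bool] = None


def execute_tool(tool: Tool, **kwargs: Any) -> Any:
    """Run a tool, failing closed when a high-risk tool lacks confirmation."""
    # Anything other than an explicit True blocks a high-risk tool, so a
    # missing policy is treated as an error rather than a permissive default.
    if tool.is_high_risk and tool.require_confirmation is not True:
        raise HighRiskConfirmationError(tool.name)
    # The confirmation prompt itself is out of scope for this sketch; a
    # confirmed policy is assumed to be handled before this point.
    return tool.func(**kwargs)


def delete_records(table: str) -> str:
    """Example side-effecting tool body (illustrative only)."""
    return f"deleted {table}"
```

With this shape, an orchestrator can catch HighRiskConfirmationError and log err.code and err.tool_name instead of parsing a message string, and a tool registered with is_high_risk=True but no policy never reaches its function body.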
This contribution was informed by patterns from Wrkr. Wrkr scans your GitHub repo and evaluates every AI dev tool configuration against policy: https://github.com/Clyra-AI/wrkr