Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Original file line number Diff line number Diff line change
Expand Up @@ -55,7 +55,7 @@ This follows Pi's own split (see `PiAgentConfig`): the **persona** ("who the age
belongs in `append_system`, and **project conventions** belong in `AGENTS.md`. So the Agenta
persona is a forced `append_system`, while the Agenta base preamble plus the author's
instructions are the `AGENTS.md`. An author's own `system` / `append_system` (via
`AgentConfig.harness_options["pi_core"]`) still apply, layered after the forced persona.
`AgentConfig.harness_kwargs["pi_core"]`) still apply, layered after the forced persona.

## Selecting it

Expand Down
4 changes: 2 additions & 2 deletions docs/design/agent-workflows/documentation/adapters/pi.md
Original file line number Diff line number Diff line change
Expand Up @@ -71,13 +71,13 @@ layer, and `SYSTEM` / `APPEND_SYSTEM` only change Pi's base persona. For almost
### How to set them

`SYSTEM` and `APPEND_SYSTEM` are Pi-specific, so they ride the neutral config's per-harness
escape hatch, `AgentConfig.harness_options`. It is a bag keyed by harness name; each Harness
escape hatch, `AgentConfig.harness_kwargs`. It is a bag keyed by harness name; each Harness
adapter reads only its own slice:

```python
AgentConfig(
instructions="Project: a SQL analytics tool. Run `make lint` before finishing.", # AGENTS.md
harness_options={
harness_kwargs={
"pi_core": {
"system": "You are a SQL expert. Only answer with queries.", # replaces base prompt
"append_system": "Always explain each query in one line.", # adds to base prompt
Expand Down
55 changes: 31 additions & 24 deletions docs/design/agent-workflows/documentation/agent-configuration.md
Original file line number Diff line number Diff line change
Expand Up @@ -12,8 +12,9 @@ All file:line citations were verified against the code on 2026-06-23.
The playground renders a single composite `agent_config` control. The field list for that
control is not hardcoded in the frontend. It is fetched from the backend catalog type
`agent_config`, which the SDK defines once as `AgentConfigSchema`. The runtime then re-parses
the same payload into a permissive `AgentConfig` plus a `RunSelection`, resolves tools and
secrets server-side, and hands a final wire request to the Node runner.
the same payload into one permissive `AgentConfig` (the run-selection fields `harness`,
`sandbox`, and `permission_policy` live on it), resolves tools and secrets server-side, and
hands a final wire request to the Node runner.

## Three objects share the name "AgentConfig"

Expand All @@ -32,7 +33,7 @@ Playground form
→ AgentConfigControl (FE) reads schema.properties from the catalog type
→ GET /workflows/catalog/types/agent_config resolves x-ag-type-ref to the full schema
→ AgentConfigSchema (SDK) the strict schema, registered in CATALOG_TYPES
→ AgentConfig.from_params + RunSelection (SDK runtime) re-parse the saved payload
→ AgentConfig.from_params (SDK runtime) re-parse the saved payload (one config)
→ SessionConfig tools + secrets resolved server-side
→ AgentRunRequest (TS wire contract) the final shape the Node runner receives
```
Expand Down Expand Up @@ -137,7 +138,11 @@ class AgentConfig(BaseModel):
model: Optional[str] = None
tools: List[ToolConfig] = Field(default_factory=list)
mcp_servers: List[MCPServerConfig] = Field(default_factory=list)
harness_options: Dict[str, Dict[str, Any]] = Field(default_factory=dict)
harness_kwargs: Dict[str, Dict[str, Any]] = Field(default_factory=dict)
# the run-selection fields
harness: str = "pi_core"
sandbox: str = "local"
permission_policy: PermissionPolicy = "auto"
```

One correction to a common belief. This model is not `extra="allow"`. Its looseness comes
Expand All @@ -152,10 +157,12 @@ The genuinely loose object is the file-default dataclass at
`services/oss/src/agent/config.py:30`, which holds `tools: List[Any]`. That is the service's
built-in default, not user input.

Two fields the schema lists are not on this neutral config. `harness`, `sandbox`, and
`permission_policy` live on a separate `RunSelection` object
(`sdks/python/agenta/sdk/agents/dtos.py:364`). The SDK splits "what the agent is" from "where
and how it runs." The composite schema flattens both into one control for the playground.
The run-selection fields are on this neutral config too. `harness`, `sandbox`, and
`permission_policy` are plain fields on `AgentConfig` (in
`sdks/python/agenta/sdk/agents/dtos.py`). They used to live on a separate `RunSelection`
object; that object is retired, because there is one agent definition, not an agent plus a
sidecar selection. The composite schema and the neutral config now agree: both keep these
fields next to the rest of the agent.

Tool entries are strict even though the list is lenient. Each tool subclass is `extra="forbid"`
(`sdks/python/agenta/sdk/agents/tools/models.py`). `MCPServerConfig` is also `extra="forbid"`
Expand All @@ -168,19 +175,20 @@ The rich model picker is built only for the UI by `_model_catalog_type()`
## Layer 4: what the runtime actually reads

The Python `/invoke` handler is at `services/oss/src/agent/app.py`. It parses the request
into two objects (around line 72):
into one object:

```python
agent_config = AgentConfig.from_params(params, defaults=_default_agent_config())
selection = RunSelection.from_params(params)
```

It then resolves tools, MCP servers, and secrets server-side (`app.py`, lines 78 to 83),
bundles everything into a `SessionConfig` (`dtos.py:554`), picks a backend from the selection
(`select_backend`, `app.py:49`), and runs one turn through a harness.
That single parse covers everything, including the run-selection fields. The handler then
resolves tools, MCP servers, and secrets server-side, bundles everything into a
`SessionConfig`, picks a backend from `agent_config.sandbox` (`select_backend`,
`services/oss/src/agent/app.py`), and runs one turn through a harness chosen from
`agent_config.harness`.

`sandbox` is deliberately absent from `SessionConfig`. It is a backend concern. The handler
passes it to `SandboxAgentBackend(sandbox=...)` instead (`app.py:56`).
reads `agent_config.sandbox` and passes it to `SandboxAgentBackend(sandbox=...)` instead.

The final wire shape the Node runner receives is `AgentRunRequest` in
`services/agent/src/protocol.ts` (around line 185). That is the true wired surface:
Expand All @@ -199,9 +207,9 @@ Legend: (a) catalog/schema, (b) SDK neutral config, (c) runtime.
| skills | no | no | wired but forced only | Not author-settable. Only the Agenta harness injects forced skills. See below. |
| persona | no | no | wired but forced only | Not a config field. The Agenta harness hardcodes an append-system preamble. See below. |
| agents_md | yes, `agents_md: str` | yes, as `instructions` | wired to `agentsMd` | The schema names it `agents_md`. The neutral config names it `instructions`. |
| harness | yes, enum | no, on `RunSelection` | wired, picks the harness class | Enum-enforced. The runtime validates via `make_harness`. |
| sandbox | yes, enum | no, on `RunSelection` | wired to the backend, absent from `SessionConfig` | Backend concern, not agent identity. |
| permission_policy | yes, enum | no, on `RunSelection` | wired to `SessionConfig` | Only the Claude harness reads it. Pi ignores it, so it is decorative for `pi_core` and `pi_agenta`. |
| harness | yes, enum | yes, on `AgentConfig` | wired, picks the harness class | Enum-enforced. The runtime validates via `make_harness`. |
| sandbox | yes, enum | yes, on `AgentConfig` | wired to the backend, absent from `SessionConfig` | Backend concern, not agent identity. |
| permission_policy | yes, enum | yes, on `AgentConfig` | wired to `SessionConfig` | Only the Claude harness reads it. Pi ignores it, so it is decorative for `pi_core` and `pi_agenta`. |
Comment on lines +210 to +212

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

📐 Maintainability & Code Quality | 🟡 Minor | ⚡ Quick win

Reword the permission_policy row to match the new ownership.

This still says the policy is wired to SessionConfig, but the flattened contract keeps it on AgentConfig and the runtime consumes it through harness wiring. Please update the wording so ownership stays on the agent config. Per the PR objective, permissionPolicy still rides through wire_tools().


## Notable gaps and quirks

Expand All @@ -214,11 +222,10 @@ Claude harnesses get no forced skills or persona.
Per-harness divergence is real. `permission_policy` is wired only for Claude. Builtin tool
names are dropped for Claude with a warning, because builtins are Pi-only. Skills and persona
are Agenta-only. Pi's `system` and `append_system` overrides come through the
`harness_options` escape hatch on the neutral config, which is itself absent from the schema.
`harness_kwargs` escape hatch on the neutral config, which is itself absent from the schema.

The schema is the only place where harness, sandbox, and permission policy sit next to the
agent definition. The SDK keeps them apart. The composite schema re-flattens them so the
playground can show one control.
Harness, sandbox, and permission policy sit next to the agent definition in both the schema
and the neutral `AgentConfig`. The two agree on one control.

## A concrete example config

Expand All @@ -245,9 +252,9 @@ This is what the playground saves and the runtime reads:
}
```

With this config, the runtime reads `agents_md`, `model`, `tools`, and `mcp_servers` through
the neutral `AgentConfig`, reads `harness`, `sandbox`, and `permission_policy` through
`RunSelection`, resolves the tools and MCP servers server-side, and runs one turn on the Pi
With this config, the runtime reads `agents_md`, `model`, `tools`, `mcp_servers`, and the
run-selection fields `harness`, `sandbox`, and `permission_policy` through the one neutral
`AgentConfig`, resolves the tools and MCP servers server-side, and runs one turn on the Pi
harness in a local sandbox. The `permission_policy` value is ignored because the harness is
Pi, not Claude.

Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -21,7 +21,7 @@ wire shape JSON-friendly. File bytes can be base64 encoded when plain text is no
| --- | --- | --- |
| Generic agent identity | `AGENTS.md`, skills, tool references, template metadata | Intended long-term template surface. Partly represented today by `agents_md` and tool config. |
| Harness-specific config | Harness id, model, harness option bags, permission policy | Present today. Permissions are not generic yet. |
| Runtime infrastructure | Local versus Daytona, runner sidecar URL, filesystem isolation, secret channels | Present as a POC selection in `RunSelection`, but should not become durable agent identity by default. |
| Runtime infrastructure | Local versus Daytona, runner sidecar URL, filesystem isolation, secret channels | Present today as run-selection fields on `AgentConfig` (one agent config), but should not become durable agent identity by default. |

The current code still accepts `sandbox` in request config. That is useful for the POC and
tests, but the long-term template should not require users to encode where the platform is
Expand Down
9 changes: 5 additions & 4 deletions docs/design/agent-workflows/documentation/architecture.md
Original file line number Diff line number Diff line change
Expand Up @@ -9,8 +9,8 @@ Agenta already runs prompt workflows that call a model once and return one answe
workflow runs a coding harness instead. The harness reads instructions, calls a model, calls
tools, observes the results, and loops until it has an answer.

The runtime keeps two run choices configurable
(`sdks/python/agenta/sdk/agents/dtos.py:364`, `RunSelection`):
The runtime keeps two run choices configurable as fields on `AgentConfig`
(`sdks/python/agenta/sdk/agents/dtos.py`):

- **Harness:** which agent runs. Supported values are `pi_core`, `claude`, and experimental
`pi_agenta`. Default `pi_core`. `pi_core` and `pi_agenta` both drive the `pi` ACP agent;
Expand Down Expand Up @@ -107,8 +107,9 @@ sandbox-agent local and Daytona (`projects/qa/findings.md`, F-002).

Batch `/invoke` follows this path:

1. The workflow route calls `_agent` in `services/oss/src/agent/app.py:63`.
2. `_agent` parses `AgentConfig` and `RunSelection` from request parameters.
1. The workflow route calls `_agent` in `services/oss/src/agent/app.py`.
2. `_agent` parses one `AgentConfig` from request parameters; it carries the run-selection
fields `harness`, `sandbox`, and `permission_policy`.
3. The service resolves three things independently: tools, MCP servers, and provider-key
secrets. MCP resolution is gated by `AGENTA_AGENT_ENABLE_MCP`
(`services/oss/src/agent/tools/resolver.py:23`, off by default).
Expand Down
2 changes: 1 addition & 1 deletion docs/design/agent-workflows/documentation/ground-truth.md
Original file line number Diff line number Diff line change
Expand Up @@ -11,7 +11,7 @@ this page and the referenced code as the source of truth.
| Agent service handler | `services/oss/src/agent/app.py` | Parses agent config, resolves secrets and tools, builds `SandboxAgentBackend`, runs batch or streaming turns. |
| Agent route wiring | `sdks/python/agenta/sdk/decorators/routing.py` | Registers `/invoke`, `/inspect`, and agent-only `/messages`. |
| Browser protocol adapter | `sdks/python/agenta/sdk/agents/adapters/vercel/` | Converts Vercel `UIMessage` input and emits Vercel UI Message Stream parts. |
| SDK runtime DTOs | `sdks/python/agenta/sdk/agents/dtos.py` | Defines `AgentConfig`, `RunSelection`, `SessionConfig`, messages, events, capabilities, and harness configs. |
| SDK runtime DTOs | `sdks/python/agenta/sdk/agents/dtos.py` | Defines `AgentConfig` (incl. the run-selection fields), `SessionConfig`, messages, events, capabilities, and harness configs. |
| SDK runtime ports | `sdks/python/agenta/sdk/agents/interfaces.py` | Defines `Backend`, `Environment`, `Sandbox`, `Session`, and `Harness`. |
| Backend adapters | `sdks/python/agenta/sdk/agents/adapters/sandbox_agent.py`, `local.py` | Implement the sandbox-agent backend. `LocalBackend` is a stub. |
| Harness adapters | `sdks/python/agenta/sdk/agents/adapters/harnesses.py` | Maps neutral session config into Pi, Claude, and Agenta harness-specific config. |
Expand Down
19 changes: 10 additions & 9 deletions docs/design/agent-workflows/documentation/ports-and-adapters.md
Original file line number Diff line number Diff line change
Expand Up @@ -13,7 +13,7 @@ The SDK runtime lives under `sdks/python/agenta/sdk/agents/`.

| Layer | Files | Role |
| --- | --- | --- |
| DTOs | `dtos.py` | `AgentConfig`, `RunSelection`, `SessionConfig`, messages, events, capabilities, and harness-specific config models. |
| DTOs | `dtos.py` | `AgentConfig` (incl. the run-selection fields), `SessionConfig`, messages, events, capabilities, and harness-specific config models. |
| Ports | `interfaces.py` | `Backend`, `Environment`, `Sandbox`, `Session`, `Harness`. |
| Backend adapters | `adapters/sandbox_agent.py`, `adapters/local.py` | Engines that can run a harness. |
| Harness adapters | `adapters/harnesses.py` | Per-harness mapping from neutral session config to harness-specific config. |
Expand Down Expand Up @@ -54,7 +54,7 @@ turn.
Current harnesses:

- `PiHarness` keeps built-in tool names, resolved tool specs, Pi prompt overrides (`system`
and `append_system` from `harness_options.pi`), and Pi native tool delivery.
and `append_system` from the `pi_core` key of `harness_kwargs`), and Pi native tool delivery.
- `ClaudeHarness` drops Pi built-ins, carries MCP-delivered specs, and carries the
permission policy.
- `AgentaHarness` (harness value `pi_agenta`) is Pi with forced Agenta policy layered on top:
Expand Down Expand Up @@ -86,9 +86,10 @@ session representation and storage size.
## Config Ownership

`AgentConfig` describes the agent itself: instructions, model, tool references, MCP server
config, and per-harness option bags. It does not choose a backend.

`RunSelection` describes runtime choices: harness, sandbox, and permission policy.
config, and per-harness option bags. It also carries the run-selection fields `harness`,
`sandbox`, and `permission_policy`; there is one agent config, not a config plus a separate
`RunSelection` object. The handler reads `sandbox` to choose a backend, but the backend choice
itself is not stored on the config.

This is the current POC shape. The long-term split should be stricter:

Expand All @@ -98,9 +99,9 @@ This is the current POC shape. The long-term split should be stricter:
- Runtime infrastructure: local versus Daytona, runner sidecar URL, filesystem isolation,
and secret channels.

Sandbox is currently selectable through `RunSelection` so the POC can exercise local and
Daytona paths. It should not become durable agent template identity unless product
requirements explicitly need portable per-template runtime selection.
Sandbox is currently selectable through the `sandbox` field on `AgentConfig` so the POC can
exercise local and Daytona paths. It should not become durable agent template identity unless
product requirements explicitly need portable per-template runtime selection.

`SessionConfig` describes one run: the neutral agent config plus resolved secrets, resolved
tools, resolved MCP servers, trace context, and the session id.
Expand All @@ -109,7 +110,7 @@ tools, resolved MCP servers, trace context, and the session id.

`services/oss/src/agent/app.py` is a thin consumer of the SDK ports:

1. Parse `AgentConfig` and `RunSelection`.
1. Parse one `AgentConfig` (it carries the run-selection fields too).
2. Resolve provider secrets.
3. Resolve tools and, when enabled, MCP servers.
4. Build `SessionConfig`.
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -10,16 +10,16 @@ the order it runs in is itself a contract.

The handler (`_agent` in `app.py`) takes the workflow envelope's pieces:

- `parameters`: carries the agent config under `agent` and the run selection (`harness`,
`sandbox`, `permission_policy`) in the same object.
- `parameters`: carries the agent config under `agent`. The run-selection fields (`harness`,
`sandbox`, `permission_policy`) live on that same `agent` object.
- `messages` or `inputs.messages`: the turn history (it checks `messages` first).
- `stream`: batch versus streaming.
- `session_id`: the external conversation id.

## What it does, in order

1. Parse config and selection: `AgentConfig.from_params(params, defaults=...)` and
`RunSelection.from_params(params)`.
1. Parse the config: `AgentConfig.from_params(params, defaults=...)`. One parse covers
everything, including the run-selection fields (`harness`, `sandbox`, `permission_policy`).
2. Convert the request messages to neutral `Message[]`.
3. Resolve tools into builtin names, runnable specs, and a tool callback.
4. Resolve MCP servers.
Expand Down Expand Up @@ -56,8 +56,8 @@ the instrumented handler and merges the registered interface (the passed `schema

## Watch for when changing

- **Where config lives.** Agent config and run selection share `parameters`. Moving either
breaks the form and the playground request builder.
- **Where config lives.** The agent config, including its run-selection fields, rides
`parameters.agent`. Moving it breaks the form and the playground request builder.
- **Default application.** The handler merges request params over a default agent config.
Changing the merge changes what an empty form runs.
- **Resolution order.** Provider and mode gate before resolution; deployment gates after.
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -18,7 +18,7 @@ Each adapter implements `_to_harness_config(...)` and emits a different `/run` w
tool use.
- **`ClaudeHarness`** delivers tools over MCP, not natively, and has no Pi built-ins (it warns
if any are set). It carries `permission_policy` and renders `.claude/settings.json` from
`harness_options` and the sandbox permission, shipped as `harnessFiles`. It carries inline
`harness_kwargs` and the sandbox permission, shipped as `harnessFiles`. It carries inline
skill packages on the wire like the others; the runner materializes them under
`.claude/skills` in the session cwd, matching Claude's project-local skill layout.
- **`AgentaHarness`** runs on the same Pi engine but forces Agenta's opinion: it composes the
Expand All @@ -31,7 +31,7 @@ The wire shapes, side by side:
|---|---|---|---|
| built-in tools | yes | no | forced set |
| custom tools | native | over MCP | native |
| prompt overrides | `system`/`append_system` | none (reads `harness_options`) | forced `append_system` + author `system` |
| prompt overrides | `system`/`append_system` | none (reads `harness_kwargs`) | forced `append_system` + author `system` |
| permission policy | dropped | carried | dropped |
| inline skills | yes (agent-dir scope) | yes (materialized to `.claude/skills`) | yes (agent-dir scope) |
| harness files | none | `.claude/settings.json` | none |
Expand All @@ -52,5 +52,5 @@ The wire shapes, side by side:
materializes them under `.claude/skills`. (An earlier revision suppressed Claude's
`wire_skills()` to `{}`; that override is gone, and `test_claude_carries_skills_for_project_local_materialization`
now pins the carry-on-wire behavior.)
- **Harness options.** The `harness_options` bag is keyed by harness; each adapter reads only
- **Harness options.** The `harness_kwargs` bag is keyed by harness; each adapter reads only
its own slice.
Loading
Loading