Merged
2 changes: 1 addition & 1 deletion docs/design/agent-workflows/README.md
@@ -25,7 +25,7 @@ from in-flight project notes and historical archaeology.
what is still missing.
2. [Architecture](documentation/architecture.md): the service, agent runner sidecar,
harnesses, and sandboxes.
-3. [Protocol](documentation/protocol.md): `/invoke`, `/messages`, `/load-session`, and the
+3. [Protocol](documentation/protocol.md): `/invoke`, `/messages`, and the
runner `/run` wire contract.
4. [Ports and Adapters](documentation/ports-and-adapters.md): the SDK runtime ports,
backend adapters, harness adapters, and browser protocol adapter.
11 changes: 5 additions & 6 deletions docs/design/agent-workflows/documentation/architecture.md
@@ -18,8 +18,7 @@ The runtime keeps two run choices configurable
`local`.

The platform exposes the agent through normal workflow routing. `/invoke` is the batch
-contract. Agent routes also register `/messages` and `/load-session` for the browser chat
-protocol.
+contract. Agent routes also register `/messages` for the browser chat protocol.

## Runtime Shape

@@ -141,8 +140,8 @@ Agent `/messages` follows the same runtime path after a browser-protocol adapter
5. The Vercel adapter converts live `AgentEvent` objects into Vercel UI Message Stream parts
and the routing layer frames them as SSE.

-`/load-session` is registered for agent routes, but no durable store is wired. It returns an
-empty message list. See [Sessions](sessions.md).
+The runtime is cold: the client sends the full conversation on each turn and the server does
+not persist history. See [Sessions](sessions.md).

## Lifecycle

@@ -215,8 +214,8 @@ rolls run usage back onto it.
## Gaps

- `LocalBackend` is a public adapter shape but does not run anything yet.
-- No durable session store is wired. `/load-session` returns empty history and completed turns
-are not persisted. See [Sessions](sessions.md).
+- No durable session store is wired. The runtime is cold and completed turns are not
+persisted. See [Sessions](sessions.md).
- `AgentaHarness` uses placeholder preamble, persona, and skill content.
- The agent is registered as a custom workflow handler, not as a first-class builtin URI such
as `agenta:builtin:agent:v0`. The builtin interface exists in the SDK, but the handler is
18 changes: 9 additions & 9 deletions docs/design/agent-workflows/documentation/ground-truth.md
@@ -9,10 +9,10 @@ this page and the referenced code as the source of truth.
| Area | Files | Active-stack role |
| --- | --- | --- |
| Agent service handler | `services/oss/src/agent/app.py` | Parses agent config, resolves secrets and tools, builds `SandboxAgentBackend`, runs batch or streaming turns. |
-| Agent route wiring | `sdks/python/agenta/sdk/decorators/routing.py` | Registers `/invoke`, `/inspect`, and agent-only `/messages` plus `/load-session`. |
+| Agent route wiring | `sdks/python/agenta/sdk/decorators/routing.py` | Registers `/invoke`, `/inspect`, and agent-only `/messages`. |
| Browser protocol adapter | `sdks/python/agenta/sdk/agents/adapters/vercel/` | Converts Vercel `UIMessage` input and emits Vercel UI Message Stream parts. |
| SDK runtime DTOs | `sdks/python/agenta/sdk/agents/dtos.py` | Defines `AgentConfig`, `RunSelection`, `SessionConfig`, messages, events, capabilities, and harness configs. |
-| SDK runtime ports | `sdks/python/agenta/sdk/agents/interfaces.py` | Defines `Backend`, `Environment`, `Sandbox`, `Session`, `Harness`, `SessionStore`, and `NoopSessionStore`. |
+| SDK runtime ports | `sdks/python/agenta/sdk/agents/interfaces.py` | Defines `Backend`, `Environment`, `Sandbox`, `Session`, and `Harness`. |
| Backend adapters | `sdks/python/agenta/sdk/agents/adapters/sandbox_agent.py`, `local.py` | Implement the sandbox-agent backend. `LocalBackend` is a stub. |
| Harness adapters | `sdks/python/agenta/sdk/agents/adapters/harnesses.py` | Maps neutral session config into Pi, Claude, and Agenta harness-specific config. |
| Runner wire | `sdks/python/agenta/sdk/agents/utils/wire.py`, `services/agent/src/protocol.ts` | Keeps the Python and TypeScript `/run` payloads in sync. |
@@ -27,7 +27,7 @@ this page and the referenced code as the source of truth.
- The service exposes an agent workflow handler through `ag.create_app`, `ag.workflow`, and
`ag.route`.
- `/invoke` runs one cold turn and returns the final assistant message.
-- Agent routes register `/messages` and `/load-session` when `flags={"is_agent": True}`.
+- Agent routes register `/messages` when `flags={"is_agent": True}`.
- `/messages` validates or mints `session_id`, folds Vercel `UIMessage` input into neutral
runtime messages, and supports JSON or Vercel SSE based on `Accept`.
- Streaming runs over a runner NDJSON stream internally. The browser edge projects those
@@ -55,9 +55,8 @@ this page and the referenced code as the source of truth.
## Not Implemented

- `LocalBackend` does not run Pi or Claude. It raises `NotImplementedError`.
-- `SessionStore` has no production adapter. The default `NoopSessionStore` returns empty
-history and discards writes.
-- Completed `/messages` turns are not persisted to a session store by default.
+- There is no durable session store. The runtime is cold and completed `/messages` turns are
+not persisted; there is no history-load endpoint.
- Harness session snapshots, such as sandbox-agent/ACP state save/load around cleanup/setup, are
not represented by a production port yet.
- Warm daemon sessions, ACP `session/load`, and session fork are not wired.
@@ -78,8 +77,9 @@ this page and the referenced code as the source of truth.

- [SDK Local Tools](../projects/sdk-local-tools/) is a planned and partly implemented
workspace for standalone SDK tool resolution. It remains blocked on `LocalBackend`.
-- Durable server-owned sessions need a real `SessionStore`, a write path from completed
-turns, ownership checks, and a decision on platform versus local storage.
+- Durable server-owned sessions need a session store with a port and adapter, a write path
+from completed turns, a history-load endpoint, ownership checks, and a decision on platform
+versus local storage.
- Stateful session resume needs research into sandbox-agent/ACP session representation and a
future save/load snapshot interface separate from chat history.
- Trigger integration needs a provider port, a Compose.io adapter, Agenta-owned trigger
@@ -89,7 +89,7 @@ this page and the referenced code as the source of truth.

## Verification Pointers

-- `/messages` and `/load-session` routing tests live in
+- `/messages` routing tests live in

📐 Maintainability & Code Quality | 🟡 Minor | ⚡ Quick win

Include the route-registration test here too.

sdks/python/oss/tests/pytest/utils/test_messages_endpoint.py covers the endpoint flow, but sdks/python/oss/tests/pytest/utils/test_routing.py also asserts /messages in the OpenAPI schema. Listing both keeps the verification path complete. As per the provided test snippets, test_routing.py still guards /messages registration.

Suggested edit
-- `/messages` routing tests live in
-  `sdks/python/oss/tests/pytest/utils/test_messages_endpoint.py`.
+- `/messages` routing tests live in
+  `sdks/python/oss/tests/pytest/utils/test_messages_endpoint.py`
+  and `sdks/python/oss/tests/pytest/utils/test_routing.py`.
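
For readers who have not opened `test_routing.py`, the kind of registration check described above can be sketched roughly as follows. This is an illustration only: the helper names and the sample schema are hypothetical, and the real test builds its schema from the app rather than from a hand-written dict. It assumes route paths appear as keys under the OpenAPI `paths` object, possibly behind a prefix.

```python
from typing import Any, Dict, Set


def registered_paths(openapi_schema: Dict[str, Any]) -> Set[str]:
    """Return the path strings declared in an OpenAPI document."""
    return set(openapi_schema.get("paths", {}))


def check_agent_routes(openapi_schema: Dict[str, Any]) -> None:
    """Assert that /messages is registered and /load-session is not.

    Hypothetical helper: suffix matching is used so the check still works
    if the app mounts its routes behind a prefix.
    """
    paths = registered_paths(openapi_schema)
    assert any(p.endswith("/messages") for p in paths), "/messages is not registered"
    assert not any(p.endswith("/load-session") for p in paths), "/load-session is still registered"


# Hand-written stand-in for the schema an agent app might expose after this change.
sample_schema = {"paths": {"/invoke": {}, "/inspect": {}, "/messages": {}}}
check_agent_routes(sample_schema)
```

A check of this shape guards registration only; the request and response flow is what `test_messages_endpoint.py` covers.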

`sdks/python/oss/tests/pytest/utils/test_messages_endpoint.py`.
- Agent service handler tests live in `services/oss/tests/pytest/unit/agent/`.
- Wire-contract tests live in `sdks/python/oss/tests/pytest/unit/agents/test_wire_contract.py`.
24 changes: 11 additions & 13 deletions docs/design/agent-workflows/documentation/ports-and-adapters.md
@@ -14,7 +14,7 @@ The SDK runtime lives under `sdks/python/agenta/sdk/agents/`.
| Layer | Files | Role |
| --- | --- | --- |
| DTOs | `dtos.py` | `AgentConfig`, `RunSelection`, `SessionConfig`, messages, events, capabilities, and harness-specific config models. |
-| Ports | `interfaces.py` | `Backend`, `Environment`, `Sandbox`, `Session`, `Harness`, `SessionStore`. |
+| Ports | `interfaces.py` | `Backend`, `Environment`, `Sandbox`, `Session`, `Harness`. |
| Backend adapters | `adapters/sandbox_agent.py`, `adapters/local.py` | Engines that can run a harness. |
| Harness adapters | `adapters/harnesses.py` | Per-harness mapping from neutral session config to harness-specific config. |
| Browser adapter | `adapters/vercel/` | Vercel `UIMessage` input and Vercel UI Message Stream output. |
@@ -72,17 +72,16 @@ wrapper around one `/run` call. It exposes both:
`AgentRun` yields live `AgentEvent` objects and exposes the terminal `AgentResult` after
the stream drains.

-### SessionStore
+### Session persistence

-`SessionStore` is the durable-history port. It has `load` and `save_turn`. The only default
-adapter is `NoopSessionStore`, which returns no messages and discards writes.
+There is no durable-history port. The runtime is cold: the client sends the full
+conversation on every turn. Server-owned session history is not implemented, so completed
+turns are not persisted and there is no load path.

-This is intentional scaffolding. Server-owned session history is not implemented yet.
-
-A separate future port is still needed for harness session snapshots. Durable message
-history can reload a transcript, but it cannot necessarily restore sandbox-agent/ACP session state,
-tool state, or setup artifacts. That future port should be designed after we inspect the
-actual session representation and storage size.
+A future port for harness session snapshots is still open. Durable message history could
+reload a transcript, but it cannot necessarily restore sandbox-agent/ACP session state,
+tool state, or setup artifacts. That port should be designed after we inspect the actual
+session representation and storage size.

## Config Ownership

@@ -138,7 +137,6 @@ It owns:
- `session_id` validation and minting.
- `/messages` stream negotiation.
- Vercel stream-part encoding.
-- `/load-session` over `SessionStore`.

This keeps Vercel-specific names out of the runtime ports.

@@ -156,8 +154,8 @@ result fields should update both sides and the wire tests in the same PR.
## Known Weak Points

- `LocalBackend` appears in public exports but is not usable yet.
-- `SessionStore` has no production adapter and the current runtime does not call
-`save_turn` after completed `/messages` turns.
+- Session history is not persisted: the runtime is cold and completed `/messages` turns are
+not stored.
- `AgentaHarness` policy content is placeholder product copy.
- MCP server resolution is disabled unless `AGENTA_AGENT_ENABLE_MCP` is truthy.
- The code still has historical WP labels in some comments. Those labels should not guide new
29 changes: 3 additions & 26 deletions docs/design/agent-workflows/documentation/protocol.md
@@ -6,7 +6,6 @@ The agent workflow has two public HTTP surfaces and one internal runner surface.
| --- | --- | --- | --- |
| `POST /invoke` | Implemented | Generic workflow clients | Batch workflow call. Returns one final response. |
| `POST /messages` | Implemented | Browser chat clients | Agent chat call. Accepts Vercel `UIMessage` input and can stream Vercel SSE. |
-| `POST /load-session` | Shell implemented | Browser chat clients | Loads saved session history. Returns empty history by default because storage is not wired. |
| `POST /run` | Implemented internal wire | Python SDK backend adapters | Runs one agent turn through the TypeScript runner sidecar or CLI. |

## `/invoke`
@@ -52,9 +51,9 @@ Important details:

- `session_id` is optional. The server mints one when it is absent.
- Client-supplied ids must match `^[A-Za-z0-9._:-]{1,128}$`.
-- The intended storage behavior is create-or-resume: a known id resumes, and a valid unknown
-id creates a new session with that id. This is not observable yet because durable storage
-is not implemented.
+- The runtime is cold: the client sends the full conversation in `data.messages` on every
+turn. The server does not persist session history, so the id only tags the turn (for
+tracing) and is echoed back.
- `data.messages` is a Vercel `UIMessage[]`. The adapter folds it into neutral runtime
`Message` objects before invoking the workflow.
- `data.stream` is not a stored config value. The route sets it from the `Accept` header.
@@ -91,28 +90,6 @@ The runtime emits neutral `AgentEvent` objects. The Vercel adapter maps them to
The first `start` part carries `messageMetadata.sessionId`. The SSE stream ends with
`data: [DONE]`.

-## `/load-session`
-
-`/load-session` accepts:
-
-```json
-{ "session_id": "sess_abc" }
-```
-
-It returns:
-
-```json
-{ "session_id": "sess_abc", "messages": [] }
-```
-
-The route is real, but the default store is `NoopSessionStore`. Until a production
-`SessionStore` is injected and completed turns call `save_turn`, the endpoint only confirms
-the contract.
-
-Clients that already know a session id should call this endpoint before the first chat turn
-if they need history on screen. The normal chat path should not require a separate explicit
-create-session call.
-
## `/run`

`/run` is the internal Python-to-TypeScript boundary. The Python side serializes it in
37 changes: 16 additions & 21 deletions docs/design/agent-workflows/documentation/sessions.md
@@ -58,18 +58,12 @@ them as SSE.
So the browser can see text, reasoning, tool calls, tool results, data parts, files, errors,
and finish metadata as they happen. This is live delivery, not a warm or persisted session.

-### `/load-session`
+### No history load path

-The route exists and calls a `SessionStore` port. The default store is `NoopSessionStore`
-(`sdks/python/agenta/sdk/agents/interfaces.py:112`), and the route registration passes no
-other store (`sdks/python/agenta/sdk/decorators/routing.py:515`). So it always returns an
-empty list:
-
-```json
-{ "session_id": "sess_abc", "messages": [] }
-```
-
-That makes the protocol testable. It does not restore history.
+There is no endpoint that returns a session's stored history, because the server does not
+store it. A client that needs prior turns on screen must keep its own history and resend it
+on each turn. A history-load contract is part of the durable-store work below, not something
+shipped today.

## Intended (not implemented)

@@ -89,17 +83,17 @@ There should not be a required `create-session` endpoint for the normal chat pat
implicit creation should cover pre-message operations too. For example, a file upload before
the first typed message can create a session and return the id later chat turns use.

-A client that already knows a session id and needs to render history should call
-`/load-session` before the first message.
+A client that already knows a session id and needs to render history would, once a store
+exists, fetch that history before the first message.

### A real session store

To make sessions real, the platform needs:

-- A production `SessionStore` implementation, injected where `NoopSessionStore` is today.
-- A call to `save_turn` after each completed `/messages` turn.
+- A durable session store, plus a port and adapter to reach it.
+- A write path that persists each completed `/messages` turn.
- Ownership checks keyed by project and caller.
-- A load path that returns persisted Vercel `UIMessage` history.
+- A load path (and a load endpoint) that returns persisted Vercel `UIMessage` history.
- A policy for failed, cancelled, and partially streamed turns.

Until that lands, clients must keep sending full history.
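
What "keep sending full history" means for a client can be sketched as below. This is an assumption-heavy illustration, not SDK code: `ColdChatClient` and its methods are hypothetical, the body layout (top-level `session_id`, messages under `data.messages`) follows the protocol notes but is not a verified wire format, and the message dicts only approximate the Vercel `UIMessage` shape.

```python
import uuid
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional


@dataclass
class ColdChatClient:
    """Client-side transcript holder for a server that stores no history."""

    session_id: Optional[str] = None
    history: List[Dict[str, Any]] = field(default_factory=list)

    @staticmethod
    def _message(role: str, text: str) -> Dict[str, Any]:
        # Approximation of a UIMessage: an id, a role, and a list of text parts.
        return {"id": uuid.uuid4().hex, "role": role, "parts": [{"type": "text", "text": text}]}

    def build_request(self, user_text: str) -> Dict[str, Any]:
        """Append the new user turn and return a body carrying the whole transcript."""
        self.history.append(self._message("user", user_text))
        body: Dict[str, Any] = {"data": {"messages": list(self.history)}}
        if self.session_id is not None:
            body["session_id"] = self.session_id
        return body

    def record_reply(self, assistant_text: str, session_id: str) -> None:
        """Store the assistant turn and the id the server echoed or minted."""
        self.history.append(self._message("assistant", assistant_text))
        self.session_id = session_id
```

Each call to `build_request` grows the payload by one turn, which is the cost of a cold runtime; a durable store would let the client send only the new message.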
@@ -115,11 +109,12 @@ Examples of state that may not be recoverable from messages alone:
- Tool or harness state created during setup.
- Filesystem or process metadata needed to resume a warm session after a cold restart.

-This interface is not designed yet. The `SessionStore` port covers message history only; a
-snapshot port would be a separate addition. It likely needs explicit `save_session` and
-`load_session` semantics around cleanup and setup, plus a storage decision after we measure the
-size and shape of sandbox-agent/ACP session data. Small JSON blobs may fit in Postgres. Large
-opaque blobs may need object storage. Retention should be short by default, measured in days.
+This interface is not designed yet. A durable message-history store would cover transcripts
+only; a snapshot port would be a separate addition. It likely needs explicit `save_session`
+and `load_session` semantics around cleanup and setup, plus a storage decision after we
+measure the size and shape of sandbox-agent/ACP session data. Small JSON blobs may fit in
+Postgres. Large opaque blobs may need object storage. Retention should be short by default,
+measured in days.

### Warm sessions

5 changes: 2 additions & 3 deletions docs/design/agent-workflows/interfaces/README.md
@@ -53,7 +53,7 @@ page. `Status` is read from each page's prose: **stable** (wired and unlikely to
| [Service to vault and tool providers](cross-service/service-to-vault-and-tool-providers.md) | cross-service (external) | `agent/app.py`, `platform/{resolve,connections}.py`, `agents/capabilities.py`, `tools/router.py` | stable | `unit/agents/connections/`, `unit/agents/platform/`, `unit/agents/tools/` |
| [Agent service handler](in-service/agent-service-handler.md) | in-service | `services/oss/src/agent/app.py` | stable | `services/oss/tests/pytest/unit/agent/` |
| [Neutral runtime DTOs](in-service/neutral-runtime-dtos.md) | in-service | `agents/dtos.py` | stable | `unit/agents/test_dtos_*.py` |
-| [Runtime ports](in-service/runtime-ports.md) | in-service | `agents/interfaces.py` | evolving (`SessionStore` noop, `LocalBackend` stub) | `unit/agents/test_environment_lifecycle.py`, `test_harness_adapters.py` |
+| [Runtime ports](in-service/runtime-ports.md) | in-service | `agents/interfaces.py` | evolving (`LocalBackend` stub) | `unit/agents/test_environment_lifecycle.py`, `test_harness_adapters.py` |
| [Backend adapter](in-service/backend-adapter.md) | in-service | `agents/adapters/sandbox_agent.py` | stable | `unit/agents/test_runner_adapter_config.py`, `test_environment_lifecycle.py` |
| [Harness adapters](in-service/harness-adapters.md) | in-service | `agents/adapters/harnesses.py`, `agents/dtos.py` | stable | `unit/agents/test_harness_adapters.py`, `test_dtos_harness_configs.py` |
| [Browser protocol adapter](in-service/browser-protocol-adapter.md) | in-service | `agents/adapters/vercel/{routing,messages,stream,sse}.py` | stable | `unit/agents/test_ui_messages.py`, `utils/test_messages_endpoint.py` |
@@ -66,8 +66,7 @@ page. `Status` is read from each page's prose: **stable** (wired and unlikely to

Paths are relative to the owner package (`sdks/python/agenta/sdk/`, `services/agent/src/`,
`services/oss/src/`, `api/oss/src/`); test paths are relative to each package's pytest root
-unless prefixed. The `/load-session` shell endpoint is intentionally omitted: it is being
-removed in a sibling change, so it is not listed here.
+unless prefixed.

## Source of truth

Original file line number Diff line number Diff line change
@@ -4,8 +4,7 @@ The Vercel adapter is the translation layer between the browser's protocol and t
runtime. It exists so Vercel names stay out of the runtime DTOs: the adapter converts
`UIMessage[]` to neutral `Message[]` on the way in, and neutral `AgentEvent`s to Vercel UI
Message Stream parts on the way out. It owns the public [`/messages`](../public-edge/agent-messages.md)
-and [`/load-session`](../public-edge/agent-load-session.md) contracts, so a change here is
-usually a change a browser will notice.
+contract, so a change here is usually a change a browser will notice.

The adapter's place in the layering is narrated in
[Ports and adapters](../../documentation/ports-and-adapters.md#browser-protocol-adapter), and
Original file line number Diff line number Diff line change
@@ -38,9 +38,9 @@ All in `interfaces.py`. Each has a narrow job:
- **`Harness`**: maps a neutral `SessionConfig` to a harness-specific config and runs turns.
The one abstract method is `_to_harness_config(config) -> HarnessAgentConfig`; the base
supplies `prompt`, `stream`, and `create_session`.
-- **`SessionStore`**: durable message history. `load(session_id)` and
-`save_turn(session_id, *, messages, result)`. Only `NoopSessionStore` is wired today, so
-history is not persisted yet (see [Agent load session](../public-edge/agent-load-session.md)).
+
+There is no durable-history port: the runtime is cold and the client resends full history
+each turn (see [Sessions](../../documentation/sessions.md)).

## Owned by

@@ -54,7 +54,8 @@ All in `interfaces.py`. Each has a narrow job:
(full history every turn) hang off these ports.
- **Session teardown.** `destroy()` defaults to a no-op; an implementation that holds
resources has to override it.
-- **Durable history.** Wiring a real `SessionStore` changes the `/load-session` contract.
+- **Durable history.** Persisting session history would mean adding a new store port and a
+history-load endpoint; neither exists today.
- **Harness responsibilities.** `_to_harness_config` is the single point a new harness has to
implement.

2 changes: 0 additions & 2 deletions docs/design/agent-workflows/interfaces/public-edge/README.md
@@ -11,8 +11,6 @@ a reason not to.
## Interfaces

- [Agent messages](agent-messages.md): the streaming browser chat contract.
-- [Agent load session](agent-load-session.md): the history-resume contract, not yet wired
-to durable storage.
- [Workflow invoke](workflow-invoke.md): the generic batch invocation envelope.
- [Workflow inspect](workflow-inspect.md): the schema the playground reads to build the
config form.