Skip to content

feat(sdk): add annotate_trace platform-op#4999

Draft
mmabrouk wants to merge 1 commit into
big-agentsfrom
feat/annotate-trace-op
Draft

feat(sdk): add annotate_trace platform-op#4999
mmabrouk wants to merge 1 commit into
big-agentsfrom
feat/annotate-trace-op

Conversation

@mmabrouk

@mmabrouk mmabrouk commented Jul 1, 2026

Copy link
Copy Markdown
Member

What

Adds an annotate_trace entry to PLATFORM_OPS so a running agent can record an annotation (evaluation feedback) on its own current run's trace — "grade myself". This closes the last gap for the self-reflecting-agent use case (builder-agent-reliability use case 3): the annotations REST API and the run-context trace plumbing already existed; only the catalog entry was missing.

How it mirrors commit_revision

annotate_trace is a mutating, self-targeting platform-op built exactly like commit_revision:

  • Wraps POST /api/annotations/ (args_into="annotation", request wrapper is {"annotation": AnnotationCreate}).
  • The run's own trace_id/span_id are bound from run context and hidden from the model, so the agent can only ever annotate its OWN trace:
    context_bindings={
        "annotation.links.invocation.trace_id": "$ctx.trace.trace_id",
        "annotation.links.invocation.span_id": "$ctx.trace.span_id",
    }
    
    (commit_revision binds $ctx.workflow.variant.id the same way.)
  • Model-visible schema is closed (additionalProperties: False) and exposes only references.evaluator.slug + data.outputs. links is never model-supplied.

Assembled body:

{
  "annotation": {
    "references": { "evaluator": { "slug": "self_reflection" } },
    "data": { "outputs": { "score": 0.9, "note": "" } },
    "links": { "invocation": { "trace_id": "<run's own>", "span_id": "<run's own>" } }
  }
}

Permission default

Leans allow / no-approval — annotating your own trace is additive self-metadata (unlike commit_revision, which mutates config and defaults to ask). Authors can override per-config, and ask is the documented alternative (a brand-new evaluator slug auto-creates an evaluator artifact).

Files

  • sdks/python/agenta/sdk/agents/platform/op_catalog.py — the op + its description/schema constants.
  • sdks/python/oss/tests/pytest/unit/agents/platform/test_op_catalog.py — added to the catalog set + two focused tests (self-target bind + hidden links; auto-allow default).
  • docs/design/agent-workflows/projects/builder-agent-reliability/annotate-op/README.md — design doc.

The build-kit overlay enumerates PLATFORM_OPS dynamically, so the op is offered to builder agents automatically.

Verification

  • pytest oss/tests/pytest/unit/agents/platform/ — 87 passed.
  • API test_build_kit_overlay.py — 4 passed.
  • ruff format + ruff check --fix clean.

Review asks

  • Confirm the $ctx.trace.* token names ($ctx.trace.trace_id / .span_id) — the load-bearing binding. Verified against RunContextTrace, RunContext.to_wire(), and resolveCtxToken in the TS runner.
  • Confirm the annotation body shape ({"annotation": {data.outputs, references.evaluator, links.invocation.{trace_id,span_id}}}) — verified against the router request model and the annotations acceptance test.
  • Sign off on the allow permission default vs ask.

Draft — do not merge.

https://claude.ai/code/session_01WSp2LqKrEtXnm2fsPWuQWa

@vercel

vercel Bot commented Jul 1, 2026

Copy link
Copy Markdown

The latest updates on your projects. Learn more about Vercel for GitHub.

Project Deployment Actions Updated (UTC)
agenta-documentation Ready Ready Preview, Comment Jul 3, 2026 9:44am

Request Review

@coderabbitai

coderabbitai Bot commented Jul 1, 2026

Copy link
Copy Markdown

Important

Review skipped

Draft detected.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro Plus

Run ID: 972816e8-f11e-4de9-89e2-1d4df4fcd7cc

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Use the checkbox below for a quick retry:

  • 🔍 Trigger review
✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch feat/annotate-trace-op

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands.

@mmabrouk

mmabrouk commented Jul 1, 2026

Copy link
Copy Markdown
Member Author

Adversarial review of annotate_trace

I verified the tokens, the body shape (live on bighetzner), and the tests, and I stress-tested the self-target guarantee. One real defect, ranked first.

Verdict: needs changes

The op works for a cooperative model, but its headline guarantee ("can never retarget another trace, the identical guarantee commit_revision gives") is not actually enforced. I proved a retarget live.

1. Self-targeting is NOT airtight (sibling links key) — proven live, MEDIUM

The binding owns only two leaves: annotation.links.invocation.trace_id and .span_id. But links is a Dict[str, Link] map, and the server turns EVERY key into an OTel link (build_otel_links, core/tracing/utils/traces.py), and annotations are retrieved by matching ANY link's trace_id (the field="links" filter). So a sibling link key retargets.

In the runner, assembleBody (services/agent/src/tools/direct.ts) deletes+sets only the two invocation leaves and passes the rest of the model's args through under annotation with no schema validation (relay.ts:211 hands req.args straight in; there is no ajv/additionalProperties enforcement anywhere server-side). So a model that emits:

"links": { "invocation": {...}, "x": { "trace_id": "<other-trace>", "span_id": "<other-span>" } }

keeps the x sibling. additionalProperties: False in the model-visible schema is the ONLY thing that would stop it, and that is a model-cooperation assumption, not an enforced boundary.

Proven live on bighetzner: I posted an annotation whose invocation link points at trace A and a smuggled sibling link at trace B, then queried annotations for trace B. It returned the smuggled annotation (count: 2, present: true). So an agent can attach arbitrary scores/labels to another run's trace in its own project.

This is the asymmetry vs commit_revision: that op binds a single scalar target the runner overwrites unconditionally, so a non-compliant model still can't retarget (the runner test at tool-direct.test.ts:241 proves exactly this). annotate_trace binds a container whose siblings the runner never clears, so the same test would not hold for it. Scope is same-project (project_id comes from auth, not the model), so this is an eval-data-integrity gap, not cross-tenant.

Fix (make it airtight, pick one):

  • (a) Make the binding own the whole annotation.links object, not two leaves. Needs run context to expose a pre-shaped {"invocation": {"trace_id", "span_id"}} object token, so assembleBody can delete annotation.links wholesale and set the bound object. That is a small wire/service addition (it gives up the "no new server code" property).
  • (b) Add a runner primitive that clears a subtree before filling bound leaves (e.g. an op field listing body paths the runner deepDeletes wholesale, then the two scalar bindings refill links.invocation.*). No wire change, generic, airtight. This is my preference.

Until one lands, at minimum flip the default to ask and soften the description/doc that assert an unconditional "you cannot annotate a different trace."

2. Tokens resolve — YES

$ctx.trace.trace_id / .span_id are correct. RunContextTrace -> RunContext.to_wire() emits trace: {trace_id, span_id} -> shipped as runContext (wire.py:141, app.py:253) -> resolveCtxToken walks runContext.trace.trace_id. The service fills it from the active OTel span (_run_context_trace). A missing value fails closed (throws -> tool error), so the op never annotates an empty/wrong trace. Good.

3. Body shape — accepted live, YES

POST /api/annotations/ with {"annotation": {references.evaluator.slug, data.outputs, links.invocation.{trace_id,span_id}}} returned 200, and the created annotation's links.invocation correctly records the target (retrievable by querying annotations linked to that trace; not detached). Pydantic accepts it (SimpleTraceCreate: data=Dict, references optional-evaluator Reference, links=Dict[str,Link]). A new slug auto-creates the evaluator (variant + revision minted, confirmed in the response).

4. Tests meaningful — YES, but the hole is untested

The two tests check the declared binding, hidden links, closed top-level schema, and the allow/no-approval default (not just "op exists"). But they assert additionalProperties is False as if that is the guarantee, and neither the SDK tests nor the runner tests cover the sibling-link case (the runner's override test only covers a same-path scalar collision). The actual gap in #1 is tested nowhere.

5. Other

  • Permission allow: defensible for the stable-slug flow, but auto-allow + auto-evaluator-creation lets the agent spawn arbitrary evaluator artifacts by varying the slug, and (given dashboard setup #1) annotate other traces without approval. Fixing dashboard setup #1 makes allow clearly fine; otherwise ask is safer.
  • Description overstates the guarantee to the model. Do not promise an invariant the runtime does not hold.
  • _strip_field is a no-op for this op: the binding paths carry an annotation. prefix (from args_into) that does not exist in the schema tree, so the doc's "stripped from the schema" mechanism never runs here. Harmless (closure is what hides links), but the doc credits a defense that is inert for this op.

@mmabrouk

mmabrouk commented Jul 1, 2026

Copy link
Copy Markdown
Member Author

Design-doc-only PR (the implementation code is being stripped; this is the annotate_trace design for your review).

Feedback I need before implementing:

  1. OK to use a reserved "agent self-reflection" evaluator with a permissive schema, bound server-side, rather than reusing the default quality-rating evaluator (its schema is a rigid {approved: boolean}) or letting the model name one (that sprays evaluators and 422s on the second write)?
  2. The three open questions in the doc: (a) the reserved evaluator's name + how it's materialized (seed-per-project with a backfill migration / ensure-exists at annotation time / a virtual code-defined reserved evaluator); (b) how permissive the output schema should be (fully open vs a light score+notes); (c) whether repeated calls on the same trace should upsert one reflection annotation instead of appending.

Approve the design (or answer the three) and I'll implement.


- Reserve a single evaluator. Proposed slug: `agent-self-reflection` (final naming is open,
see status.md).
- Give it a **permissive** outputs schema so freeform reflection always validates and the shape

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

First, check whether something like this is allowed to have something like a really object without any additional property. I think it's preferable to have a simple schema and then have some kind of modal or additional stuff where it's an object when you can add stuff. Basically, you have a string with reflection and then something like a score or judgment, which is binary, like good or bad, so that you can filter with success or not, successful or not. In addition to that, you would add something additional, kind of meta quirks or whatever, that could be an object. In this way, you'd expect the reflection to be more structured. Also, we can't show it in the UI, because otherwise it would become blending in both senses.

This also means that we need to add this kind of evaluator by default when creating a project or whatever, and probably also add a migration for it, the same way we do with our evaluator quality rating.


This is an implementation choice, resolved at build time, not by the op contract:

1. **Seed it per project**, exactly like `quality-rating`: add a preset on the

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good.

`agenta:custom:feedback:v0` catalog entry plus an entry in `_DEFAULT_EVALUATORS`. New
projects get it for free. Existing projects need a one-time backfill migration (the same
shape that backfilled `exact-match` / `contains-json`).
2. **Ensure-exists at annotation time**: when an annotation targets the reserved slug and no

@mmabrouk mmabrouk Jul 1, 2026

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, we should not do this, but we should add it maybe to the skill as a kind of resource for fallbacks, like troubleshooting. If you get an error that the evaluator does not exist, then create an evaluator using this, and here's the scheme of the evaluator you need to create, something like that.

code-defined reserved evaluator (mirroring the reserved `_agenta.*` platform-skills pattern) is
a later consolidation if reserved artifacts multiply.

## 2. The self-targeting guarantee

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand this section. Please rewrite it with more context. What do you mean by an invariant and the server-owned annotation links? Why are we talking about this? Is it how the request should be or what?

}
```

The model never sees `references` or `links`.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This makes sense.


## Permission

Default `allow` / no approval (`default_permission="allow"`,

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense.

approval on every self-reflection would defeat the autonomous use case. Authors can still
override per config (`permission` / `needs_approval` on `PlatformToolConfig`).

One residual risk: each call still writes an annotation trace, so a runaway loop could spam

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree, it's not a problem.

@mmabrouk mmabrouk left a comment

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Update according to the review. Please add inline comments for your updates and answer my comments, and then have a high-level comment saying how you updated this PR and if there are any open questions.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1


ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro Plus

Run ID: 3020a440-9a12-4503-8df9-65b344adaf15

📥 Commits

Reviewing files that changed from the base of the PR and between 51af4c3 and 9d67891.

📒 Files selected for processing (5)
  • docs/design/agent-workflows/projects/builder-agent-reliability/annotate-op/README.md
  • docs/design/agent-workflows/projects/builder-agent-reliability/annotate-op/context.md
  • docs/design/agent-workflows/projects/builder-agent-reliability/annotate-op/plan.md
  • docs/design/agent-workflows/projects/builder-agent-reliability/annotate-op/research.md
  • docs/design/agent-workflows/projects/builder-agent-reliability/annotate-op/status.md

Comment on lines +140 to +146
Default `allow` / no approval (`default_permission="allow"`,
`default_needs_approval=False`).

With a reserved, pre-existing evaluator there is no evaluator-creation side effect per call.
The op writes additive self-metadata onto the agent's own trace and nothing else. Requiring
approval on every self-reflection would defeat the autonomous use case. Authors can still
override per config (`permission` / `needs_approval` on `PlatformToolConfig`).

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔒 Security & Privacy | 🟠 Major | ⚡ Quick win

Don't default to allow until the enforcement path exists.

This choice depends on the runner-side subtree replacement / arg stripping work later in the plan. If the op becomes callable before that lands, the self-targeting gap reopens.

Suggested edit
- Default `allow` / no approval (`default_permission="allow"`,
- `default_needs_approval=False`).
+ Default `ask` until the runner-side enforcement lands (`default_permission="ask"`,
+ `default_needs_approval=True`).
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
Default `allow` / no approval (`default_permission="allow"`,
`default_needs_approval=False`).
With a reserved, pre-existing evaluator there is no evaluator-creation side effect per call.
The op writes additive self-metadata onto the agent's own trace and nothing else. Requiring
approval on every self-reflection would defeat the autonomous use case. Authors can still
override per config (`permission` / `needs_approval` on `PlatformToolConfig`).
Default `ask` until the runner-side enforcement lands (`default_permission="ask"`,
`default_needs_approval=True`).
With a reserved, pre-existing evaluator there is no evaluator-creation side effect per call.
The op writes additive self-metadata onto the agent's own trace and nothing else. Requiring
approval on every self-reflection would defeat the autonomous use case. Authors can still
override per config (`permission` / `needs_approval` on `PlatformToolConfig`).

@mmabrouk mmabrouk removed the needs-review Agent updated; awaiting Mahmoud's review label Jul 1, 2026
@mmabrouk mmabrouk force-pushed the feat/annotate-trace-op branch from 9d67891 to 4897638 Compare July 1, 2026 18:22
@mmabrouk

mmabrouk commented Jul 1, 2026

Copy link
Copy Markdown
Member Author

Second-round revision (design docs only, still draft)

Addressed the inline review comments. All five workspace files updated; no code.

1. Evaluator schema is now structured, not permissive. data.outputs is
{ "reflection": string, "score": <good|bad>, "meta": object }: a free-text write-up, a
binary judgment so runs can be filtered by success, and an open meta object for extras.
I verified this against the annotation UI. reflection (string) and score render as form
controls; a two-value enum renders as a select and a plain boolean is the more
battle-tested control (it is how quality-rating renders today). One finding worth your
call: an open object is filtered out of the annotation form (USEABLE_METRIC_TYPES
excludes objects), so meta is stored and enforced server-side but is not drawn as an
editable field. The design treats meta as a machine-extras overflow bucket, which is why
that is fine; if you want meta visible/editable we would give it fixed sub-properties or
store it as a JSON string. Schema and render check are in plan.md §1.

2. Seed by default plus a backfill migration, mirroring quality-rating. The reserved
agent-self-reflection evaluator is seeded on project creation via a preset and a
_DEFAULT_EVALUATORS entry, exactly like quality-rating. One correction from the research:
quality-rating has no backfill migration (it was seed-on-creation only), so the backfill
for existing projects mirrors the default-environments migration pattern instead (paginate
projects missing the evaluator, resolve the owner, idempotent create). Details and file:lines
in plan.md §1.

3. No auto-create at annotation time. Dropped the ensure-exists and virtual-evaluator
options. Instead, plan.md notes that the build-agent skill should carry a troubleshooting
fallback: if annotate_trace errors because the evaluator is missing, the skill tells the
agent how to create it and gives the schema. The skill edit is separate work.

4. Rewrote the self-target section in plain language. No more "invariant" jargon. It now
explains, from scratch, that an agent must only ever annotate its own trace, that the runner
fills the trace and span from the run's own context, that this is not enforced today (a
smuggled sibling links key retargets another trace, which we reproduced live), and that the
fix is a small runner primitive that clears the whole links subtree and then refills only
the two bound leaves from context. See plan.md §2.

5. One open question left: upsert vs append for repeated calls on the same trace. Lean is
append (each reflection is its own record); upsert stays on the table only as a way to bound a
runaway loop. In status.md.

Feedback wanted: (a) the structured schema in plan.md §1, especially score as enum [good, bad] vs a plain boolean, and whether meta as a stored-only (non-rendered) object is
acceptable; (b) the seed-plus-backfill approach and the default-environments migration as the
backfill template; (c) the plain self-target rewrite in §2; (d) the append vs upsert lean.

@mmabrouk

mmabrouk commented Jul 1, 2026

Copy link
Copy Markdown
Member Author

Rebased onto big-agents f8765a9b89 (current base, post services/agent -> services/runner rename). Content unchanged from the prior review round; diff is exactly this project's files. Ready for review.

@mmabrouk mmabrouk force-pushed the feat/annotate-trace-op branch from 872a1b4 to d3eef05 Compare July 3, 2026 09:43
mmabrouk added a commit that referenced this pull request Jul 4, 2026
Adds the annotate_trace op to the platform op catalog: self-targeting (trace/span
bound from run context, links hidden from the model), additive self-metadata so it
defaults to auto-allow. Tests pin the self-targeting binding.

Claude-Session: https://claude.ai/code/session_01DGj7GKafjkZeQXMsryWhb2
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

needs-review Agent updated; awaiting Mahmoud's review

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant