feat(frontend): migrate evaluator run invocation by mmabrouk · Pull Request #3577 · Agenta-AI/agenta

mmabrouk · 2026-01-28T12:12:21Z

Summary

add workflow invoke helper for SimpleEvaluator execution
switch evaluator playground DebugSection to call /preview/workflows/invoke
update migration plan/status docs for PR2 run migration

Testing

pnpm lint-fix (emits a warning in web/oss/src/components/Editor/plugins/code/nodes/Base64Node.tsx)

vercel · 2026-01-28T12:12:28Z

Project	Deployment	Actions	Updated (UTC)
agenta-documentation	Ready	Preview, Comment	Feb 13, 2026 8:04pm

mmabrouk · 2026-01-28T12:27:50Z

Decisions & Tradeoffs

Use /preview/workflows/invoke from the evaluator playground instead of the legacy /evaluators/{key}/run. This matches the new backend path and keeps run logic aligned with SimpleEvaluator.
Fallback URI when the evaluator is not saved: if no config exists yet, a fallback URI is built instead. This keeps Run Evaluator working for unsaved drafts.
Send parameters in both places. The backend accepts both; this keeps compatibility with how workflow service requests are handled today.
Minimal response typing: only surface the two fields the UI reads, since it currently only needs evaluator outputs.
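
The decisions above can be pictured with a small sketch. It is written in Python only to match the backend snippet quoted in this PR; the real helper lives in the TypeScript frontend. Only the /preview/workflows/invoke path comes from the PR text. The fallback URI format, the `config` and `payload` field names, and the response keys are all hypothetical placeholders, since the actual names are not visible here.

```python
from typing import Any, Dict, Optional

# Endpoint named in the PR description.
INVOKE_PATH = "/preview/workflows/invoke"


def resolve_evaluator_uri(saved_uri: Optional[str], evaluator_key: str) -> str:
    """Use the saved config's URI when present, otherwise build a fallback."""
    if saved_uri:
        return saved_uri
    # Hypothetical fallback format; the real builder is not shown in this PR text.
    return f"example:evaluator:{evaluator_key}"


def build_invoke_request(
    evaluator_key: str,
    parameters: Dict[str, Any],
    inputs: Dict[str, Any],
    saved_uri: Optional[str] = None,
) -> Dict[str, Any]:
    """Assemble a request body that carries `parameters` in two places."""
    return {
        # "config" and "payload" are placeholder field names, not the real schema.
        "config": {
            "uri": resolve_evaluator_uri(saved_uri, evaluator_key),
            "parameters": dict(parameters),
        },
        "payload": {
            "parameters": dict(parameters),
            "inputs": dict(inputs),
        },
    }


def extract_outputs(response: Dict[str, Any]) -> Any:
    """Minimal response handling: surface only the evaluator outputs."""
    # "data" and "outputs" are assumed keys, used here for illustration only.
    return (response.get("data") or {}).get("outputs")
```

Duplicating `parameters` costs a few bytes per request but means neither reader of the request has to change during the migration; dropping one copy later is a one-line change in a helper like this.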

Alternatives considered

Keep the legacy run endpoint: lower change risk, but it blocks full migration and would keep the old adapter path around.
Require a saved evaluator config before run: strict, and reduces ambiguity around the URI, but would break the current UX.
Only pass parameters in one place: cleaner request, but might miss expectations in existing workflow data handlers.
Expose trace_id/span_id and surface in UI: more debugging power but requires UI changes.

If you prefer a different tradeoff (e.g., require saved configs or send parameters in only one place), I can adjust this PR.

devin-ai-integration

Devin Review found 1 potential issue.

View 6 additional findings in Devin Review.

...ponents/pages/evaluations/autoEvaluation/EvaluatorsModal/ConfigureEvaluator/DebugSection.tsx

web/oss/src/services/workflows/invoke.ts

Expose output schemas from evaluator templates and send them on config create/edit, including dynamic derivation for auto_ai_critique and json_multi_field_match. Also remove legacy /evaluators/map usage and relax config listing filters so older non-human configs remain visible.

mmabrouk · 2026-02-13T19:59:48Z

api/oss/src/core/evaluators/service.py


        evaluator_revision_slug = uuid4().hex[-12:]

+        hydrated_simple_evaluator_data = self._ensure_builtin_evaluator_data(


@jp-agenta

We kept backend hydration on purpose, even after moving schema ownership to the frontend.
This PR makes the frontend send data.schemas.outputs when it can determine the schema at configure time. For fixed evaluators, the frontend reads outputs_schema from GET /evaluators. For auto_ai_critique, it derives the schema from parameters.json_schema.schema. For json_multi_field_match, it derives the schema from parameters.fields. If an evaluator has no known schema in the template, the frontend does not send one.
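
As a rough illustration of the rules in the paragraph above, the following Python sketch resolves an outputs schema per evaluator type. The function names are hypothetical, and the exact shape derived from `parameters.fields` is an assumption (one unconstrained property per field); the PR text does not state the real derivation, and the real logic runs in the frontend.

```python
from typing import Any, Dict, Optional

Schema = Dict[str, Any]


def resolve_outputs_schema(
    evaluator_key: str,
    template: Dict[str, Any],
    parameters: Dict[str, Any],
) -> Optional[Schema]:
    """Return the outputs schema known at configure time, or None."""
    if evaluator_key == "auto_ai_critique":
        # Derived from parameters.json_schema.schema.
        return (parameters.get("json_schema") or {}).get("schema")
    if evaluator_key == "json_multi_field_match":
        fields = parameters.get("fields")
        if not fields:
            return None
        # Assumed derivation: an object with one unconstrained property per field.
        return {"type": "object", "properties": {name: {} for name in fields}}
    # Fixed evaluators: read outputs_schema from the template (GET /evaluators).
    return template.get("outputs_schema")


def with_outputs_schema(data: Dict[str, Any], schema: Optional[Schema]) -> Dict[str, Any]:
    """Attach the schema at data.schemas.outputs; send nothing when unknown."""
    result = dict(data)
    if schema is not None:
        result["schemas"] = {**(data.get("schemas") or {}), "outputs": schema}
    return result
```

Returning `None` rather than an empty schema keeps "no known schema" distinguishable from "schema that accepts anything", which matters if another layer decides whether to fill the gap.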
We still keep _ensure_builtin_evaluator_data in SimpleEvaluatorsService as a fallback. This protects non-UI callers and older payloads that still send only uri and parameters. It also keeps behavior compatible with the legacy flow, where builtin schemas were derived on the backend. In short, frontend schema sending is now the primary path, and backend hydration is a safety net.
This also means we did not remove existing behavior for old evaluators. We added a safer default while we complete the migration and verify all clients send schemas consistently.
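
The safety-net behavior described in this comment can be sketched as follows. This is not the body of `_ensure_builtin_evaluator_data`; it is a standalone illustration under the assumption that builtin schemas can be looked up by `uri`. The function name and the lookup table are hypothetical.

```python
from typing import Any, Dict


def hydrate_outputs_schema(
    data: Dict[str, Any],
    builtin_outputs_by_uri: Dict[str, Dict[str, Any]],
) -> Dict[str, Any]:
    """Fill data.schemas.outputs only when the caller did not send one."""
    schemas = data.get("schemas") or {}
    if schemas.get("outputs") is not None:
        # Primary path: a schema sent by the client is kept untouched.
        return dict(data)
    # Fallback path: older payloads carry only uri and parameters.
    fallback = builtin_outputs_by_uri.get(data.get("uri", ""))
    if fallback is None:
        return dict(data)
    hydrated = dict(data)
    hydrated["schemas"] = {**schemas, "outputs": fallback}
    return hydrated
```

Because the client-sent schema always wins, adding this fallback does not change behavior for callers that already send `data.schemas.outputs`.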

feat(frontend): invoke evaluators via workflows

9b9435a

vercel bot deployed to Preview January 28, 2026 12:13

devin-ai-integration bot reviewed Feb 13, 2026

...ponents/pages/evaluations/autoEvaluation/EvaluatorsModal/ConfigureEvaluator/DebugSection.tsx

devin-ai-integration bot reviewed Feb 13, 2026

web/oss/src/services/workflows/invoke.ts

fix(frontend): use explicit evaluator URI for invoke

8ecf91c

vercel bot deployed to Preview February 13, 2026 14:40

fix(frontend): harden evaluator invoke interface

23e6d62

vercel bot deployed to Preview February 13, 2026 15:05

Merge branch 'feat/migrate-evaluator-playground-frontend' into feat/migrate-evaluator-playground-run

dfcd091

vercel bot deployed to Preview February 13, 2026 15:46

fix(frontend): exclude container ag metrics from overview

00c2aa2

vercel bot deployed to Preview February 13, 2026 16:43

Revert "fix(frontend): exclude container ag metrics from overview"

bef12ec

This reverts commit 00c2aa2.

vercel bot deployed to Preview February 13, 2026 16:55

fix(api): hydrate builtin evaluator schemas on simple CRUD

fa91fcf

vercel bot deployed to Preview February 13, 2026 17:22

mmabrouk marked this pull request as ready for review February 13, 2026 19:55

dosubot bot added the size:L label (this PR changes 100-499 lines, ignoring generated files) Feb 13, 2026

dosubot bot added the size:XL label (this PR changes 500-999 lines, ignoring generated files) and removed the size:L label Feb 13, 2026

vercel bot deployed to Preview February 13, 2026 19:58

dosubot bot added the feature label Feb 13, 2026

chore(api): apply ruff formatting cleanup

0c6ae55

vercel bot deployed to Preview February 13, 2026 20:04

feat(frontend): migrate evaluator run invocation #3577
mmabrouk wants to merge 9 commits into feat/migrate-evaluator-playground-frontend from feat/migrate-evaluator-playground-run
