Multi-agent team in .NET 8. A supervisor orchestrates Research, Draft, and Review agents to produce a business deliverable, and a mandatory human gate makes it structurally impossible to ship anything without approval.
A supervisor coordinates a team of specialized agents (Research, Draft, Review) to turn a single business request into a work product, here a legal pleading, and a person has to approve it before anything is emitted. The orchestration is explicit and deterministic in code, every model call sits behind a port, and the whole system builds, tests, and runs with no API key and no Docker.
This is a proof-of-concept. The build and the full test suite run offline against deterministic fakes. Real providers (Azure OpenAI / OpenAI, a RAG-backed precedent source, a real approval queue) drop in by replacing DI registrations, and the agents don't change.
- Why the human gate matters
- Two approaches to approval
- Highlights
- Architecture
- Tech stack
- Getting started
- Project structure
- What is real vs. faked
- Testing
- Architecture decisions
- Documentation
- Scope and limitations
- Related project
- References
When agents produce a business deliverable rather than developer tooling, full autonomy is the wrong target. A drafted legal pleading has to be reviewed by a person before it leaves the system, so that review is built into the structure here instead of left to convention. The workflow only reaches Emitted after an IHumanGate returns approval, and the dependency-injection setup registers no default gate on purpose: a host that forgets to wire one fails fast at resolve time instead of quietly auto-approving. That is how the workflow handles the human-supervision and source-traceability concerns CNJ Resolution 615 raises for AI in the Brazilian judiciary.
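The fail-fast behavior can be sketched in a few lines. This is a minimal illustration, not the repo's code: it assumes the Microsoft.Extensions.DependencyInjection package, and the `IHumanGate` method shape, `StubGate`, and `Orchestrator` are hypothetical stand-ins for the real types.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;
using Microsoft.Extensions.DependencyInjection;

var services = new ServiceCollection();
services.AddSingleton<Orchestrator>();            // note: no IHumanGate registered

using var provider = services.BuildServiceProvider();
try
{
    provider.GetRequiredService<Orchestrator>();  // constructor needs an IHumanGate
    Console.WriteLine("resolved (unexpected)");
}
catch (InvalidOperationException ex)
{
    Console.WriteLine($"fail-fast: {ex.Message}");
}

// Wiring a gate explicitly makes the same resolve succeed.
services.AddSingleton<IHumanGate>(new StubGate(approve: true));
using var wired = services.BuildServiceProvider();
var orchestrator = wired.GetRequiredService<Orchestrator>();
Console.WriteLine(await orchestrator.EmitAsync("Minuta [Fonte: STJ-Tema-566]"));

// Hypothetical port shape; the repo's real IHumanGate may differ.
public interface IHumanGate
{
    Task<bool> ApproveAsync(string draft, CancellationToken ct = default);
}

// Stand-in for a real approval step (a person, a queue, a ticket).
public sealed class StubGate(bool approve) : IHumanGate
{
    public Task<bool> ApproveAsync(string draft, CancellationToken ct = default)
        => Task.FromResult(approve);
}

// Hypothetical consumer: it cannot be constructed without a gate.
public sealed class Orchestrator(IHumanGate gate)
{
    public async Task<string> EmitAsync(string draft, CancellationToken ct = default)
        => await gate.ApproveAsync(draft, ct) ? "Emitted" : "RejectedByHuman";
}
```

Because the gate is a constructor dependency with no default registration, the container itself refuses to build the orchestrator; there is no code path that reaches emission by omission.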
The structural gate above is one way to guarantee approval, and it is not the only one. The Microsoft Agent Framework offers human-in-the-loop through ApprovalRequiredAIFunction: opt-in, per tool, decided at runtime. This repo implements that path too, faithfully against the GA API, in an isolated AgenticWorkflow.Maf project, and compares the two honestly:
- Structural gate (this core): composition-time, fail-fast. Forget the gate and the system fails to resolve. Coarse, but no emission escapes review.
- MAF ApprovalRequiredAIFunction: runtime, opt-in per tool. Granular and official, but a sensitive tool left unwrapped auto-invokes.
The two work as complementary layers rather than rivals. The full comparison (the trade-off table, the regulated-domain framing, and a live demo of both) is in Docs/comparison-gates.md. The MAF path needs a real model (Microsoft.Agents.AI 1.10.0, GA), so its demo and [SkippableFact] test are key-gated and isolated to the .Maf project, which leaves the core keyless and untouched.
- Nothing reaches Emitted without an IHumanGate approval. The DI setup registers no default gate on purpose, so a host that forgets to wire one fails fast at resolve time instead of auto-approving. The fail-safe is the absence of a default, not a flag someone has to remember to set.
- A supervisor drives the Research, Draft, and Review agents through an explicit sequence: research first, then a draft/review loop capped at MaxDraftAttempts (default 2), then the human gate.
- Citations flow through the whole run: research notes carry [Fonte: id] citations, the draft uses only the research and carries them forward, and the Supervisor enforces as an invariant that any draft without a citation never reaches the human gate, regardless of which review agent is wired. Citation presence is an orchestrator guarantee; whether the citation is the right one stays a review concern.
- Each agent is a thin role over Microsoft.Extensions.AI's IChatClient, the abstraction Semantic Kernel and the Microsoft Agent Framework both build on, so the same shape maps onto their orchestration without coupling to a preview API.
- The sequence runs as deterministic code rather than a model-driven loop, so every run is reproducible and the decision path is recorded in a step trace.
- Deterministic in-process fakes sit behind every port (chat model, precedent source, gate), so the build and the full test suite run with no key and no Docker. Real providers drop in by replacing DI registrations.
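The citation-presence invariant from the list above could be checked with a small helper like the one below. It is a sketch under assumptions: the `Citations` class and the exact marker grammar (anything non-empty between `[Fonte:` and `]`) are hypothetical, and the repo may parse markers differently.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

string cited = "A prescrição intercorrente corre após a suspensão [Fonte: STJ-Tema-566].";
string uncited = "A prescrição intercorrente corre após a suspensão.";

Console.WriteLine(string.Join(", ", Citations.Extract(cited)));  // STJ-Tema-566
Console.WriteLine(Citations.HasAny(uncited));                    // False

public static class Citations
{
    // Matches markers such as "[Fonte: STJ-Tema-566]" and captures the id.
    // An empty id ("[Fonte: ]") does not count as a citation.
    private static readonly Regex Marker =
        new(@"\[Fonte:\s*([^\]\s][^\]]*)\]", RegexOptions.Compiled);

    // Distinct citation ids in order of first appearance.
    public static string[] Extract(string text) =>
        Marker.Matches(text).Select(m => m.Groups[1].Value.Trim()).Distinct().ToArray();

    // Presence check only: says nothing about whether the id is the right one.
    public static bool HasAny(string text) => Marker.IsMatch(text);
}
```

A check like this is cheap and deterministic, which is why it can live in the orchestrator rather than in a model prompt.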
flowchart TB
IN(["task"]) --> SUP["Supervisor"]
SUP --> RES["Research agent"]
RES --> DRA["Draft agent"]
DRA --> REV["Review agent"]
REV -- "revise (up to MaxDraftAttempts)" --> DRA
REV --> GATE{"Human gate"}
GATE -- approved --> EM(["Emitted"])
GATE -- rejected --> RJ(["RejectedByHuman"])
REV -- "rejected after cap" --> RR(["RejectedByReview"])
The orchestration is a deterministic, auditable sequence in code: research, then a draft/review loop capped at MaxDraftAttempts, then the human gate. It isn't a fully model-driven loop, because a legal workflow has to be reproducible and reviewable. Each agent is a thin role over an IChatClient, so the same shape maps onto Semantic Kernel and the Microsoft Agent Framework without coupling to a preview API. Every model call, the precedent source, and the human gate sit behind ports with deterministic in-process fakes, which is what lets the build and the full test suite run offline.
Outcomes: Emitted (human approved), RejectedByHuman (human declined), and RejectedByReview (review never approved within the attempt cap, so the human gate is never reached).
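The sequence and its three outcomes can be sketched as plain code. This is an illustrative reduction, not the repo's Supervisor: the agents are passed as delegates, and the `Supervisor.RunAsync` signature, the trace labels, and the `WorkflowResult` record shape are assumptions made for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

var result = await Supervisor.RunAsync(
    task: "Elaborar manifestação sobre prescrição intercorrente",
    research: t => Task.FromResult("Nota de pesquisa [Fonte: STJ-Tema-566]"),
    draft: (t, notes, attempt) => Task.FromResult($"Minuta v{attempt}: {notes}"),
    review: (d, attempt) => Task.FromResult(attempt >= 2),  // reject the first draft, approve the second
    humanGate: d => Task.FromResult(true));

Console.WriteLine(result.Status);                           // Emitted
Console.WriteLine(string.Join(" -> ", result.Trace));

public enum WorkflowStatus { Emitted, RejectedByHuman, RejectedByReview }

public sealed record WorkflowResult(WorkflowStatus Status, string? Draft, IReadOnlyList<string> Trace);

public static class Supervisor
{
    public static async Task<WorkflowResult> RunAsync(
        string task,
        Func<string, Task<string>> research,
        Func<string, string, int, Task<string>> draft,
        Func<string, int, Task<bool>> review,
        Func<string, Task<bool>> humanGate,
        int maxDraftAttempts = 2)
    {
        var trace = new List<string> { "research" };
        string notes = await research(task);

        for (int attempt = 1; attempt <= maxDraftAttempts; attempt++)
        {
            trace.Add($"draft#{attempt}");
            string text = await draft(task, notes, attempt);

            // Invariant: an uncited draft counts as rejected, whatever the reviewer says.
            bool isCited = text.Contains("[Fonte:", StringComparison.Ordinal);
            bool approved = isCited && await review(text, attempt);
            trace.Add($"review#{attempt}:{(approved ? "approved" : "revise")}");
            if (!approved) continue;

            // Only a review-approved, cited draft ever reaches the human gate.
            trace.Add("human-gate");
            bool humanApproved = await humanGate(text);
            var status = humanApproved ? WorkflowStatus.Emitted : WorkflowStatus.RejectedByHuman;
            return new WorkflowResult(status, text, trace);
        }

        // The attempt cap was reached without an approved draft; the gate is never called.
        return new WorkflowResult(WorkflowStatus.RejectedByReview, null, trace);
    }
}
```

Because the control flow is an ordinary loop, the trace is the same on every run with the same inputs, which is what makes golden-trace tests possible.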
| Concern | Choice |
|---|---|
| Language / runtime | .NET 8 (C# 12) |
| Agent abstraction | Microsoft.Extensions.AI (IChatClient) |
| Composition | Microsoft.Extensions.DependencyInjection + Microsoft.Extensions.Logging |
| Chat model | canned deterministic IChatClient (keyless); real Azure OpenAI / OpenAI by configuration |
| Precedent source | in-memory catalog; swap for a RAG-backed source behind the port |
| Testing | xUnit + FluentAssertions |
Built with the .NET 10 SDK, targeting net8.0.
Prerequisite: .NET 8 SDK (or the .NET 10 SDK, since the project targets net8.0). No key, no Docker.
# Run the demo: both approval approaches side by side (structural gate keyless; MAF key-gated)
dotnet run --project src/AgenticWorkflow.Demo
# Tests (orchestration order, redraft-on-rejection, max-attempts cap, human gate)
dotnet test
The demo runs the same task through both approval approaches (see Two approaches to approval):
Path 1 — structural gate (DI fail-fast), keyless
forgot the gate -> throws at resolve (InvalidOperationException) — cannot auto-approve
with gate (happy path) -> Emitted (research -> draft -> review: approved -> human gate: approved)
with gate (redraft) -> Emitted (two draft attempts before approval)
with gate (rejected) -> RejectedByHuman (review approved; human gate declined — nothing emitted)
Path 2 — MAF ApprovalRequiredAIFunction (opt-in per tool), runtime
(no key — Path 1 needs none; add appsettings.secrets.local.json to light up Path 2)
With a key, Path 2 shows the wrapped sensitive tool pausing for approval and the unwrapped one auto-invoking. Nothing reaches Emitted in Path 1 without the gate, and citations flow from research into the draft (for example [Fonte: STJ-Tema-566]).
var services = new ServiceCollection();
services.AddAgenticWorkflow(); // canned IChatClient + in-memory precedents (keyless)
// NOTE: registers no IHumanGate on purpose; register your own gate here before resolving
using var sp = services.BuildServiceProvider();
var orchestrator = sp.GetRequiredService<IWorkflowOrchestrator>();
var result = await orchestrator.RunAsync("Elaborar manifestação sobre prescrição intercorrente");
// result.Status (Emitted / RejectedByHuman / RejectedByReview), result.Draft, result.Trace
For demos and tests, AddAgenticWorkflowDemo() is an opt-in that also wires an auto-approving gate.
Moving to real providers changes registrations only. The agents and the supervisor stay the same:
- Real model: register a real IChatClient (Azure OpenAI / OpenAI) in place of the canned client.
- Real research: register an IPrecedentSource backed by retrieval (RAG) instead of the in-memory catalog, for example the lexrag-dotnet companion.
- Real approval: register an IHumanGate that actually asks a person or posts to an approval queue, instead of the auto-approve gate.
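As one possible shape for a gate that actually asks a person, here is a minimal prompt-based sketch. The `IHumanGate` method signature and the `PromptHumanGate` class are hypothetical; the repo's port may look different, and a production gate would more likely post to an approval queue than read from a console.

```csharp
using System;
using System.IO;
using System.Threading;
using System.Threading.Tasks;

// Simulate a reviewer typing "y"; a real host would pass Console.In and Console.Out.
var gate = new PromptHumanGate(new StringReader("y\n"), Console.Out);
bool approved = await gate.ApproveAsync("Minuta [Fonte: STJ-Tema-566]");
Console.WriteLine(approved ? "Emitted" : "RejectedByHuman");

// Hypothetical port shape, declared here so the sketch is self-contained.
public interface IHumanGate
{
    Task<bool> ApproveAsync(string draft, CancellationToken ct = default);
}

public sealed class PromptHumanGate(TextReader input, TextWriter output) : IHumanGate
{
    public async Task<bool> ApproveAsync(string draft, CancellationToken ct = default)
    {
        ct.ThrowIfCancellationRequested();
        await output.WriteLineAsync("--- Draft awaiting approval ---");
        await output.WriteLineAsync(draft);
        await output.WriteAsync("Approve? [y/N] ");
        string? answer = await input.ReadLineAsync();

        // Fail closed: anything other than an explicit "y" (including end of input) rejects.
        return string.Equals(answer?.Trim(), "y", StringComparison.OrdinalIgnoreCase);
    }
}
```

Taking the reader and writer as constructor parameters keeps the gate testable without a terminal, and defaulting to rejection keeps an unattended run from approving anything.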
src/
AgenticWorkflow/
Abstractions.cs ports: IAgent, IPrecedentSource, IHumanGate, IWorkflowOrchestrator
Models.cs WorkflowState, WorkflowResult, ReviewVerdict, WorkflowStatus
Agents/Agents.cs ResearchAgent / DraftAgent / ReviewAgent over IChatClient
Orchestration/Supervisor.cs explicit research -> (draft <-> review) -> human gate
Precedents/InMemoryPrecedentSource.cs in-memory precedent catalog (swap for RAG)
Precedents/JsonFilePrecedentSource.cs file-backed adapter; drop-in replacement behind IPrecedentSource (ADR-0006)
HumanGate/HumanGates.cs AutoApproveGate (demo) and DelegateHumanGate
Ai/CannedAgentChatClient.cs deterministic keyless IChatClient for the three roles
ServiceCollectionExtensions.cs DI wiring (no default IHumanGate by design)
AgenticWorkflow.Demo/ console demo: happy path, redraft path, and human-rejected path
AgenticWorkflow.Maf/ MAF ApprovalRequiredAIFunction demo (key-gated, isolated)
tests/
AgenticWorkflow.Tests/ xUnit suite over the orchestration and the gate
AgenticWorkflow.Maf.Tests/ opt-in real-model approval tests (key-gated, skippable)
Real: the supervisor orchestration (the sequence, the draft/review loop, the attempt cap), the human gate that blocks emission, citation propagation from research into the draft, the structural rejection of an uncited draft (enforced by the Supervisor as an invariant, independent of the review agent), and review-verdict parsing.
Faked, behind ports and swapped by configuration: the three agents call CannedAgentChatClient, a deterministic in-process IChatClient that plays each role without a key. A real IChatClient (Azure OpenAI / OpenAI) drops in with no change to the agents. The precedent source is an in-memory catalog you can swap for a RAG-backed source, and the gate used by the demo auto-approves, so swap it for a real human approval step.
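A deterministic keyless chat client can be quite small. The sketch below is not the repo's CannedAgentChatClient: it assumes the Microsoft.Extensions.AI abstractions package, the member signatures reflect that interface as I understand it and are worth checking against the installed version, and the role hint in the system prompt and the "APPROVED" verdict string are hypothetical conventions.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;
using Microsoft.Extensions.AI;

IChatClient client = new CannedChatClient();
ChatResponse reply = await client.GetResponseAsync(new[]
{
    new ChatMessage(ChatRole.System, "role: research"),
    new ChatMessage(ChatRole.User, "prescrição intercorrente"),
});
Console.WriteLine(reply.Text);

public sealed class CannedChatClient : IChatClient
{
    public Task<ChatResponse> GetResponseAsync(
        IEnumerable<ChatMessage> messages,
        ChatOptions? options = null,
        CancellationToken cancellationToken = default)
    {
        // Pick the canned answer from a role hint in the system prompt (hypothetical convention).
        string system = messages.FirstOrDefault(m => m.Role == ChatRole.System)?.Text ?? "";
        string text = system.Contains("research") ? "Nota de pesquisa [Fonte: STJ-Tema-566]"
                    : system.Contains("review") ? "APPROVED"
                    : "Minuta baseada na pesquisa [Fonte: STJ-Tema-566]";
        return Task.FromResult(new ChatResponse(new ChatMessage(ChatRole.Assistant, text)));
    }

    // Streaming is out of scope for this sketch.
    public IAsyncEnumerable<ChatResponseUpdate> GetStreamingResponseAsync(
        IEnumerable<ChatMessage> messages,
        ChatOptions? options = null,
        CancellationToken cancellationToken = default)
        => throw new NotSupportedException("Streaming is not needed for this sketch.");

    public object? GetService(Type serviceType, object? serviceKey = null) => null;

    public void Dispose() { }
}
```

Since the agents depend only on the interface, replacing this class with a real provider's client is a registration change and leaves the calling code untouched.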
20 xUnit tests cover the orchestration and the gate across four categories:
- Orchestration outcomes (happy path, human rejection, redraft-then-approval, persistent rejection at max attempts, cancellation, zero-attempt guard): the supervisor reaches the correct terminal status in every scenario.
- Citation invariants (review rejects an uncited draft without calling the model; supervisor blocks an uncited draft even when the review agent would approve): citation presence is structural, not a model concern.
- Trace order (golden-trace for the happy path and the redraft path): the sequence is deterministic and phases are recorded in exact order.
- Adapters (JsonFilePrecedentSource loads a file, respects the top-N cap, ranks by relevance, and registers correctly via AddJsonPrecedentSource): the Ports-and-Adapters contract is load-bearing, not aspirational.
The tests substitute the chat client (canned or a function-backed fake) and the gate, so the whole suite runs deterministically with no key and no network. A separate AgenticWorkflow.Maf.Tests project adds real-model approval tests that are key-gated and skip cleanly when no key is present.
The decisions behind the design are recorded as ADRs in Docs/adr/. Each follows Context, Decision, Consequences, Alternatives, and the conditions that would make us revisit it.
| # | Decision |
|---|---|
| 0001 | Explicit supervisor orchestration over a model-driven agent loop |
| 0002 | Mandatory human gate with no default registration (fail-fast) |
| 0003 | Ports + deterministic keyless fakes (offline build and tests) |
| 0004 | Citation propagation through the agent chain |
| 0005 | Bounded draft/review loop with terminal outcomes |
| 0006 | Precedent source behind a port (RAG-backed in production) |
| 0007 | Human-in-the-loop conformance (CNJ Res. 615 + LGPD) |
- Docs/architecture.md: components, ports, orchestration, outcomes, and limitations.
- Docs/requirements.md: functional and non-functional requirements.
- Docs/adr/: the architecture decision records above.
- Docs/comparison-gates.md: the structural gate vs. the Microsoft Agent Framework approval path, compared honestly.
This is a proof-of-concept built to show the orchestration pattern for an agent that produces a business deliverable, not a complete legal product. The canned chat client is a deterministic stand-in for an LLM, the in-memory precedent source is a lexical stand-in for retrieval, and the precedent catalog is small and illustrative. The orchestration is deliberately explicit rather than model-driven. The path to production for each faked component is a configuration swap behind the existing ports.
I built this as a companion to lexrag-dotnet, to work through a different question: how do you let agents produce a real deliverable without letting them ship it on their own? The decision I care most about is the smallest one. The DI container registers no default IHumanGate, so forgetting to wire one is a startup failure rather than a silent auto-approval. The fail-safe is the absence of a default, which is harder to undo by accident than a flag someone has to set.
I kept the orchestration as plain deterministic code instead of a model-driven loop, because a legal workflow has to be reproducible and reviewable step by step. Every agent talks to an IChatClient, so the canned keyless fake and a real Azure model run the same code path. I also implemented the Microsoft Agent Framework's approval primitive in an isolated project and compared the two approaches honestly instead of strawmanning the one I didn't pick.
What I'd change with more time: every agent shares one IChatClient, where keyed DI per role would be cleaner, and the review enforces citation presence but not correctness, so a real model is still where correctness would have to live.
This is one of a pair. lexrag-dotnet is the retrieval companion: a legal RAG system with hybrid search and grounded, cited answers, and the natural IPrecedentSource behind this agent's research step. Both repos take the same approach to two different problems (retrieval and agent orchestration). The domain core has no external dependencies, every model sits behind a port with a deterministic keyless fake, the flow is explicit and auditable instead of a model-driven loop, answers carry [Fonte: id] citations, and both are framed around CNJ Resolution 615 for AI in the Brazilian judiciary.
- Semantic Kernel, multi-agent orchestration (learn.microsoft.com/semantic-kernel/frameworks/agent/agent-orchestration): sequential, concurrent, handoff, group-chat, and magentic patterns.
- Microsoft.Extensions.AI: the IChatClient abstraction used here, shared by Semantic Kernel and the Microsoft Agent Framework.