

Holger Imbery edited this page Feb 19, 2026 · 2 revisions

Multi-Agent Testing

The application is designed from the ground up to support testing multiple Copilot Studio agents simultaneously.


Concepts

Agent Registry

Each agent is an independent configuration record containing:

  • Direct Line credentials (secret, bot ID, transport preferences)
  • Per-agent Judge settings (endpoint, API key, model, temperature, pass threshold), with fallback to global defaults
  • Per-agent question generation settings, with fallback to global defaults
  • Environment tag (dev, staging, production)
  • Timeout and retry policies
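The fields above can be sketched as a single configuration record. This is an illustrative Python sketch, not the application's actual data model; all field names and defaults are assumptions:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class AgentConfig:
    """Illustrative shape of one agent registry record (names are assumptions)."""
    name: str
    environment: str                       # "dev" | "staging" | "production"
    direct_line_secret: str
    bot_id: str
    # Optional per-agent overrides; None means "fall back to the global default".
    judge_endpoint: Optional[str] = None
    judge_model: Optional[str] = None
    pass_threshold: Optional[float] = None
    # Timeout and retry policies (placeholder defaults).
    timeout_seconds: int = 60
    max_retries: int = 3

prod = AgentConfig(name="Production Bot", environment="production",
                   direct_line_secret="<secret>", bot_id="prod-bot")
```

Leaving an override at `None` is what lets the resolution step described later fall back to the global default for that setting.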

Suite-Agent Mapping

A test suite can be associated with one or more agents via a many-to-many relationship. Running a suite against multiple agents launches a separate execution per agent, tracked individually.
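A minimal sketch of that many-to-many relationship, assuming an in-memory model (the real application stores this in a join table):

```python
# Hypothetical in-memory suite-to-agents mapping.
suite_agents = {
    "Customer FAQ Regression": ["Production Bot", "Staging Bot", "Dev Bot"],
}

def executions_for(suite: str) -> list[tuple[str, str]]:
    """One independently tracked (suite, agent) execution per associated agent."""
    return [(suite, agent) for agent in suite_agents.get(suite, [])]
```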

Execution Coordinator

MultiAgentExecutionCoordinator orchestrates parallel execution:

  • Fetches the suite-agent mappings
  • Spawns a TestExecutionService instance per agent concurrently
  • Enforces per-agent rate limits and concurrency controls
  • Isolates errors — one agent failing does not stop the others
  • Writes a Run record per agent with all results and AgentId attribution
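The coordinator's behavior can be approximated with a thread pool. This is a sketch of the orchestration pattern, not the application's actual code; the function names and the "Broken Bot" failure are illustrative:

```python
from concurrent.futures import ThreadPoolExecutor, as_completed

def run_suite_for_agent(agent: str) -> dict:
    """Stand-in for one per-agent TestExecutionService run (illustrative)."""
    if agent == "Broken Bot":
        raise RuntimeError("Direct Line handshake failed")
    return {"agent": agent, "status": "completed"}

def coordinate(agents: list[str], max_parallel: int = 4) -> list[dict]:
    """Spawn one run per agent concurrently; write one record per agent."""
    runs = []
    with ThreadPoolExecutor(max_workers=max_parallel) as pool:
        futures = {pool.submit(run_suite_for_agent, a): a for a in agents}
        for fut in as_completed(futures):
            agent = futures[fut]
            try:
                runs.append(fut.result())
            except Exception as exc:
                # Error isolation: record the failure, keep the other agents running.
                runs.append({"agent": agent, "status": "failed", "error": str(exc)})
    return runs
```

The `max_parallel` cap stands in for the concurrency controls; catching the exception per future is what isolates one agent's failure from the rest.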

Setting Up Multiple Agents

Via the Web UI

  1. Navigate to Agents in the sidebar
  2. Click New Agent
  3. Fill in the agent's name, environment tag, Direct Line secret, bot ID, and Judge settings
  4. Click Save
  5. Repeat for each additional agent (dev, staging, production, regional variants, etc.)

Associate Agents with a Test Suite

  1. Navigate to Test Suites
  2. Click Edit on a suite
  3. Go to the Agents tab
  4. Check all agents you want the suite to run against
  5. Save

Running Against Multiple Agents

  1. Go to Test Suites
  2. Click Run on your suite
  3. The coordinator runs the suite against all associated agents in parallel
  4. The Dashboard shows one run entry per agent

Comparing Agent Results

The Runs page shows:

| Column | Description |
| --- | --- |
| Suite | Test suite name |
| Agent | Which agent was tested |
| Environment | dev / staging / production |
| Pass Rate | Percentage of passed test cases |
| Avg Latency | Mean response time |
| Status | running / completed / failed |

Use this to:

  • A/B-test two agent configurations side by side
  • Validate across environments, e.g., confirm staging matches production
  • Compare versions, e.g., validate a new agent version before promotion
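For example, a side-by-side pass-rate comparison between two runs of the same suite might be computed like this (the per-case verdicts below are made-up sample data):

```python
def pass_rate(verdicts: list[bool]) -> float:
    """Percentage of passed test cases, as shown in the Pass Rate column."""
    return 100.0 * sum(verdicts) / len(verdicts) if verdicts else 0.0

# Hypothetical per-case pass/fail verdicts from two runs of one suite.
staging_verdicts = [True, True, False, True]
production_verdicts = [True, True, True, True]

delta = pass_rate(production_verdicts) - pass_rate(staging_verdicts)
```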

Per-Agent Configuration Resolution

When a test run is executed, AgentConfigurationService resolves settings in this order:

  1. Agent-specific setting (stored on the Agent entity)
  2. Global default (from the global configuration / Settings page)

This means you can configure most agents to use global defaults and only override specific settings (e.g., a different judge model for one agent).
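The resolution order can be sketched as a simple two-level fallback lookup. The setting names and default values below are placeholders, not the application's real defaults:

```python
# Placeholder global defaults (the real values live on the Settings page).
GLOBAL_DEFAULTS = {"judge_model": "default-judge-model", "pass_threshold": 0.7}

def resolve(setting: str, agent_overrides: dict):
    """Agent-specific setting wins; otherwise fall back to the global default."""
    value = agent_overrides.get(setting)
    return value if value is not None else GLOBAL_DEFAULTS.get(setting)
```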


Use Cases

Cross-Environment Testing

Associate a suite with your dev, staging, and production agents. Run once to verify the same behavior across all three.

Suite: "Customer FAQ Regression"
  → Agent: Production Bot    (production)
  → Agent: Staging Bot       (staging)
  → Agent: Dev Bot           (dev)

A/B Testing

Create two agent configurations with different prompts or knowledge bases and run the same suite to compare quality scores.

Regional Deployment

Tag agents by region (e.g., West Europe, East US) and run the same suite to detect regional inconsistencies.

Release Validation

Before promoting a new agent version, run all regression suites against it and compare with the previous version's run history.


Sample Data

The TestDataSeeder can seed sample multi-agent data for demonstrations:

  • Three sample agents: Production, Staging, Development
  • Pre-configured test suites with agent associations

This is useful for exploring the UI without real agent credentials.


Migration from Single-Agent Version

If upgrading from a previous single-agent version of the application:

  1. Database migration is automatic — the schema migrates on first startup; no manual steps required.
  2. Existing test suites can be associated with newly created agents via the Test Suites → Edit → Agents tab.
  3. Global Judge settings are preserved and can be selectively overridden per agent.
  4. API backwards compatibility — agents are optional parameters; existing API calls continue to work and fall back to the global Direct Line configuration.

What Changed

| Area | Change |
| --- | --- |
| Agent entity | New table with per-agent Direct Line, Judge, and question generation settings |
| TestSuiteAgent | New many-to-many join table linking suites to agents |
| Run entity | Now records AgentId so results are attributed per agent |
| MultiAgentExecutionCoordinator | New service orchestrating parallel runs across all associated agents |
| AgentConfigurationService | Resolves effective settings: agent-specific → global default |
| Web UI | Agent management pages, updated Setup Wizard, agent selection in suite/run flows |
