feat: add ToolSimulator for tool response simulation #111

ybdarrenwang · 2026-01-29T22:06:37Z

Description

Introduces the ToolSimulator framework, which simulates realistic tool responses during agent evaluation without calling production APIs. It enables systematic testing of agents that use API-based, Python function-based, and MCP-based tools, with LLM-powered dynamic simulation as the default approach.

Key capabilities:

Three simulation modes: dynamic (LLM-generated responses), static (predefined responses), and mock (custom functions)
Shared state management across multiple tools via share_state_id for stateful testing scenarios
Decorator-based registration for function tools (@ToolSimulator.function_tool), MCP tools (@ToolSimulator.mcp_tool), and API tools (@ToolSimulator.api_tool)
Integration with Strands Evals workflow including Experiment, Case, and evaluators
Multi-agent support with tool simulation across sub-agents (agent-as-tool pattern)
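For readers skimming the description, the three simulation modes can be pictured with a small standalone sketch. Everything in it is an assumption made for illustration: `ModeSketch`, `register`, `call`, and the `llm` callable are hypothetical names and do not reflect the actual ToolSimulator API added in this PR. The LLM is replaced by a plain function so the example runs offline.

```python
from typing import Any, Callable, Optional


class ModeSketch:
    """Hypothetical stand-in showing how the three modes could dispatch."""

    VALID_MODES = {"dynamic", "static", "mock"}

    def __init__(self, llm: Callable[[str], str]) -> None:
        # `llm` is an assumed callable that maps a prompt to a response string.
        self._llm = llm
        self._tools: dict[str, dict[str, Any]] = {}

    def register(
        self,
        name: str,
        mode: str = "dynamic",
        static_response: Any = None,
        mock_fn: Optional[Callable[..., Any]] = None,
    ) -> None:
        if mode not in self.VALID_MODES:
            raise ValueError(f"unknown mode: {mode!r}")
        if mode == "mock" and mock_fn is None:
            raise ValueError("mock mode requires mock_fn")
        self._tools[name] = {
            "mode": mode,
            "static_response": static_response,
            "mock_fn": mock_fn,
        }

    def call(self, name: str, **kwargs: Any) -> Any:
        spec = self._tools[name]
        if spec["mode"] == "static":
            # Static: always return the predefined response.
            return spec["static_response"]
        if spec["mode"] == "mock":
            # Mock: delegate to the user-supplied function.
            return spec["mock_fn"](**kwargs)
        # Dynamic: ask the (stubbed) LLM to invent a plausible response.
        return self._llm(f"Simulate tool {name!r} called with {kwargs!r}")


# Usage with a fake LLM that just echoes the prompt.
sim = ModeSketch(llm=lambda prompt: f"[simulated] {prompt}")
sim.register("get_weather", mode="static", static_response={"temp_c": 21})
sim.register("add", mode="mock", mock_fn=lambda a, b: a + b)
sim.register("search", mode="dynamic")

static_result = sim.call("get_weather", city="Paris")
mock_result = sim.call("add", a=2, b=3)
dynamic_result = sim.call("search", query="tool simulation")
```

Static and mock modes are deterministic, which makes them convenient for unit tests. Dynamic mode trades determinism for coverage of responses nobody scripted in advance.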

Design principles:

Centralized registry for tool management and state tracking
Context-aware response generation using initial state descriptions and conversation history
Seamless integration with existing Strands tool decorator patterns
Comprehensive unit test coverage for all simulation modes
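The first two design principles (a centralized registry, plus state shared through `share_state_id`) could look roughly like the sketch below. It is an assumption-laden illustration, not the code in this PR: `RegistrySketch`, its class attributes, and the convention of passing the call history as the first argument are all hypothetical. Only the decorator name `function_tool` and the `share_state_id` parameter name are borrowed from the description.

```python
import functools
from typing import Any, Callable, Optional


class RegistrySketch:
    """Hypothetical class-level registry with per-state-id call history."""

    _tools: dict[str, Callable[..., Any]] = {}
    _states: dict[str, list[dict[str, Any]]] = {}

    @classmethod
    def function_tool(cls, share_state_id: Optional[str] = None):
        def decorator(fn: Callable[..., Any]) -> Callable[..., Any]:
            # Tools without an explicit id get a private state keyed by name.
            state_id = share_state_id or fn.__name__

            @functools.wraps(fn)
            def wrapper(**kwargs: Any) -> Any:
                history = cls._states.setdefault(state_id, [])
                # The tool sees all earlier calls recorded under this state id.
                result = fn(history, **kwargs)
                history.append(
                    {"tool": fn.__name__, "args": kwargs, "result": result}
                )
                return result

            cls._tools[fn.__name__] = wrapper
            return wrapper

        return decorator


@RegistrySketch.function_tool(share_state_id="cart")
def add_item(history, item):
    return f"added {item}"


@RegistrySketch.function_tool(share_state_id="cart")
def list_items(history):
    # Reads what add_item recorded, because both tools share the "cart" state.
    return [call["args"]["item"] for call in history if call["tool"] == "add_item"]


add_item(item="apple")
add_item(item="pear")
items = list_items()
```

In the real framework the shared history would presumably feed the LLM prompt rather than a Python function. The sketch only shows why two tools registered under the same id can give mutually consistent answers.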

Related Issues

#93

Documentation PR

strands-agents/docs#500

Type of Change

New feature

Testing

I ran hatch run prepare

Checklist

I have read the CONTRIBUTING document
I have added any necessary tests that prove my fix is effective or my feature works
I have updated the documentation accordingly
I have added an appropriate example to the documentation to outline the feature, or no new docs are needed
My changes generate no new warnings
Any dependent changes have been merged and published

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

poshinchen · 2026-02-02T15:48:24Z

src/strands_evals/simulation/tool_simulator.py

+import logging
+import warnings
+from datetime import datetime
+from typing import Any, Callable, Dict, List, Optional


Can you use Python built-in collection types?

Done in the latest commit
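For context on this request: since Python 3.9 (PEP 585) the built-in `dict` and `list` can be subscripted directly in annotations, and the `typing.Dict` and `typing.List` aliases are deprecated. A minimal sketch of the change, using a hypothetical helper `count_calls` that is not part of this PR:

```python
from typing import Any

# Before (typing aliases):
#     from typing import Any, Dict, List
#     def count_calls(calls: List[Dict[str, Any]]) -> Dict[str, int]: ...


# After (built-in generics, Python 3.9+):
def count_calls(calls: list[dict[str, Any]]) -> dict[str, int]:
    """Count how many times each tool name appears in a call log."""
    counts: dict[str, int] = {}
    for call in calls:
        counts[call["tool"]] = counts.get(call["tool"], 0) + 1
    return counts


example_counts = count_calls([{"tool": "a"}, {"tool": "b"}, {"tool": "a"}])
```

`Any` still comes from `typing`; only the collection aliases are replaced.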

poshinchen · 2026-02-02T20:16:38Z

src/strands_evals/simulation/tool_simulator.py

+        # Store framework selection
+        self.framework = framework
+        # Store model configuration for creating internal agents
+        self.model_id = model


nit: we could keep it as model instead of model_id.

ybdarrenwang added 9 commits January 23, 2026 22:33

init tool simulator pr

86159c2

fix tool init state registry and prompt

be24b11

support static and mock modes

1d507f0

unit test tool simulator

7b82fc8

remove override and simplify pr

cd0365a

replace llm call with agent; simplify error raise

4d57a53

refactor and address mypy errors

15d3fcd

fix tool simulator integration with strands tool decorator

9701bbe

update test

e6702f5

ybdarrenwang requested a deployment to manual-approval January 29, 2026 22:15 — with GitHub Actions Waiting

fix test

367c637

ybdarrenwang requested a deployment to manual-approval January 29, 2026 23:52 — with GitHub Actions Waiting

poshinchen reviewed Feb 2, 2026

utilize built-in collection types; improve readability

087ce6e

ybdarrenwang requested a deployment to manual-approval February 2, 2026 19:09 — with GitHub Actions Waiting

poshinchen reviewed Feb 2, 2026
