3 changes: 3 additions & 0 deletions README.md
Original file line number Diff line number Diff line change
@@ -247,8 +247,11 @@ The library is designed to be modular. Here are some common extension points:
| **New Planners, Executors, etc.** | Create your own implementations of `Plan`, `ExecuteStep`, `Reflect`, or `SummarizeResult` to invent new reasoning capabilities, then compose them in a `SequentialReasoner`. |
| **Pre-process or validate goals** | Create a class that inherits from `BaseGoalPreprocessor` and pass it to `StandardAgent`. Use this to resolve conversational ambiguities, check for malicious intent, or sanitize inputs. |

For a guided walk-through of real `JustInTimeToolingBase` implementations—including pure Python utilities, HTTP APIs, and shell integrations—see [`docs/tool_integration_examples.md`](docs/tool_integration_examples.md).


## Roadmap

We welcome all help implementing parts of the roadmap, or contributing new ideas. We will merge anything we think makes sense in this core library, and will link to all other relevant work.

- Additional pre-built reasoner implementations (ReAct, ToT, Graph-of-Thought)
140 changes: 140 additions & 0 deletions docs/tool_integration_examples.md
@@ -0,0 +1,140 @@
# Tool Integration Examples

This guide walks through three ways to plug custom tools into the Standard Agent framework. Each example
corresponds to real code under `examples/tools/integration/` and is exercised by automated tests so you can trust
that the snippets work end to end.

> ℹ️ If you are new to the tool interfaces, start by reviewing `agents/tools/base.py`. The `ToolBase` class describes
> what an individual tool exposes, while `JustInTimeToolingBase` defines the provider that discovers, loads, and
> executes tools at runtime.

- [Example 1 – Local temperature converter](#example-1--local-temperature-converter)
- [Example 2 – REST weather lookup](#example-2--rest-weather-lookup)
- [Example 3 – Allow-listed shell commands](#example-3--allow-listed-shell-commands)

All three follow the same pattern:

1. Describe the tool (`ToolBase` implementation) so the LLM knows the capability and parameters.
2. Expose the tool through a provider (`JustInTimeToolingBase`). The provider decides when the tool is returned from
`search`, how it is hydrated in `load`, and how to execute it safely.
3. Wire the provider into `StandardAgent` together with the LLM, memory, and reasoner of your choice.
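
The three steps above can be sketched in miniature as follows. To stay runnable on its own, the sketch defines small stand-ins for `ToolBase` and `JustInTimeToolingBase` instead of importing them from `agents/tools/base.py`. The stand-ins copy only method names that appear in this guide, and `GreetingTool` / `GreetingTools` are hypothetical names; the real base classes may require more than this.

```python
from abc import ABC, abstractmethod
from typing import Any, Dict, List


# Stand-ins for the classes in agents/tools/base.py (assumption: the real
# classes expose at least these methods; they may require more).
class ToolBase(ABC):
    def __init__(self, id: str) -> None:
        self.id = id

    @abstractmethod
    def get_summary(self) -> str: ...

    @abstractmethod
    def get_parameters(self) -> Dict[str, Any]: ...


class JustInTimeToolingBase(ABC):
    @abstractmethod
    def search(self, query: str, *, top_k: int = 10) -> List[ToolBase]: ...

    @abstractmethod
    def load(self, tool: ToolBase) -> ToolBase: ...

    @abstractmethod
    def execute(self, tool: ToolBase, parameters: Dict[str, Any]) -> Any: ...


# Step 1: describe the tool (hypothetical example tool).
class GreetingTool(ToolBase):
    def __init__(self) -> None:
        super().__init__(id="greeting.say")

    def get_summary(self) -> str:
        return f"{self.id}: build a greeting for a given name"

    def get_parameters(self) -> Dict[str, Any]:
        return {"name": {"type": "string", "description": "Who to greet."}}


# Step 2: expose it through a provider.
class GreetingTools(JustInTimeToolingBase):
    def __init__(self) -> None:
        self._tool = GreetingTool()

    def search(self, query: str, *, top_k: int = 10) -> List[ToolBase]:
        # Return the tool only for queries that look relevant.
        return [self._tool] if "greet" in query.lower() else []

    def load(self, tool: ToolBase) -> ToolBase:
        if tool.id != self._tool.id:
            raise ValueError(f"Unknown tool requested: {tool.id}")
        return self._tool

    def execute(self, tool: ToolBase, parameters: Dict[str, Any]) -> Any:
        if tool.id != self._tool.id:
            raise ValueError(f"Unexpected tool invocation: {tool.id}")
        return {"greeting": f"Hello, {parameters['name']}!"}


# Step 3 would pass `GreetingTools()` as `tools=` to the reasoner and agent.
# Here the provider is driven by hand instead.
provider = GreetingTools()
found = provider.search("please greet the user")
result = provider.execute(provider.load(found[0]), {"name": "Ada"})
```

Driving the provider by hand like this is also a cheap way to check `search`, `load`, and `execute` before an LLM is involved.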

---

## Example 1 – Local temperature converter

**Goal:** keep the entire computation inside the agent process. This is ideal for pure functions, feature toggles, or
anything that does not need I/O.

**Code reference:** [`examples/tools/integration/local_temperature.py`](../examples/tools/integration/local_temperature.py)

### Step-by-step (local tool)

1. **Describe the tool.** `TemperatureConversionTool` states what the tool does and which parameters (`value`,
`from_unit`, `to_unit`) are expected.
2. **Expose the tool.** `LocalTemperatureTools.search` returns the converter only when the query mentions
`temperature` or `convert`. `execute` performs the unit conversion in pure Python.
3. **Use the tool in an agent.**

```python
from agents.llm.litellm import LiteLLM
from agents.memory.dict_memory import DictMemory
from agents.reasoner.react import ReACTReasoner
from agents.standard_agent import StandardAgent

from examples.tools.integration.local_temperature import LocalTemperatureTools

llm = LiteLLM(model="gpt-4o-mini")
tools = LocalTemperatureTools()
memory = DictMemory()
reasoner = ReACTReasoner(llm=llm, tools=tools, memory=memory)

agent = StandardAgent(llm=llm, tools=tools, memory=memory, reasoner=reasoner)
```

**Test coverage:** `tests/examples/test_tool_integration_examples.py::test_local_temperature_tool_executes` ensures the
conversion returns the expected value.
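
As a rough illustration of what such a test might assert, the sketch below reimplements the conversion arithmetic from `LocalTemperatureTools.execute` as a standalone function. `convert_temperature` is a hypothetical helper written for this example and is not part of the repository.

```python
import math


def convert_temperature(value: float, from_unit: str, to_unit: str) -> float:
    """Convert between Celsius and Fahrenheit (hypothetical standalone helper)."""
    from_unit, to_unit = from_unit.lower(), to_unit.lower()
    if from_unit == to_unit:
        return float(value)
    if from_unit == "celsius" and to_unit == "fahrenheit":
        return value * 9.0 / 5.0 + 32.0
    if from_unit == "fahrenheit" and to_unit == "celsius":
        return (value - 32.0) * 5.0 / 9.0
    raise ValueError("Unsupported unit conversion requested")


# Happy path: the boiling point of water in both directions.
assert math.isclose(convert_temperature(100, "celsius", "fahrenheit"), 212.0)
assert math.isclose(convert_temperature(212, "fahrenheit", "celsius"), 100.0)

# Failure mode: an unknown unit is rejected rather than silently ignored.
try:
    convert_temperature(1, "kelvin", "celsius")
    error_message = ""
except ValueError as exc:
    error_message = str(exc)
```

Comparing floats with `math.isclose` avoids spurious failures from rounding in the `5.0 / 9.0` factor.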

---

## Example 2 – REST weather lookup

**Goal:** forward parameters to an HTTP API and map the response back into structured data the agent can reason about.

**Code reference:** [`examples/tools/integration/weather_api.py`](../examples/tools/integration/weather_api.py)

### Step-by-step (REST tool)

1. **Describe the REST call.** `WeatherAPITool` documents the required `location` argument and optional `units`.
2. **Create a provider that knows how to talk to the API.** `WeatherAPIClient` accepts an injected `requests.Session`
(or any drop-in implementation) and handles authentication, timeouts, and response parsing.
3. **Execute safely.** `execute` builds the query string, raises if `location` is missing, and returns a normalized
dictionary with `location`, `temperature`, `conditions`, and the raw payload for debugging.

```python
from agents.llm.litellm import LiteLLM
from agents.memory.dict_memory import DictMemory
from agents.reasoner.rewoo import ReWOOReasoner
from agents.standard_agent import StandardAgent

from examples.tools.integration.weather_api import WeatherAPIClient

llm = LiteLLM(model="gpt-4o-mini")
tools = WeatherAPIClient(base_url="https://api.example.com", api_key="YOUR_KEY")
memory = DictMemory()
reasoner = ReWOOReasoner(llm=llm, tools=tools, memory=memory)

agent = StandardAgent(llm=llm, tools=tools, memory=memory, reasoner=reasoner)
```

**Testing strategy:** the suite fakes the HTTP client so no real network access is required. See
`tests/examples/test_tool_integration_examples.py::test_weather_api_tool_executes`.
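
One way to fake the HTTP client is to pass in an object that has the same `get` method as `requests.Session` and returns a canned response. The sketch below does not use the real `WeatherAPIClient`. `fetch_weather`, `FakeSession`, `FakeResponse`, the `/weather` path, the `q` query parameter, and the payload field names (`name`, `temp`, `summary`) are all assumptions made for illustration.

```python
from typing import Any, Dict, List, Optional


class FakeResponse:
    """Mimics the two `requests.Response` methods the sketch relies on."""

    def __init__(self, payload: Dict[str, Any]) -> None:
        self._payload = payload

    def raise_for_status(self) -> None:
        return None  # Pretend the request succeeded.

    def json(self) -> Dict[str, Any]:
        return self._payload


class FakeSession:
    """Records each call and returns a canned payload instead of using the network."""

    def __init__(self, payload: Dict[str, Any]) -> None:
        self.payload = payload
        self.calls: List[Dict[str, Any]] = []

    def get(
        self,
        url: str,
        params: Optional[Dict[str, Any]] = None,
        timeout: Optional[float] = None,
    ) -> FakeResponse:
        self.calls.append({"url": url, "params": params, "timeout": timeout})
        return FakeResponse(self.payload)


def fetch_weather(session: Any, base_url: str, parameters: Dict[str, Any]) -> Dict[str, Any]:
    """Hypothetical execute-style helper: validate, call the API, normalize."""
    location = parameters.get("location")
    if not location:
        raise ValueError("location is required")
    query = {"q": location, "units": parameters.get("units", "metric")}
    response = session.get(f"{base_url}/weather", params=query, timeout=5.0)
    response.raise_for_status()
    raw = response.json()
    # Assumed payload shape; a real API will differ.
    return {
        "location": raw.get("name", location),
        "temperature": raw.get("temp"),
        "conditions": raw.get("summary"),
        "raw": raw,
    }


session = FakeSession({"name": "Lisbon", "temp": 21.5, "summary": "clear"})
report = fetch_weather(session, "https://api.example.com", {"location": "Lisbon"})
```

Because the fake records every call, a test can assert on the outgoing query as well as on the normalized result.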

---

## Example 3 – Allow-listed shell commands

**Goal:** expose carefully curated shell commands (or other system integrations) while keeping guardrails in place.

**Code reference:** [`examples/tools/integration/shell_command.py`](../examples/tools/integration/shell_command.py)

### Step-by-step (system tool)

1. **Define the metadata.** `ShellCommandTool` declares that the tool executes a command with optional arguments and a
timeout.
2. **Create a provider with safety checks.** `ShellCommandTools` accepts an allow-list and a `runner` callable so you
can inject a stub for testing. The helper `format_command` renders a user-friendly command string for prompts.
3. **Execute with a controlled runtime.** Only allow-listed commands are executed. The provider calls the injected
`runner` (`subprocess.run` by default) with captured stdout/stderr and a timeout that defaults to 10 seconds.
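
To see why `format_command` quotes each part, the sketch below uses the standard-library `shlex.quote` in the same way. `render_command` is a hypothetical standalone copy of the helper's logic, written so the snippet runs without the repository.

```python
import shlex
from typing import Iterable


def render_command(command: str, args: Iterable[str]) -> str:
    """Join a command and its arguments into a shell-safe display string."""
    return " ".join(shlex.quote(part) for part in [command, *args])


plain = render_command("echo", ["hello"])
spaced = render_command("echo", ["hello world", "a;b"])

print(plain)   # echo hello
print(spaced)  # echo 'hello world' 'a;b'
```

The quoted string is meant for prompts and logs. The provider itself passes the command and arguments to the runner as a list, so no shell parses them.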

```python
from agents.llm.litellm import LiteLLM
from agents.memory.dict_memory import DictMemory
from agents.reasoner.react import ReACTReasoner
from agents.standard_agent import StandardAgent

from examples.tools.integration.shell_command import ShellCommandTools

llm = LiteLLM(model="gpt-4o-mini")
tools = ShellCommandTools(allow_list=["uptime", "echo"])
memory = DictMemory()
reasoner = ReACTReasoner(llm=llm, tools=tools, memory=memory)

agent = StandardAgent(llm=llm, tools=tools, memory=memory, reasoner=reasoner)
```

**Test coverage:** `tests/examples/test_tool_integration_examples.py::test_shell_command_tool_executes_when_allowed`
verifies the allow-list logic and output handling.
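
A test along these lines can inject a stub in place of `subprocess.run`, so nothing is actually executed. The sketch below is not the repository's test. `run_allowed` and `stub_runner` are hypothetical names, and `run_allowed` only mirrors the allow-list check and result shape described above.

```python
import subprocess
from typing import Any, Callable, Dict, List, Sequence, Set


def stub_runner(args: Sequence[str], **kwargs: Any) -> subprocess.CompletedProcess:
    """Pretend to run the command and echo the arguments back as stdout."""
    return subprocess.CompletedProcess(
        args=list(args), returncode=0, stdout=" ".join(args[1:]) + "\n", stderr=""
    )


def run_allowed(
    command: str,
    args: List[str],
    allowed: Set[str],
    runner: Callable[..., subprocess.CompletedProcess] = subprocess.run,
    timeout: float = 10.0,
) -> Dict[str, Any]:
    """Run `command` only if it is allow-listed (hypothetical helper)."""
    if command not in allowed:
        raise ValueError(f"Command '{command}' is not permitted")
    completed = runner([command, *args], capture_output=True, text=True, timeout=timeout)
    return {
        "command": command,
        "args": args,
        "returncode": completed.returncode,
        "stdout": completed.stdout.strip(),
        "stderr": completed.stderr.strip(),
    }


allowed = {"echo", "uptime"}
ok = run_allowed("echo", ["hi", "there"], allowed, runner=stub_runner)

# A command outside the allow-list is rejected before the runner is called.
try:
    run_allowed("rm", ["-rf", "/tmp/x"], allowed, runner=stub_runner)
    blocked = False
except ValueError:
    blocked = True
```

Checking the allow-list before touching the runner means a rejected command never reaches `subprocess`, even when the real runner is in place.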

---

## Tips for your own integrations

- Keep provider constructors injectable (e.g., accept an HTTP session or subprocess runner). This makes them testable and
easier to reuse in other environments.
- Return structured dictionaries from `execute`. The reasoners and summarizer prompts work best when they receive clean
JSON-like objects rather than large strings.
- Lean on the agent’s memory for state. You can store intermediate results or rate-limiting information in the
`MutableMapping` passed to `StandardAgent`.
- Add tests! All examples above have focused unit tests that validate the happy path and failure modes without relying on
network or system side effects.
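
As one possible shape for the memory tip, the sketch below keeps a per-tool call log in a plain `dict` (which is a `MutableMapping`) and refuses calls that exceed a budget. `allow_call`, the `rate_limit:` key prefix, the `weather.lookup` tool id, and the limits are assumptions for illustration. How a provider obtains the agent's memory object depends on your wiring.

```python
import time
from collections.abc import MutableMapping
from typing import List, Optional


def allow_call(
    memory: MutableMapping,
    tool_id: str,
    max_calls: int,
    window_seconds: float,
    now: Optional[float] = None,
) -> bool:
    """Return True and record the call if `tool_id` is under its budget."""
    current = time.monotonic() if now is None else now
    key = f"rate_limit:{tool_id}"  # Hypothetical key naming scheme.
    # Keep only the timestamps that still fall inside the window.
    recent: List[float] = [t for t in memory.get(key, []) if current - t < window_seconds]
    allowed = len(recent) < max_calls
    if allowed:
        recent.append(current)
    memory[key] = recent
    return allowed


memory: dict = {}
decisions = []
# Explicit timestamps keep the example deterministic.
for moment in (0.0, 1.0, 2.0, 61.5):
    decisions.append(
        allow_call(memory, "weather.lookup", max_calls=2, window_seconds=60.0, now=moment)
    )

print(decisions)  # [True, True, False, True]
```

Passing `now` explicitly keeps the helper testable without sleeping, in the same spirit as injecting a session or runner.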
1 change: 1 addition & 0 deletions examples/tools/integration/__init__.py
@@ -0,0 +1 @@
"""Reference implementations for custom tool integrations used in documentation examples."""
82 changes: 82 additions & 0 deletions examples/tools/integration/local_temperature.py
@@ -0,0 +1,82 @@
"""Local-only tool example for converting temperatures between units."""

from __future__ import annotations

from typing import Any, Dict, List

from agents.tools.base import JustInTimeToolingBase, ToolBase


class TemperatureConversionTool(ToolBase):
    """Simple in-process tool that performs temperature conversions."""

    TOOL_ID = "temperature.convert"

    def __init__(self) -> None:
        super().__init__(id=self.TOOL_ID)
        self.name = "Temperature Converter"
        self.description = (
            "Convert temperature values between Celsius and Fahrenheit without leaving the runtime."
        )
        self._parameters: Dict[str, Any] = {
            "value": {
                "type": "number",
                "description": "Temperature value to convert.",
            },
            "from_unit": {
                "type": "string",
                "enum": ["celsius", "fahrenheit"],
                "description": "Unit of the provided temperature value.",
            },
            "to_unit": {
                "type": "string",
                "enum": ["celsius", "fahrenheit"],
                "description": "Unit to convert the value into.",
            },
        }

    def get_summary(self) -> str:
        return f"{self.id}: {self.name} - {self.description}"

    def get_details(self) -> str:
        return (
            "Temperature converter. Provide `value`, `from_unit`, and `to_unit` "
            "to receive the converted number."
        )

    def get_parameters(self) -> Dict[str, Any]:
        return self._parameters


class LocalTemperatureTools(JustInTimeToolingBase):
    """Tool provider that exposes the local temperature converter."""

    def __init__(self) -> None:
        self._tool = TemperatureConversionTool()

    def search(self, query: str, *, top_k: int = 10) -> List[ToolBase]:
        if "temperature" in query.lower() or "convert" in query.lower():
            return [self._tool]
        return []

    def load(self, tool: ToolBase) -> ToolBase:
        if tool.id != self._tool.id:
            raise ValueError(f"Unknown tool requested: {tool.id}")
        return self._tool

    def execute(self, tool: ToolBase, parameters: Dict[str, Any]) -> Any:
        if tool.id != self._tool.id:
            raise ValueError(f"Unexpected tool invocation: {tool.id}")

        value = float(parameters.get("value", 0.0))
        from_unit = str(parameters.get("from_unit", "celsius")).lower()
        to_unit = str(parameters.get("to_unit", "fahrenheit")).lower()

        if from_unit == to_unit:
            return value
        if from_unit == "celsius" and to_unit == "fahrenheit":
            return (value * 9.0 / 5.0) + 32.0
        if from_unit == "fahrenheit" and to_unit == "celsius":
            return (value - 32.0) * 5.0 / 9.0

        raise ValueError("Unsupported unit conversion requested")
122 changes: 122 additions & 0 deletions examples/tools/integration/shell_command.py
@@ -0,0 +1,122 @@
"""Command execution tool example with a configurable allow-list."""

from __future__ import annotations

import shlex
from dataclasses import dataclass, field
from typing import Any, Dict, Iterable, List, Protocol, Sequence

import subprocess

from agents.tools.base import JustInTimeToolingBase, ToolBase


class _SubprocessRunner(Protocol):
    """Callable protocol mirroring `subprocess.run`."""

    def __call__(
        self,
        args: Sequence[str],
        *,
        capture_output: bool,
        text: bool,
        timeout: float | None = None,
    ) -> subprocess.CompletedProcess[str]:
        ...


class ShellCommandTool(ToolBase):
    """Tool metadata for executing shell commands on an allow-list."""

    TOOL_ID = "shell.run"

    def __init__(self) -> None:
        super().__init__(id=self.TOOL_ID)
        self.name = "Shell Command"
        self.description = "Execute a pre-approved shell command and capture the output."
        self._parameters: Dict[str, Any] = {
            "command": {
                "type": "string",
                "description": "Command name that must be present in the allow-list.",
            },
            "args": {
                "type": "array",
                "items": {"type": "string"},
                "description": "Optional command arguments.",
            },
            "timeout": {
                "type": "number",
                "description": "Optional timeout in seconds (defaults to 10).",
            },
        }

    def get_summary(self) -> str:
        return f"{self.id}: {self.name} - {self.description}"

    def get_details(self) -> str:
        return (
            "Runs commands using subprocess with capture_output=True. Only commands present in the allow-list "
            "configured on the provider are permitted."
        )

    def get_parameters(self) -> Dict[str, Any]:
        return self._parameters


@dataclass
class ShellCommandTools(JustInTimeToolingBase):
    """Provider that safely exposes shell commands to the agent."""

    allow_list: Iterable[str] = field(default_factory=lambda: ("echo",))
    runner: _SubprocessRunner = subprocess.run

    def __post_init__(self) -> None:
        self._tool = ShellCommandTool()
        self._allowed = {cmd.strip() for cmd in self.allow_list}

    def search(self, query: str, *, top_k: int = 10) -> List[ToolBase]:
        if any(word in query.lower() for word in ("run", "shell", "command")):
            return [self._tool]
        return []

    def load(self, tool: ToolBase) -> ToolBase:
        if tool.id != self._tool.id:
            raise ValueError(f"Unknown tool requested: {tool.id}")
        return self._tool

    def execute(self, tool: ToolBase, parameters: Dict[str, Any]) -> Dict[str, Any]:
        if tool.id != self._tool.id:
            raise ValueError(f"Unexpected tool invocation: {tool.id}")

        command = str(parameters.get("command", "")).strip()
        if command not in self._allowed:
            raise ValueError(f"Command '{command}' is not permitted")

        args = parameters.get("args", []) or []
        if not isinstance(args, list):
            raise ValueError("args must be an array of strings")
        str_args = [str(item) for item in args]

        timeout = parameters.get("timeout")
        timeout_f = float(timeout) if timeout is not None else 10.0

        completed = self.runner(
            [command, *str_args],
            capture_output=True,
            text=True,
            timeout=timeout_f,
        )

        return {
            "command": command,
            "args": str_args,
            "returncode": completed.returncode,
            "stdout": completed.stdout.strip(),
            "stderr": completed.stderr.strip(),
        }

    @staticmethod
    def format_command(command: str, args: Iterable[str]) -> str:
        """Utility helper to show the command in shell form for documentation snippets."""
        parts = [command, *args]
        return " ".join(shlex.quote(part) for part in parts)