Python: [Feature]: Allow @tool functions to return image content that the model can analyze

## Description

Currently, `@tool`-decorated functions are text-in / text-out by design. When a tool returns a `Content` object (e.g. `Content.from_data(image_bytes, "image/png")`), `FunctionTool.parse_result()` serializes it to a JSON string via `_make_dumpable()` and stores it as plain text in `Content.from_function_result(result=...)`. When sent to the API, it becomes `{"type": "function_call_output", "output": "<json string>"}` — so the model sees a text description of the image, not the actual visual content.

This creates a gap for agentic use-cases where a tool needs to capture a screenshot, render a chart, fetch an image from an external service, or generate a diagram, and the model should then be able to visually analyze or describe that image as part of the same reasoning loop — without requiring the developer to manually wire up a multi-turn workaround.

### What problem does it solve?

- Enables vision-in-the-loop workflows without manual multi-turn orchestration
- Allows tools like `capture_screenshot()`, `render_chart()`, or `fetch_image()` to feed image content back into the model natively
- Unblocks agentic computer-use and data-visualization scenarios where the image is produced dynamically at runtime (not known ahead of time as a user message)

### Expected behavior

When a `@tool` function returns a `Content` object with an image media type (or a `list[Content]` containing image content), the framework should forward that content as a visual input item to the next model call — not as a serialized JSON string — so the model can perceive and reason about the image.

Possible implementation approaches:
1. A dedicated content type (e.g. `"function_result_with_content"`) that carries structured `Content` items alongside the text result, allowing provider-specific serialization to emit both `function_call_output` and `input_image` items
2. A new `@tool` option (e.g. `return_content=True`) that opts the function into rich-content return handling
3. Middleware/hook support that intercepts tool results containing image `Content` and injects them as user-turn vision messages automatically

### Alternatives considered

- **Multi-turn workaround**: Tool returns image bytes → developer extracts them from the `function_result` → injects into the next `Message` as `Content.from_data(...)`. Works but requires boilerplate and breaks the natural tool abstraction.
- **MCP tools**: MCP server tools that return `ImageContent` go through `_parse_tool_result_from_mcp` which also serializes to JSON text — same limitation.
- **Provider-managed tools** (`ImageGenTool`, code interpreter): These work but only cover built-in provider capabilities, not custom user-defined functions.

## Code Sample

**Current workaround (manual multi-turn):**
```python
@tool(approval_mode="never_require")
async def capture_screenshot(url: Annotated[str, Field(description="URL to screenshot")]) -> str:
    image_bytes = await take_screenshot(url)
    # Must return text; store image out-of-band
    store_image("last_screenshot", image_bytes)
    return "Screenshot captured."

# Then manually inject the image into the next turn:
response = await client.get_response(messages, tools=capture_screenshot)
image_bytes = retrieve_image("last_screenshot")
messages.append(Message(role="user", contents=[
    Content.from_text("Now analyze this screenshot:"),
    Content.from_data(data=image_bytes, media_type="image/png"),
]))
response2 = await client.get_response(messages)
```

**Desired (proposed) API:**
```python
@tool(approval_mode="never_require")
async def capture_screenshot(url: Annotated[str, Field(description="URL to screenshot")]) -> Content:
    image_bytes = await take_screenshot(url)
    return Content.from_data(data=image_bytes, media_type="image/png")

# Framework automatically forwards the image content to the next model call
response = await client.get_response(messages, tools=capture_screenshot)
print(response.text)  # Model's description/analysis of the screenshot
```

## Language/SDK

Both

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Python: [Feature]: Allow @tool functions to return image content that the model can analyze #4272

Description

What problem does it solve?

Expected behavior

Alternatives considered

Code Sample

Language/SDK

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Python: [Feature]: Allow @tool functions to return image content that the model can analyze #4272

Description

Description

What problem does it solve?

Expected behavior

Alternatives considered

Code Sample

Language/SDK

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions