feat: multimodal support in `AmazonBedrockChatGenerator` #307

anakin87 · 2025-05-21T09:58:47Z

Related Issues

fixes Multimodal support in another ChatGenerator haystack#9261

Proposed Changes:

Add an experimental version of AmazonBedrockChatGenerator, which can handle user messages with text + images.

It only alters the _format_messages utility function
Reuses the existing implementation where possible
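
For readers unfamiliar with the Converse message layout, here is a minimal sketch of how a user message with text and images could be mapped to it. This is not the code from this PR: `TextPart`, `ImagePart`, `SUPPORTED_FORMATS` and `format_user_message` are hypothetical names used only for illustration, and the list of accepted formats reflects my reading of the Bedrock documentation.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Union


# Hypothetical stand-ins for illustration; these are NOT the Haystack classes.
@dataclass
class TextPart:
    text: str


@dataclass
class ImagePart:
    data: bytes
    mime_type: str  # e.g. "image/png"


# Assumption: image formats accepted by the Converse API image block.
SUPPORTED_FORMATS = {"png", "jpeg", "gif", "webp"}


def format_user_message(parts: List[Union[TextPart, ImagePart]]) -> Dict[str, Any]:
    """Build a Converse-style user message from text and image parts."""
    content: List[Dict[str, Any]] = []
    for part in parts:
        if isinstance(part, TextPart):
            content.append({"text": part.text})
        else:
            # Derive the Bedrock format name from the MIME type ("image/png" -> "png").
            image_format = part.mime_type.split("/")[-1]
            if image_format not in SUPPORTED_FORMATS:
                raise ValueError(f"Unsupported image format: {image_format}")
            content.append(
                {"image": {"format": image_format, "source": {"bytes": part.data}}}
            )
    return {"role": "user", "content": content}


message = format_user_message(
    [TextPart("Describe this image."), ImagePart(b"\x89PNG", "image/png")]
)
```

Rejecting unknown formats early keeps the failure close to the caller instead of surfacing as a validation error from the API.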

How did you test it?

Copied the existing tests; added new unit tests (for the utility function) + an integration test

Notes for the reviewer

Don't get scared by the size of this PR: the core logic is in haystack_experimental/components/generators/chat/bedrock.py and is about 150 lines.
Let's discuss integration tests: I copied all of them to show that everything works as before, but we can also agree to remove some of them (or test with fewer models) to save time and money. Let me know what you think...

Checklist

I have read the contributors guidelines and the code of conduct
I have updated the related issue with new insights and changes
I added unit tests and updated the docstrings
I've used one of the conventional commit types for my PR title: fix:, feat:, build:, chore:, ci:, docs:, style:, refactor:, perf:, test:.
I documented my code
I ran pre-commit hooks and fixed any issue

coveralls · 2025-05-21T10:06:46Z

Pull Request Test Coverage Report for Build 15203979000

Details

0 of 0 changed or added relevant lines in 0 files are covered.
2 unchanged lines in 1 file lost coverage.
Overall coverage increased (+0.4%) to 73.708%

Files with Coverage Reduction	New Missed Lines	%
components/generators/chat/__init__.py	2	75.0%

Totals
Change from base Build 15203742899:	0.4%
Covered Lines:	1326
Relevant Lines:	1799

💛 - Coveralls

haystack_experimental/components/generators/chat/bedrock.py

sjrl · 2025-05-22T16:06:48Z

I think it'd be good to be consistent with naming and change the name of the file from bedrock.py to amazon_bedrock.py but also fine if you'd rather leave alone.

haystack_experimental/components/generators/chat/bedrock.py

sjrl · 2025-05-22T16:12:20Z

+            elif msg.tool_calls:
+                bedrock_formatted_messages.append(_format_tool_call_message(msg))
+            elif msg.tool_call_results:
+                bedrock_formatted_messages.append(_format_tool_result_message(msg))


I think this is out of scope for this PR, but do you happen to know if Bedrock supports images in the tool result message? I'm thinking again about the request for a Tool to be able to return an image.

anakin87 · 2025-05-23

Surprisingly yes: https://docs.aws.amazon.com/bedrock/latest/APIReference/API_runtime_ToolResultContentBlock.html

This field is only supported by Anthropic Claude 3 models.

~~This is a bit strange because Claude should not allow images in tool results according to their API.~~ I was wrong.

Anyway, I want to see if this works with the Converse API. I'll keep you posted.

#307 (comment)

sjrl

Looks good! Just a few minor comments.

anakin87 · 2025-05-23T06:36:15Z

> I think it'd be good to be consistent with naming and change the name of the file from bedrock.py to amazon_bedrock.py but also fine if you'd rather leave alone.

done

anakin87 · 2025-05-23T09:37:30Z

Converse API - Tool result with an image

import os
from haystack_integrations.common.amazon_bedrock.utils import get_aws_session


session = get_aws_session(
    aws_access_key_id=os.getenv("AWS_ACCESS_KEY_ID"),
    aws_secret_access_key=os.getenv("AWS_SECRET_ACCESS_KEY"),
    aws_session_token=os.getenv("AWS_SESSION_TOKEN"),
    aws_region_name="us-east-1",
    aws_profile_name=os.getenv("AWS_PROFILE"),
)

client = session.client("bedrock-runtime")

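# Read the test image as raw bytes; the context manager closes the file handle.
with open("test/test_files/images/apple.jpg", "rb") as image_file:
    image_bytes = image_file.read()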

messages = [
    {
        "role": "user",
        "content": [{"text": "Download the image at this url and describe it in max 5 words. URL: www.example.com/image.png"}]
    },
    {
        "role": "assistant",
        "content": [
            {"text": "I need to use the download tool."},
            {"toolUse": {"toolUseId": "tooluse_a2XtsIwsRse-gKI8YkFyfQ", "name": "download", "input": {"url": "www.example.com/image.png"}}}
        ]
    },
    {
        "role": "user",
        "content": [{
            "toolResult": {
                "toolUseId": "tooluse_a2XtsIwsRse-gKI8YkFyfQ",
                # apple.jpg is a JPEG file, so declare the matching format
                "content": [{"image": {"format": "jpeg", "source": {"bytes": image_bytes}}}]
            }
        }]
    }
]

toolConfig = {
    "tools": [{
        "toolSpec": {
            "name": "download",
            "description": "Download an image from a URL",
            "inputSchema": {"json": {"type": "object", "properties": {"url": {"type": "string"}}}}
        }
    }]
}

response = client.converse(
    modelId="anthropic.claude-3-5-sonnet-20240620-v1:0",
    messages=messages,
    toolConfig=toolConfig
)

print(response)
# {
#     'ResponseMetadata': {...},
#     'output': {
#         'message': {
#             'role': 'assistant',
#             'content': [{'text': "Here's a description of the image in 5 words:\n\nRipe apple on straw background."}]
#         }
#     },
#     'stopReason': 'end_turn',
#     'usage': {'inputTokens': 951, 'outputTokens': 27, 'totalTokens': 978},
#     'metrics': {'latencyMs': 1247}
# }

Even if the tool call is simulated, this works.
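
To read the model's answer out of a response shaped like the one printed above, something along these lines should do. It is only a sketch: `extract_text` is a hypothetical helper, and the `sample` dict is a shortened copy of that output rather than a new API call.

```python
from typing import Any, Dict


def extract_text(response: Dict[str, Any]) -> str:
    """Concatenate all text blocks from a Converse-style response dict."""
    blocks = response.get("output", {}).get("message", {}).get("content", [])
    # Skip non-text blocks (for example toolUse) and join the remaining text.
    return "".join(block["text"] for block in blocks if "text" in block)


# Shortened copy of the response shown above, used as sample input.
sample = {
    "output": {
        "message": {
            "role": "assistant",
            "content": [{"text": "Ripe apple on straw background."}],
        }
    },
    "stopReason": "end_turn",
}

print(extract_text(sample))
```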

Supporting this feature out of the box would mean changing our ToolCallResult dataclass to include ImageContent. I would postpone this until more model providers support this use case: OpenAI, Gemini, Ollama etc. do not allow it.

Basically, Anthropic/Bedrock allow this use case because the tool message is a user message.

I opened an issue to track this idea: deepset-ai/haystack#9432.
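
For discussion in that issue, a rough sketch of what a tool result carrying images might look like, and how it could map to the Converse `toolResult` block used in the experiment above. Everything here is hypothetical: `ImageContentSketch`, `ToolCallResultSketch` and `to_bedrock_tool_result` are illustrative names, not the existing Haystack dataclasses or a proposed final design.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional


# Hypothetical dataclasses for illustration only.
@dataclass
class ImageContentSketch:
    data: bytes
    image_format: str  # e.g. "png" or "jpeg"


@dataclass
class ToolCallResultSketch:
    tool_use_id: str
    text: Optional[str] = None
    images: List[ImageContentSketch] = field(default_factory=list)


def to_bedrock_tool_result(result: ToolCallResultSketch) -> Dict[str, Any]:
    """Wrap a tool result in a user message, as Bedrock/Anthropic expect."""
    content: List[Dict[str, Any]] = []
    if result.text is not None:
        content.append({"text": result.text})
    for image in result.images:
        content.append(
            {"image": {"format": image.image_format, "source": {"bytes": image.data}}}
        )
    return {
        "role": "user",
        "content": [
            {"toolResult": {"toolUseId": result.tool_use_id, "content": content}}
        ],
    }


tool_message = to_bedrock_tool_result(
    ToolCallResultSketch(
        tool_use_id="tooluse_a2XtsIwsRse-gKI8YkFyfQ",
        images=[ImageContentSketch(data=b"fake-bytes", image_format="png")],
    )
)
```

Because the result travels inside a user message, no new role is needed on the Bedrock side; the open question is how providers with a dedicated tool role would handle the same dataclass.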

anakin87 added 2 commits May 21, 2025 11:58

start Bedrock setup

d71a15f

fix unit test

b35c8e5

anakin87 added 6 commits May 22, 2025 08:05

try wo role

b0ecc40

try adding permissions

7d739ae

Trigger CI

546fb71

align tests with core integrations tests

dd43709

main implementation

4c25f81

fix monkeypatch + new tests

77388b4

anakin87 mentioned this pull request May 22, 2025

Bedrock multimodal experiment deepset-ai/haystack-core-integrations#1770

Closed

update pydoc config

d670042

anakin87 changed the title ~~feat: multimodal Bedrock [WIP]~~ feat: multimodal support in AmazonBedrockChatGenerator May 22, 2025

anakin87 marked this pull request as ready for review May 22, 2025 09:59

anakin87 requested a review from a team as a code owner May 22, 2025 09:59

anakin87 requested review from sjrl and removed request for a team May 22, 2025 09:59

sjrl reviewed May 22, 2025

sjrl reviewed May 22, 2025

sjrl reviewed May 22, 2025

sjrl approved these changes May 23, 2025

anakin87 and others added 4 commits May 23, 2025 08:25

Merge branch 'main' into bedrock-multimodal

3fe26c7

Update haystack_experimental/components/generators/chat/bedrock.py

e87f92c

Co-authored-by: Sebastian Husch Lee <10526848+sjrl@users.noreply.github.com>

Merge branch 'bedrock-multimodal' of https://github.com/deepset-ai/ha…

8e7b331

…ystack-experimental into bedrock-multimodal

move modules + comment on image formats

b1ed482

fix pydoc config

0083a35

anakin87 mentioned this pull request May 23, 2025

Investigate support for images in ToolCallResult deepset-ai/haystack#9432

Open

anakin87 merged commit f379e45 into main May 23, 2025
10 checks passed

anakin87 deleted the bedrock-multimodal branch May 23, 2025 10:33
