
Python: Bug: Cannot stream with Claude 3.7 Sonnet #10941

Open
@philippHorn

Description

Describe the bug
To use the latest Claude model, I need to pass an inference profile as the model ID: https://docs.aws.amazon.com/bedrock/latest/userguide/inference-profiles-use.html

This works with the converse operation, but does not seem to work with https://docs.aws.amazon.com/bedrock/latest/APIReference/API_GetFoundationModel.html

So right now, if I use an inference profile ID as the model ID and do streaming, I get this error:

botocore.errorfactory.ValidationException: An error occurred (ValidationException) when calling the GetFoundationModel operation: The provided model identifier is invalid.

But if I use the plain foundation model ID instead, I get this:

botocore.errorfactory.ValidationException: An error occurred (ValidationException) when calling the Converse operation: Invocation of model ID anthropic.claude-3-7-sonnet-20250219-v1:0 with on-demand throughput isn't supported. Retry your request with the ID or ARN of an inference profile that contains this model.

It seems like the latest Claude model only supports inference profiles.
As a result, I see no way to run the latest Sonnet model with streaming. Am I missing something?
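For context, the Bedrock runtime's converse_stream operation accepts an inference profile ID in place of a foundation model ID, so streaming itself should work once the GetFoundationModel lookup is out of the picture. A minimal sketch of the request shape (the build_converse_stream_kwargs helper is my own naming, and the actual API call is commented out because it needs AWS credentials):

```python
INFERENCE_PROFILE_ID = "us.anthropic.claude-3-7-sonnet-20250219-v1:0"


def build_converse_stream_kwargs(model_id: str, text: str) -> dict:
    # Request shape for bedrock-runtime's converse_stream operation.
    return {
        "modelId": model_id,  # an inference profile ID is accepted here
        "messages": [{"role": "user", "content": [{"text": text}]}],
        "inferenceConfig": {"temperature": 0.2},
    }


kwargs = build_converse_stream_kwargs(INFERENCE_PROFILE_ID, "hi")
# runtime = boto3.client("bedrock-runtime", region_name="us-east-1")
# for event in runtime.converse_stream(**kwargs)["stream"]:
#     print(event)
```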

To Reproduce
I can reproduce the issue with the script below.
Swapping MODEL_ID and INFERENCE_PROFILE_ID toggles between the two errors posted above.

import asyncio

import boto3
from django.conf import settings
from semantic_kernel.connectors.ai.anthropic import (
    AnthropicChatPromptExecutionSettings,
)
from semantic_kernel.connectors.ai.bedrock import BedrockChatCompletion
from semantic_kernel.contents import (
    ChatHistory,
    ChatMessageContent,
    AuthorRole,
    TextContent,
)

AWS_AI_REGION = "us-east-1"
MODEL_ID = "anthropic.claude-3-7-sonnet-20250219-v1:0"
INFERENCE_PROFILE_ID = "us.anthropic.claude-3-7-sonnet-20250219-v1:0"

bedrock_client = boto3.client(
    "bedrock",
    region_name=AWS_AI_REGION,
    aws_access_key_id=settings.AWS_ACCESS_KEY_ID,
    aws_secret_access_key=settings.AWS_SECRET_ACCESS_KEY,
)
bedrock_runtime_client = boto3.client(
    "bedrock-runtime",
    region_name=AWS_AI_REGION,
    aws_access_key_id=settings.AWS_ACCESS_KEY_ID,
    aws_secret_access_key=settings.AWS_SECRET_ACCESS_KEY,
)


async def main() -> None:
    sk_client = BedrockChatCompletion(
        model_id=INFERENCE_PROFILE_ID,
        client=bedrock_client,
        runtime_client=bedrock_runtime_client,
    )
    llm_settings = AnthropicChatPromptExecutionSettings(
        temperature=0.2,
    )
    history = ChatHistory(
        messages=[
            ChatMessageContent(role=AuthorRole.USER, items=[TextContent(text="hi")])
        ]
    )
    async for item in sk_client.get_streaming_chat_message_contents(
        history, llm_settings
    ):
        print(item)


if __name__ == "__main__":
    asyncio.run(main())
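The failure apparently comes from the connector calling GetFoundationModel (which only accepts foundation model IDs) before streaming. A hypothetical helper (not part of semantic-kernel), assuming cross-region inference profile IDs always carry a region prefix, could detect them so the connector skips that lookup:

```python
# Region prefixes used by Bedrock cross-region inference profiles
# (assumed list; may not be exhaustive).
_PROFILE_PREFIXES = ("us.", "eu.", "apac.")


def is_inference_profile_id(model_id: str) -> bool:
    # Foundation model IDs start with the provider name (e.g. "anthropic."),
    # while cross-region inference profile IDs prepend a region prefix.
    return model_id.startswith(_PROFILE_PREFIXES)


print(is_inference_profile_id("us.anthropic.claude-3-7-sonnet-20250219-v1:0"))  # True
print(is_inference_profile_id("anthropic.claude-3-7-sonnet-20250219-v1:0"))  # False
```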

Expected behavior
The script should stream messages.

Platform

  • Language: Python
  • Source: semantic-kernel==1.24.0
  • AI model: Bedrock: anthropic.claude-3-7-sonnet-20250219-v1:0
  • OS: Mac

Note
My understanding of AWS is not that deep; I hope what I wrote above is correct and makes sense.

Labels: bug (Something isn't working), python (Pull requests for the Python Semantic Kernel)
