Amazon SageMaker AI Model Provider implementation #30


Closed

Conversation

@dgallitelli commented May 17, 2025

Description

Support for Amazon SageMaker AI endpoints as Model Provider

Related Issues

PR #16

Documentation PR

[Link to the associated PR in the agent-docs repo]

Type of Change

New feature

Testing

Yes

Checklist

  • I have read the CONTRIBUTING document
  • I have added tests that prove my fix is effective or my feature works
  • I have updated the documentation accordingly
  • I have added an appropriate example to the documentation to outline the feature
  • My changes generate no new warnings
  • Any dependent changes have been merged and published

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

@dgallitelli dgallitelli requested a review from a team as a code owner May 17, 2025 03:47
@dgallitelli dgallitelli requested a review from cagataycali May 19, 2025 07:57
@pgrayy (Member) commented May 20, 2025

Here is a PR for the OpenAI model provider. It also introduces a base class from which we can derive other OpenAI-compatible providers. If we ask that customers host their models on SageMaker with OpenAI compatibility, then we can reduce a lot of code duplication (see, for example, how LiteLLMModel was transformed in that PR).

@dgallitelli (Author)

> Here is a PR for the OpenAI model provider. It also introduces a base class from which we can derive other OpenAI-compatible providers. If we ask that customers host their models on SageMaker with OpenAI compatibility, then we can reduce a lot of code duplication (see, for example, how LiteLLMModel was transformed in that PR).

This is very interesting @pgrayy, thank you for bringing it to my attention. What do you think we should do here? Do you want to merge your OpenAI PR first, so I can build a class derived from OpenAIModel? Or should I just go ahead with the implementation as it is now?

@dgallitelli dgallitelli requested a review from pgrayy May 22, 2025 17:57
@dgallitelli (Author)

Updated implementation based on the suggestions in this PR. Please re-review and let me know if you need me to change more code. I've also run the tests and everything seems to work just fine :)

@pgrayy (Member) commented May 22, 2025

> Here is a PR for the OpenAI model provider. It also introduces a base class from which we can derive other OpenAI-compatible providers. If we ask that customers host their models on SageMaker with OpenAI compatibility, then we can reduce a lot of code duplication (see, for example, how LiteLLMModel was transformed in that PR).
>
> This is very interesting @pgrayy, thank you for bringing it to my attention. What do you think we should do here? Do you want to merge your OpenAI PR first, so I can build a class derived from OpenAIModel? Or should I just go ahead with the implementation as it is now?

We now have the OpenAI provider PR merged. I think it would be worth updating the SageMaker provider to use the new base class. The code should look similar to:

import logging
from typing import Any, Iterable, Optional, TypedDict, cast

import boto3
from botocore.config import Config as BotocoreConfig
from typing_extensions import Unpack, override

from strands.models.openai import OpenAIModel

logger = logging.getLogger(__name__)


class SageMakerModel(OpenAIModel):

    class SageMakerConfig(TypedDict, total=False):
        endpoint_name: str
        inference_component_name: Optional[str]
        model_id: str
        params: Optional[dict[str, Any]]

    def __init__(
        self,
        boto_session: Optional[boto3.Session] = None,
        boto_client_config: Optional[BotocoreConfig] = None,
        region_name: Optional[str] = None,
        **model_config: Unpack[SageMakerConfig],
    ) -> None:
        self.config = dict(model_config)

        logger.debug("config=<%s> | initializing", self.config)

        # Build the runtime client from the resolved boto session.
        boto_session = boto_session or boto3.Session(region_name=region_name)
        self.client = boto_session.client(
            service_name="sagemaker-runtime",
            config=boto_client_config,
        )

    @override
    def update_config(self, **model_config: Unpack[SageMakerConfig]) -> None:
        self.config.update(model_config)

    @override
    def get_config(self) -> SageMakerConfig:
        return cast(SageMakerModel.SageMakerConfig, self.config)

    @override
    def stream(self, request: dict[str, Any]) -> Iterable[dict[str, Any]]:
        # Yield events in the format that the base OpenAIModel.format_chunk expects.
        ...
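For illustration, usage of a provider like this with an agent might look like the following sketch; the endpoint name, model id, and region are placeholders, not values from this PR:

# Hypothetical usage sketch; placeholder names throughout.
from strands import Agent

model = SageMakerModel(
    endpoint_name="my-llm-endpoint",
    model_id="my-model",
    region_name="us-west-2",
)
agent = Agent(model=model)
agent("Hello, world!")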

@dgallitelli (Author)

> We now have the OpenAI provider PR merged. I think it would be worth updating the SageMaker provider to use the new base class.

Thanks @pgrayy. I've done some testing, and it does not seem straightforward to obtain a response from a SageMaker AI endpoint in the OpenAI completions format. I'd suggest we go ahead with the current implementation; I will work with the SageMaker AI service team to figure out the best approach to supporting the OpenAI Completions response format from the endpoint.
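To illustrate the gap, here is a sketch of the two response shapes; the SageMaker-side fields are assumptions and depend on the serving container, while the chat-completions shape is what OpenAI-compatible tooling expects:

# Illustrative shapes only; the SageMaker-side fields depend on the
# serving container and its configuration.
container_response = {"generated_text": "Hello!"}  # e.g. a TGI-style payload

openai_chat_response = {  # the chat-completions shape the base class expects
    "choices": [
        {
            "message": {"role": "assistant", "content": "Hello!"},
            "finish_reason": "stop",
        }
    ]
}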

@pgrayy (Member) commented May 23, 2025

> We now have the OpenAI provider PR merged. I think it would be worth updating the SageMaker provider to use the new base class.
>
> Thanks @pgrayy. I've done some testing, and it does not seem straightforward to obtain a response from a SageMaker AI endpoint in the OpenAI completions format. I'd suggest we go ahead with the current implementation; I will work with the SageMaker AI service team to figure out the best approach to supporting the OpenAI Completions response format from the endpoint.

Can you elaborate on this? SageMaker is a custom model hosting solution, so you should be able to conform to any format. What challenges are you seeing in implementing your handler to return OpenAI-compatible payloads? I can help do some testing on my end.

@dgallitelli (Author)

If you could test, that would be great! The problem is how the streamed response comes back from the DJL container.
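For reference, the streaming path with boto3 looks roughly like the sketch below: invoke_endpoint_with_response_stream returns an EventStream of PayloadPart events carrying raw bytes that the caller must buffer and parse. The request payload and chunk format here are assumptions; the actual format depends on the DJL container's configuration.

import json

import boto3

# Sketch: reassemble PayloadPart bytes into newline-delimited JSON chunks.
# Assumes one JSON object per line; real DJL output may differ.
client = boto3.client("sagemaker-runtime")
response = client.invoke_endpoint_with_response_stream(
    EndpointName="my-llm-endpoint",  # illustrative placeholder
    Body=json.dumps({"inputs": "Hello", "stream": True}),  # illustrative payload
    ContentType="application/json",
)

buffer = b""
for event in response["Body"]:
    buffer += event.get("PayloadPart", {}).get("Bytes", b"")
    while b"\n" in buffer:
        line, buffer = buffer.split(b"\n", 1)
        if line.strip():
            chunk = json.loads(line)  # container-specific chunk shape
            # map `chunk` to the events OpenAIModel.format_chunk expects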

@dgallitelli (Author)

Closed this pull request to work on a new implementation based on the OpenAI model provider. Refer to #176.

Development

Successfully merging this pull request may close these issues.

[FEATURE] Support for Amazon SageMaker AI endpoints as Model Provider