[docs] add benchmarking guide for lmi #1606

Merged 1 commit into deepjavalibrary:master from dg-benchmark on Mar 8, 2024

siddvenk commented Mar 7, 2024

Description

Adds benchmarking guide for LMI using awscurl

siddvenk requested review from zachgk, frankfliu and a team as code owners March 7, 2024 20:11
To run a benchmark with 10 concurrent clients, each issuing 30 requests, with a single common payload, we can run:

```shell
TOKENIZER=<tokenizer_id> ./awscurl -c 10 -N 30 -X POST -n sagemaker <your full sagemaker endpoint> (e.g. https://runtime.sagemaker.us-east-1.amazonaws.com/endpoints/mms-demo/invocations) \
```
Contributor

It would be better to add an example for benchmarking time to first byte using the invocations-response-stream endpoint on SageMaker.

Contributor Author

Will add that example

Contributor Author

Added a section on streaming usage (this depends on a new awscurl being published to S3 to fix the issue).
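
For readers of this thread, here is a minimal sketch of how the streaming variant of the benchmark command might be put together. It is not the text that was merged. It only builds the `invocations-response-stream` URL and prints the resulting command as a dry run, so nothing is sent to an endpoint. `REGION`, `ENDPOINT_NAME`, the `streaming_url` helper, and the payload are hypothetical placeholders. The `-c`, `-N`, `-X` and `-n` options are taken from the example under review, while `-H` and `-d` are assumed to behave like their curl counterparts.

```shell
#!/bin/sh
# Sketch only: build the SageMaker streaming URL and print (not run) an awscurl command.
# REGION, ENDPOINT_NAME and the payload below are hypothetical placeholders.
REGION="${REGION:-us-east-1}"
ENDPOINT_NAME="${ENDPOINT_NAME:-mms-demo}"

# Same host and path layout as the non-streaming example in the guide,
# with the streaming route in place of /invocations.
streaming_url() {
  printf 'https://runtime.sagemaker.%s.amazonaws.com/endpoints/%s/invocations-response-stream\n' "$1" "$2"
}

URL="$(streaming_url "$REGION" "$ENDPOINT_NAME")"

# Dry run: echo the command so it can be reviewed before it is executed.
# -c/-N/-X/-n come from the guide's example; -H and -d are assumed to work as in curl.
echo "TOKENIZER=<tokenizer_id> ./awscurl -c 10 -N 30 -X POST -n sagemaker $URL -H 'Content-Type: application/json' -d '{\"inputs\": \"<prompt>\"}'"
```

With 10 concurrent clients each issuing 30 requests, a run like this would send 300 requests in total. Any options for reporting time to first byte are left out here, since they depend on the awscurl version mentioned above.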

siddvenk merged commit f244d8a into deepjavalibrary:master Mar 8, 2024
2 checks passed
siddvenk deleted the dg-benchmark branch March 8, 2024 19:46