Add script for benchmarking serving throughput #145
Merged
Changes from 1 commit
43 commits
473c5b8 Minor fix (WoosukKwon)
a644a9b Minor (WoosukKwon)
67ed51c Minor (WoosukKwon)
83acd5e Minor (WoosukKwon)
4957281 Add log-requests option to AsyncLLMServer (WoosukKwon)
c6b38d2 [WIP] Add benchmark_serving.py (WoosukKwon)
5210de0 Minor (WoosukKwon)
d4df348 Delete unused files (WoosukKwon)
fab12d6 Minor (WoosukKwon)
3ddadf4 Add docstring (WoosukKwon)
4269b11 Bugfix (WoosukKwon)
af8974d Minor (WoosukKwon)
f8dee6e Minor (WoosukKwon)
d181f10 Add script to launch HF server (WoosukKwon)
fc02a02 Add HF backend (WoosukKwon)
99d9ce3 Minor (WoosukKwon)
bc9ec63 Bugfix (WoosukKwon)
9477f2f Filter out long prompts (WoosukKwon)
51a5332 Minor fix (WoosukKwon)
6b0d77b Merge branch 'main' into benchmark-llama (WoosukKwon)
00d158d Repeat failed requests (WoosukKwon)
0c55c40 Stream=False (WoosukKwon)
bcb8e16 Minor (WoosukKwon)
6a7baaa Prune short sequences (WoosukKwon)
071b4aa Add 1 hour timeout (WoosukKwon)
983cf97 Increase timeout (WoosukKwon)
b55b1ee Add shortcut (WoosukKwon)
c45a2dd Simplify (WoosukKwon)
66f8c60 Merge branch 'opt' into benchmark-llama (WoosukKwon)
a1b513e n -> best_of (WoosukKwon)
72d6a63 Minor (WoosukKwon)
44bc461 Add latency stats (WoosukKwon)
6990fc5 Increase max_best_of in HF server (WoosukKwon)
2c610bd Merge branch 'main' into benchmark-llama (WoosukKwon)
5687f10 hf -> tgi (WoosukKwon)
672fbbd Add HF backend (WoosukKwon)
60bccc4 Fix batching (WoosukKwon)
b7fcade Fix a bug & Add tqdm (WoosukKwon)
6accbfd Minor (WoosukKwon)
c7360d1 Fix (WoosukKwon)
bf1bae6 Comment (WoosukKwon)
7bebe29 Add docstring (WoosukKwon)
5c1b852 Comment (WoosukKwon)
Add docstring
commit 3ddadf47d81091af636a57db1dc06a4d6d4bf57c
Does this argument actually ask for `model_name`?
Yes, is it confusing?