[benchmarks]allow skip ready check for bench serve #25420
@luccafong has exported this pull request. If you are a Meta employee, you can view the originating diff in D82995002.
Code Review
This pull request introduces a --skip-ready-check flag for the bench serve command, allowing users to bypass the initial endpoint readiness check. The implementation is clear and correctly adds the new command-line argument and conditional logic. I've added one comment regarding user-facing log messages to improve clarity when the check is skipped. Additionally, while the change is functionally correct, it would benefit from corresponding test cases to verify the new flag's behavior and prevent future regressions.
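The review above describes a new command-line argument plus conditional logic around the readiness check. A minimal sketch of that pattern follows. It is not the vLLM implementation: `build_parser`, `run_benchmark`, and the `ready_check` / `send_requests` callables are hypothetical names used only for illustration, and only the `--skip-ready-check` flag name comes from this PR.

```python
import argparse
from typing import Callable


def build_parser() -> argparse.ArgumentParser:
    # Hypothetical parser; only the flag name is taken from the PR.
    parser = argparse.ArgumentParser(prog="bench-serve-sketch")
    parser.add_argument(
        "--skip-ready-check",
        action="store_true",
        help="Skip the initial endpoint readiness check.",
    )
    return parser


def run_benchmark(
    args: argparse.Namespace,
    ready_check: Callable[[], None],
    send_requests: Callable[[], str],
) -> str:
    # Tell the user explicitly when the check is bypassed, as the
    # review suggests, instead of silently moving on.
    if args.skip_ready_check:
        print("Skipping endpoint ready check.")
    else:
        ready_check()
    return send_requests()
```

Passing the check and the workload in as callables keeps the sketch testable without a running server, which is also one way the requested test cases could exercise both branches.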
4ddee01 to 9f492c3
Can we achieve the same purpose by setting `--ready-check-timeout-sec=0`?
I think it will raise the error directly.
4a65536 to a81ce95
Summary: Allow skip ready check for bench serve through `--skip-ready-check`
Test Plan: `vllm bench serve --skip-ready-check`
Differential Revision: D82995002
Signed-off-by: Lu Fang <fanglu@fb.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com> Signed-off-by: Lucia Fang <116399278+luccafong@users.noreply.github.com> Signed-off-by: Lu Fang <fanglu@fb.com>
a81ce95 to b5a1ed2
Signed-off-by: Lu Fang <fanglu@fb.com>
b5a1ed2 to f52445d
What's the use case for this? I actually found the `wait_for_endpoint` util pretty handy (now I don't need to fire the `vllm serve` and `vllm bench serve` commands sequentially).
When the workload is huge and we don't want to wait for one request to complete.
Ah ok - then I think it's probably better to modify … I also don't have a strong opinion on this current PR, so I'm going to approve it.
agree that pinging …
@minosfuture Feel free to make a follow-up PR!
Signed-off-by: Lu Fang <fanglu@fb.com> Signed-off-by: Lucia Fang <116399278+luccafong@users.noreply.github.com> Co-authored-by: Lucia (Lu) Fang <fanglu@meta.com>
Signed-off-by: Lu Fang <fanglu@fb.com> Signed-off-by: Lucia Fang <116399278+luccafong@users.noreply.github.com> Co-authored-by: Lucia (Lu) Fang <fanglu@meta.com> Signed-off-by: charlifu <charlifu@amd.com>
Signed-off-by: Lu Fang <fanglu@fb.com> Signed-off-by: Lucia Fang <116399278+luccafong@users.noreply.github.com> Co-authored-by: Lucia (Lu) Fang <fanglu@meta.com> Signed-off-by: yewentao256 <zhyanwentao@126.com>
Signed-off-by: Lu Fang <fanglu@fb.com> Signed-off-by: Lucia Fang <116399278+luccafong@users.noreply.github.com> Co-authored-by: Lucia (Lu) Fang <fanglu@meta.com> Signed-off-by: gaojc <1055866782@qq.com>
Signed-off-by: Lu Fang <fanglu@fb.com> Signed-off-by: Lucia Fang <116399278+luccafong@users.noreply.github.com> Co-authored-by: Lucia (Lu) Fang <fanglu@meta.com> Signed-off-by: xuebwang-amd <xuebwang@amd.com>
Summary: Allow skipping the ready check for bench serve through `--ready-check-timeout-sec 0`
Test Plan: `vllm bench serve --ready-check-timeout-sec 0`
Differential Revision: D82995002
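The final summary reuses the existing timeout option, with a value of 0 meaning "skip the check", rather than adding a separate flag. A minimal sketch of that behavior follows, assuming a simple polling loop. `wait_until_ready`, `probe`, and `interval_sec` are hypothetical names, and the actual vLLM ready-check code is not shown in this conversation.

```python
import time
from typing import Callable


def wait_until_ready(
    probe: Callable[[], bool],
    timeout_sec: float,
    interval_sec: float = 0.01,
) -> bool:
    """Poll `probe` until it succeeds or the timeout expires.

    Returns True when the probe succeeded and False when the check
    was skipped. Raises TimeoutError if the deadline passes first.
    """
    # Assumption: a non-positive timeout means "skip", not "fail at once".
    if timeout_sec <= 0:
        return False
    deadline = time.monotonic() + timeout_sec
    while time.monotonic() < deadline:
        if probe():
            return True
        time.sleep(interval_sec)
    raise TimeoutError(f"Endpoint not ready after {timeout_sec} seconds")
```

Handling the zero case before the loop addresses the concern raised earlier in the thread: without that early return, a timeout of 0 would fall straight through to the error.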