Add `vllm serve` to wrap `vllm.entrypoints.openai.api_server` #4167

Conversation
@@ -26,6 +26,8 @@
```python
TIMEOUT_KEEP_ALIVE = 5  # seconds

engine: AsyncLLMEngine = None
```
Shouldn't this be `Optional`?
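For reference, a minimal sketch of what the suggested annotation would look like (assuming the usual `AsyncLLMEngine` import; this is not the exact diff from the PR):

```python
from typing import Optional

from vllm.engine.async_llm_engine import AsyncLLMEngine

TIMEOUT_KEEP_ALIVE = 5  # seconds

# The engine is only constructed at server startup, so the global stays None
# until then; Optional[...] makes that explicit for type checkers.
engine: Optional[AsyncLLMEngine] = None
```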
```python
if __name__ == "__main__":
    # NOTE(simon):
    # This section should be in sync with vllm/scripts.py for CLI entrypoints.
```
any way to add a simple regression test for this?
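One possible shape for such a regression test, sketched with a hypothetical test file and assertions (not part of this PR):

```python
# tests/entrypoints/test_cli_smoke.py (hypothetical location)
import subprocess
import sys


def test_vllm_serve_help():
    # `--help` exercises the `vllm serve` argument parser without loading a model.
    result = subprocess.run(["vllm", "serve", "--help"],
                            capture_output=True, text=True)
    assert result.returncode == 0
    assert "model" in result.stdout


def test_module_entrypoint_help():
    # The legacy module invocation should keep working alongside `vllm serve`.
    result = subprocess.run(
        [sys.executable, "-m", "vllm.entrypoints.openai.api_server", "--help"],
        capture_output=True, text=True)
    assert result.returncode == 0
```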
```python
    usage="vllm serve <model_tag> [options]")
make_arg_parser(serve_parser)
# Override the `--model` optional argument, make it positional.
serve_parser.add_argument("model", type=str, help="The model tag to serve")
```
What happens if someone runs `vllm serve --model ...`?
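A standalone `argparse` sketch of the situation being asked about (the real parser comes from `make_arg_parser`; this only reproduces the two overlapping `model` arguments, and the `--model` default is illustrative):

```python
import argparse

parser = argparse.ArgumentParser(prog="vllm serve",
                                 usage="vllm serve <model_tag> [options]")
# make_arg_parser already defines an optional --model ...
parser.add_argument("--model", type=str, default="facebook/opt-125m")
# ... and the PR adds a positional `model` on top of it.
parser.add_argument("model", type=str, help="The model tag to serve")

# `vllm serve my-model`         -> args.model == "my-model"
args = parser.parse_args(["my-model"])
print(args.model)

# `vllm serve --model my-model` -> argparse exits with an error, because the
# required positional `model` was never supplied.
```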
```python
serve_parser.set_defaults(func=run_server)

args = parser.parse_args()
if hasattr(args, "func"):
```
This part of the code is confusing. Could you add a comment explaining what it does?
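A self-contained sketch of the subcommand-dispatch pattern with the kind of comment being requested (`run_server` is a stub here, not vLLM's actual entrypoint):

```python
import argparse


def run_server(args: argparse.Namespace) -> None:
    print(f"would serve {args.model} ...")  # stub standing in for the real server


parser = argparse.ArgumentParser(prog="vllm")
subparsers = parser.add_subparsers()

serve_parser = subparsers.add_parser("serve",
                                     usage="vllm serve <model_tag> [options]")
serve_parser.add_argument("model", type=str, help="The model tag to serve")
# Each subcommand stores its handler on the parsed namespace via set_defaults.
serve_parser.set_defaults(func=run_server)

args = parser.parse_args()
# If a subcommand was chosen, argparse copied that subparser's `func` default
# onto `args`, so dispatch to it; with no subcommand there is no `func`
# attribute, so fall back to printing the top-level help.
if hasattr(args, "func"):
    args.func(args)
else:
    parser.print_help()
```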
````diff
@@ -95,8 +95,7 @@ template, or the template in string form. Without a chat template, the server will not work
 and all chat requests will error.

 ```bash
-python -m vllm.entrypoints.openai.api_server \
-  --model ... \
+vllm serve ... \
   --chat-template ./path-to-chat-template.jinja
 ```
````
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Based on #4709, the `:prog:` value under CLI args (line 111) should be updated to `vllm serve`.
Is there any update on this? Having the command simplifies installation with the wonderful pipx tool, which manages virtual environments automatically: `pipx install vllm`, and then `vllm serve --help` just works. In the current state you cannot use pipx this way.
Would be nice if #4794 is also made available via the CLI (perhaps …).
Please refer to #5090 for the complete new CLI.
Easier to type. It will now be `vllm serve ...`.