Update the limitation of multiple servers binding to the same http/grp… #4991
Conversation
Are we sure this is the best place for this documentation? I think users usually check our documentation (e.g. .MD files) rather than the command line arguments help screen. Especially since it's quite long. If we want users to know about and understand these options, it might make sense to also include them in documentation.
@dyastremsky I might be wrong but I couldn't find any documentation regarding anything about http/grpc port except in
src/main.cc
Outdated
@@ -401,7 +401,9 @@ std::vector<Option> options_
"The port for the server to listen on for HTTP requests."},
{OPTION_REUSE_HTTP_PORT, "reuse-http-port", Option::ArgBool,
"Allow multiple servers to listen on the same HTTP port when every "
"server has this option set."},
"server has this option set. The same set of models/same model "
"If you plan to use this option as a way to load-balance between different triton servers, the same model repository or set of models must be used for every server."
Note that this feature only supports stateless models.
I think it might be better to remove this sentence since the customers may figure out a way to control which server gets the requests and address this limitation.
I see. Updated the document, thanks for the comment!
I see, okay. Sounds like we're being consistent here then.
src/main.cc
Outdated
@@ -401,7 +401,9 @@ std::vector<Option> options_
"The port for the server to listen on for HTTP requests."},
{OPTION_REUSE_HTTP_PORT, "reuse-http-port", Option::ArgBool,
"Allow multiple servers to listen on the same HTTP port when every "
"server has this option set."},
"server has this option set. If you plan to use this option as a way to "
"load-balance between different triton servers, the same model "
Here and below:
- Triton should be capitalized.
- Load balance should not have a dash.
Thanks for the catch! Updated.
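For context on why the help text asks for the same set of models on every server, here is a minimal sketch of port sharing at the socket level. It assumes the option corresponds to the Linux `SO_REUSEPORT` socket option, which this thread does not confirm. `ListenWithReusePort` and `BoundPort` are hypothetical helpers written for illustration and are not taken from the Triton source. With `SO_REUSEPORT`, the kernel spreads incoming connections across all listeners on the port, so a client cannot choose which process answers.

```cpp
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>

#include <cstdint>

// Hypothetical helper (not Triton code): opens a TCP listener on the
// loopback interface with SO_REUSEPORT enabled. Pass 0 to let the kernel
// choose a free port. Returns the socket descriptor, or -1 on failure.
int ListenWithReusePort(uint16_t port)
{
  int fd = socket(AF_INET, SOCK_STREAM, 0);
  if (fd < 0) {
    return -1;
  }

  // Every socket that shares the port must set this option before bind().
  int enable = 1;
  if (setsockopt(fd, SOL_SOCKET, SO_REUSEPORT, &enable, sizeof(enable)) < 0) {
    close(fd);
    return -1;
  }

  sockaddr_in addr{};
  addr.sin_family = AF_INET;
  addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
  addr.sin_port = htons(port);

  if (bind(fd, reinterpret_cast<sockaddr*>(&addr), sizeof(addr)) < 0 ||
      listen(fd, SOMAXCONN) < 0) {
    close(fd);
    return -1;
  }
  return fd;
}

// Hypothetical helper: returns the local port of a bound socket,
// or 0 on failure.
uint16_t BoundPort(int fd)
{
  sockaddr_in addr{};
  socklen_t len = sizeof(addr);
  if (getsockname(fd, reinterpret_cast<sockaddr*>(&addr), &len) < 0) {
    return 0;
  }
  return ntohs(addr.sin_port);
}
```

Calling `ListenWithReusePort(0)`, reading the assigned port with `BoundPort`, and then calling `ListenWithReusePort` again with that port should give two listeners on the same port. If either socket skips the `setsockopt` call, the second `bind()` is expected to fail with `EADDRINUSE`, which matches the "when every server has this option set" wording in the help text.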