-
Notifications
You must be signed in to change notification settings - Fork 6.9k
[WIP][serve] Fail on the change of 'proxy_location' or 'http_options' parameters for the 'serve' API #56507
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
e1fb664
to
e12a4ac
Compare
/gemini review |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code Review
This pull request introduces valuable validation for immutable configuration parameters (proxy_location
and http_options
) within the serve deploy
command. By failing fast with a clear error message, it significantly improves the user experience over the previous behavior of silently ignoring changes. The implementation is straightforward, and the new logic is accompanied by a solid set of unit tests. I have a couple of suggestions to enhance the code's conciseness and test coverage.
/gemini review |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code Review
This pull request introduces validation to prevent runtime updates to immutable Serve configurations like proxy_location
and http_options
via the serve deploy
CLI. This is a great improvement as it provides explicit and fast feedback to the user. The new validation logic in _validate_deployment_config
is clear and correct. The accompanying tests are comprehensive, covering various scenarios including successful updates, and failures for both proxy_location
and http_options
changes. I've added a couple of minor suggestions to improve the readability of the error messages.
/gemini review |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code Review
This pull request introduces important validation for proxy_location
and http_options
in the serve deploy
command, preventing runtime changes to these immutable configurations. The implementation is solid, with a new validation function that fails fast and provides clear error messages. The changes are well-tested with both unit and integration-style tests. I have one suggestion to improve the test code's maintainability by marking helper methods as static.
/gemini review |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code Review
This pull request introduces an important validation step in the serve deploy
command to prevent runtime updates to immutable global configurations like proxy_location
and http_options
. By raising an exception, it provides clear, immediate feedback to the user, which is a significant improvement over the previous silent failure. The implementation is well-structured, and the accompanying tests are thorough. I have one suggestion to improve the consistency of the validation logic for proxy_location
to better align with the handling of http_options
, making the behavior more intuitive for users.
7f70e0c
to
f740837
Compare
/gemini review |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code Review
This pull request introduces important validation to the serve deploy
command, preventing users from attempting to update immutable global configurations like proxy_location
and http_options
at runtime. Instead of silently ignoring these changes, Serve will now raise a RayServeException
with a clear error message, which significantly improves the user experience and reduces confusion. The implementation is solid, with a new validation function that correctly compares the new configuration against the running one. The accompanying documentation update and comprehensive unit tests are also well-executed. I have a minor suggestion to improve consistency in an error message.
/gemini review |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code Review
This pull request introduces important validation to prevent runtime updates of immutable Serve configurations like proxy_location
and http_options
via serve deploy
. The changes provide clear, immediate feedback to the user by raising an exception, which is a significant improvement over the previous silent failure. The implementation is solid, and the addition of unit tests for the new validation logic is great. I've suggested a couple of improvements to the tests to ensure full coverage for the new http_options
validation at the command level. Overall, this is a valuable change that will improve user experience and prevent confusion.
/gemini review |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code Review
This pull request introduces an important validation step in the serve deploy
command. It prevents users from attempting to update immutable global configurations like proxy_location
and http_options
at runtime, which previously failed silently. Instead, it now raises a RayServeException
with a clear error message. The changes include a new validation function, its integration into the deployment workflow, and comprehensive unit tests. Additionally, the documentation has been updated to reflect this limitation. My review focuses on improving the implementation of the validation logic for better code consistency.
/gemini review |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code Review
This pull request introduces important validation to prevent runtime updates of global Serve configuration parameters like proxy_location
and http_options
via serve deploy
. The change from silently ignoring these updates to raising a RayServeException
is a significant improvement for user experience, providing clear and immediate feedback. The accompanying documentation update and comprehensive test suite are well-executed. I've included one minor suggestion to refactor the validation logic for better consistency and efficiency.
372981f
to
43e7c42
Compare
Hi @abrarsheikh! |
python/ray/serve/scripts.py
Outdated
submission_client = ServeSubmissionClient(address) | ||
serve_details = ServeInstanceDetails(**submission_client.get_serve_details()) | ||
_validate_deployment_config(serve_details, config) | ||
|
||
submission_client.deploy_applications( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if we make change at this layer, then still PUT API call will be ignoring the warning errors(this change works for serve deploy only), shouldn't we put this validation behind the API? as soon as we intercept the PUT request
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Now there are 2 ways of updating deployment. serve deploy cli command and PUT "/api/serve/applications/" REST API (both ways eventually go to serve_head) where warning is logged in case of changes in http_options
(which includes proxy_location
there). This warning is logged to /tmp/ray/session_latest/logs/dashboard_ServeHead.log
.
Do you think I need to replace that warning with error? Could you please let me know what would be the right place to add this check?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if we make change at this layer, then still PUT API call will be ignoring the warning errors(this change works for serve deploy only), shouldn't we put this validation behind the API? as soon as we intercept the PUT request
@harshit-anyscale , do you think I can add the check to the controller after this line to compare the deploying config
with the curr_config
from checkpoint ? Then, we'll fail for both cli and rest APIs. Will it be the proper place for this check?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @harshit-anyscale! I checked that and looks like I can introduce this check in 2 places:
ray/python/ray/dashboard/modules/serve/serve_head.py
. I can fail here instead of warning. Maybe it makes more sense, but it will be in thedashboard
module.python/ray/serve/_private/controller.py
or I can add the check here inapply_config
incontroller
as I mentioned in my previous comment.
I prefer the option 1 as it fails earlier, but not 100% sure if thedashboard
module is the proper place to control it.
Please let me know what you think.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
first option seems appropriate, as we are already performing other validations there
@abrarsheikh let us know if you think otherwise
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you, Harshit! I ended up changing warning to error in the serve._private.api
on the serve
side. I'll resolve conflicts, fix build, and ask for review.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @harshit-anyscale , @abrarsheikh !
It turns out that it's a bit trickier than I thought. There are 3 places where users can set/update http_options
now:
- Python API
- CLI
- REST API
Python API:
serve:
- start -> _private_api.serve_start
- run -> _private_api.serve_start
Command Line Interface (CLI)
serve start -> serve.start
serve deploy -> REST API PUT "/api/serve/applications/"
serve run -> _private_api.serve_start
Serve REST API
PUT "/api/serve/applications/" -> _private_api.serve_start_async
All of those methods end up in the _private_api.serve_start(s)
methods and I believe the _check_http_options(client, http_options)
function is the correct place to introduce this change.
The problem now is that, it's challenging to distinguish what http_options
were entered by users and what not (across of all of the APIs). I'll try to refactor the API a bit to be able to do that and verify the difference properly.
But while I was looking into it, I identified a bug and a discrepancy/bug which relate to proxy_location
parameter. It would be good to resolve them first to reduce the complexity of this change. I opened a separate PR #56993 for it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The fix of the bug and discrepancy mentioned in the previous comment is ready for review in the #57622.
/gemini review |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code Review
This pull request introduces important validation to prevent runtime updates of proxy_location
and http_options
via serve deploy
, which addresses a source of user confusion. The implementation is solid, with a new validation function that provides clear, actionable error messages. The accompanying tests are thorough, and the documentation update is a welcome addition. I have a few suggestions to enhance code readability and improve the clarity of the tests.
Hi @harshit-anyscale , I addressed comments and asked a question. Could you kindly take a look? |
85457ed
to
5edc8a3
Compare
… parameters for the 'serve deploy' command Signed-off-by: axreldable <aleksei.starikov.ax@gmail.com>
f684b13
to
e8f9525
Compare
…ameters for the 'serve' API Signed-off-by: axreldable <aleksei.starikov.ax@gmail.com>
e8f9525
to
3fa1944
Compare
/gemini review |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code Review
This pull request is a great improvement for user experience. It changes the behavior of Ray Serve to fail explicitly with a RayServeConfigException
when a user attempts to modify global configurations like proxy_location
or http_options
at runtime, instead of silently ignoring the changes. This prevents confusion and makes the system's behavior more predictable. The changes are well-implemented across the API, dashboard, and CLI, and are accompanied by corresponding documentation and test updates. My review includes a suggestion to further improve the readability of the new exception's error message and a minor cleanup in one of the new tests.
python/ray/serve/_private/api.py
Outdated
raise RayServeConfigException( | ||
"Attempt to update `http_options` or `proxy_location` has been detected! " | ||
f"Attempted updates: {diff_http_options}. " | ||
"HTTP config is global to your Ray cluster, and you can't update it during runtime. " | ||
"Please restart Ray Serve to apply the change." | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The current error message for RayServeConfigException
includes the repr()
of ProxyLocation
enum, which is a bit verbose (e.g., <ProxyLocation.EveryNode: 'EveryNode'>
). We can format the dictionary of differences to use the enum's string value for a cleaner and more readable error message. This will also require updating the corresponding test assertion in python/ray/serve/tests/test_cli.py
.
from enum import Enum
# Format the diff for a cleaner error message.
# Enums are converted to their string values.
formatted_diff = {}
for field, diff in diff_http_options.items():
prev = diff["previous"]
new = diff["new"]
formatted_diff[field] = {
"previous": prev.value if isinstance(prev, Enum) else prev,
"new": new.value if isinstance(new, Enum) else new,
}
raise RayServeConfigException(
"Attempt to update `http_options` or `proxy_location` has been detected! "
f"Attempted updates: {formatted_diff}. "
"HTTP config is global to your Ray cluster, and you can't update it during runtime. "
"Please restart Ray Serve to apply the change."
)
python/ray/serve/tests/test_cli.py
Outdated
assert ( | ||
"Attempted updates: {" | ||
"'host': {'previous': '0.0.0.0', 'new': '0.0.0.1'}, " | ||
"'port': {'previous': 8000, 'new': 8001}, " | ||
"'location': {'previous': <ProxyLocation.EveryNode: 'EveryNode'>, 'new': <ProxyLocation.HeadOnly: 'HeadOnly'>}}" | ||
) in error |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This assertion checks for the specific format of the error message. To accompany the suggested improvement to the error message format in _private/api.py
, this test assertion should be updated to match the new, more readable format that uses string values for enums.
assert ( | |
"Attempted updates: {" | |
"'host': {'previous': '0.0.0.0', 'new': '0.0.0.1'}, " | |
"'port': {'previous': 8000, 'new': 8001}, " | |
"'location': {'previous': <ProxyLocation.EveryNode: 'EveryNode'>, 'new': <ProxyLocation.HeadOnly: 'HeadOnly'>}}" | |
) in error | |
assert ( | |
"Attempted updates: {" | |
"'host': {'previous': '0.0.0.0', 'new': '0.0.0.1'}, " | |
"'port': {'previous': 8000, 'new': 8001}, " | |
"'location': {'previous': 'EveryNode', 'new': 'HeadOnly'}}" | |
) in error |
…meters for the 'serve' API Signed-off-by: axreldable <aleksei.starikov.ax@gmail.com>
…meters for the 'serve' API Signed-off-by: axreldable <aleksei.starikov.ax@gmail.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
CORE stamp.
Hi @jjyao ! Thank the approval. This PR is still WIP. Could you please clarify what |
Means core side change looks good and approved. |
Why are these changes needed?
Global Serve configuration parameters such as
proxy_location
andhttp_options
are not supposed to be updated via the Serve API:serve deploy
and REST APIPUT "/api/serve/applications/"
. When users try to update them, REST APIPUT "/api/serve/applications/"
ignores the change with the warning in the/tmp/ray/session_latest/logs/dashboard_ServeHead.log
file. The warning has the following format:While
serve deploy
silently ignores the changes, which may cause confusion for users.This PR introduces:
RayServeConfigException
in the_check_http_options
function inserve._private.api
validate_http_options
function in thedashboard.modules.serve.serve_head
as it's not needed anymoreserve.api
of using wrong default value forDeploymentMode
in case of emptyproxy_location
and non emptyhttp_options
(old implementation fell back toHeadOnly
instead of the current default -EveryNode
)User scenario:
hello_world.py
:Output before the change:
Output after the change:
The same behavior for serve REST API:
A thing worth to mention here is this change makes explicit the discrepancy between default
host
value in serve.config.HTTPOptions (host="127.0.0.1") vs serve.schema.HTTPOptionsSchema (host="0.0.0.0").serve.config.HTTPOptions
is primarily used in servestart
CLI and Python APIs andserve.schema.HTTPOptionsSchema
is used indeploy/run/build
APIs.Previously, when users use default
start
and thendeploy
orrun
- http_options from start were used.Now we explicitly failing in this scenario:
This behavior can be annoying for users, but now it's explicit. Maybe we can consider to align default values in
HTTPOptions
andHTTPOptionsSchema
.Related issue number
Closes #56163
Checks
git commit -s
) in this PR.scripts/format.sh
to lint the changes in this PR.method in Tune, I've added it in
doc/source/tune/api/
under thecorresponding
.rst
file.Note
Disallow runtime updates to global
http_options
/proxy_location
by raisingRayServeConfigException
, fix default proxy location handling, refactor validation, and update docs/tests/CLI accordingly.RayServeConfigException
on attempts to change globalhttp_options
/proxy_location
via_check_http_options
(now compares againstclient.http_config
, normalizesProxyLocation
/DeploymentMode
)._prepare_http_options
setsDeploymentMode.EveryNode
whenproxy_location=None
;start()
uses it.serve_start
/serve_start_async
to passclient.http_config
into_check_http_options
.RayServeConfigException
inray.serve.exceptions
.validate_http_options
warning logic fromserve_head.py
; rely on backend check._prepare_http_options
andserve.start
rejection on changed HTTP config; add CLI test verifying detailed diff in error.proxy_location
and HTTP/gRPC configs are cluster-global and cannot be updated at runtime.Written by Cursor Bugbot for commit 1f690a6. This will update automatically on new commits. Configure here.