-
Notifications
You must be signed in to change notification settings - Fork 6.8k
[xray] [autoscaler] Fix autoscaler / raylet integration #2143
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Test FAILed. |
Test PASSed. |
Tested this again on latest and it seems to work. |
Test PASSed. |
Test PASSed. |
Test PASSed. |
python/ray/monitor.py
Outdated
return | ||
|
||
clients = self.state.client_table() | ||
if type(clients) is list: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
note the line
if self.use_raylet:
return
which I added recently to avoid running this method for xray. We probably want to remove that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, didn't see that. Removed this.
python/ray/experimental/state.py
Outdated
for client in client_list: | ||
if (client["ClientType"] == "local_scheduler" | ||
and not client["Deleted"]): | ||
if self.use_raylet: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we want to keep local_schedulers()
as part of the API? It seems somewhat redundant with client_table()
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, it seems like a nice helper method to have. I suppose after all the old code is deleted this will all look nicer.
Test PASSed. |
Test PASSed. |
Test PASSed. |
Test PASSed. |
ping |
Hopefully, yapf 0.22.0 produces the same output as travis |
Test FAILed. |
python/ray/experimental/state.py
Outdated
if (client["ClientType"] == "local_scheduler" | ||
and not client["Deleted"]): | ||
if self.use_raylet: | ||
for client in clients: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This will probably not do the right thing in the case where machines are removed from the cluster (because those machines will then appear multiple times in clients
, I think.
However, given the way that the autoscaler uses this, it probably won't matter for the autoscaler.
python/ray/experimental/state.py
Outdated
and not client["Deleted"]): | ||
if self.use_raylet: | ||
for client in clients: | ||
if "Resources" in client: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think every entry of clients
is a raylet, so this if
statement is unnecessary. @stephanie-wang is that correct?
Test PASSed. |
@robertnishihara removed the call to local_schedulers() now. Retested and it seems to work. |
Based on: #2125
What do these changes do?
This adds a handler in
monitor.py
to handle raylet heartbeats. These provide the resource load stats needed to run the autoscaler.Tested with the following config: