-
Notifications
You must be signed in to change notification settings - Fork 6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[xray] Fix heartbeat subscription for autoscaler #2498
[xray] Fix heartbeat subscription for autoscaler #2498
Conversation
Test FAILed. |
38a88af
to
b178121
Compare
python/ray/monitor.py
Outdated
@@ -605,7 +598,7 @@ def run(self): | |||
self.subscribe(LOCAL_SCHEDULER_INFO_CHANNEL) | |||
self.subscribe(PLASMA_MANAGER_HEARTBEAT_CHANNEL) | |||
self.subscribe(DRIVER_DEATH_CHANNEL) | |||
self.subscribe(XRAY_HEARTBEAT_CHANNEL) | |||
self.subscribe(ray.gcs_utils.TablePubsub.HEARTBEAT, primary=False) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the only one where we need to subscribe to the shards, so I added primary=False
.
python/ray/monitor.py
Outdated
@@ -518,24 +519,16 @@ def process_messages(self, max_messages=10000): | |||
|
|||
# Determine the appropriate message handler. | |||
message_handler = None | |||
if not self.subscribed[channel]: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Somewhat unrelated to this PR, but I removed this whole self.subscribed
field. That was only there because I didn't know about the ignore_subscribe_messages=True
at the time.
@ericl I pushed some small changes, can you take a look? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good
Test FAILed. |
Test PASSed. |
Test PASSed. |
Test PASSed. |
What do these changes do?
Subscribe to all the redis shards instead of just the primary. Fix the error message to be nicer when we don't get heartbeats.
Tested in both modes.
Related issue number
#2485