Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[xray] Fix heartbeat subscription for autoscaler #2498

Merged
merged 20 commits into from
Jul 28, 2018

Conversation

ericl
Copy link
Contributor

@ericl ericl commented Jul 27, 2018

What do these changes do?

Subscribe to all the redis shards instead of just the primary. Fix the error message to be nicer when we don't get heartbeats.

Tested in both modes.

Related issue number

#2485

@ericl ericl changed the title [xray] Fix heartbeat subscription for autoscaler [WIP] [xray] Fix heartbeat subscription for autoscaler Jul 27, 2018
@ericl ericl changed the title [WIP] [xray] Fix heartbeat subscription for autoscaler [xray] Fix heartbeat subscription for autoscaler Jul 27, 2018
@AmplabJenkins
Copy link

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/6923/
Test FAILed.

@@ -605,7 +598,7 @@ def run(self):
self.subscribe(LOCAL_SCHEDULER_INFO_CHANNEL)
self.subscribe(PLASMA_MANAGER_HEARTBEAT_CHANNEL)
self.subscribe(DRIVER_DEATH_CHANNEL)
self.subscribe(XRAY_HEARTBEAT_CHANNEL)
self.subscribe(ray.gcs_utils.TablePubsub.HEARTBEAT, primary=False)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the only one where we need to subscribe to the shards, so I added primary=False.

@@ -518,24 +519,16 @@ def process_messages(self, max_messages=10000):

# Determine the appropriate message handler.
message_handler = None
if not self.subscribed[channel]:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Somewhat unrelated to this PR, but I removed this whole self.subscribed field. That was only there because I didn't know about the ignore_subscribe_messages=True at the time.

@robertnishihara
Copy link
Collaborator

@ericl I pushed some small changes, can you take a look?

Copy link
Contributor Author

@ericl ericl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good

@AmplabJenkins
Copy link

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/6944/
Test FAILed.

@AmplabJenkins
Copy link

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/6941/
Test PASSed.

@AmplabJenkins
Copy link

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/6942/
Test PASSed.

@AmplabJenkins
Copy link

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/6957/
Test PASSed.

@robertnishihara robertnishihara merged commit 90a3ea9 into ray-project:master Jul 28, 2018
@robertnishihara robertnishihara deleted the fix-autoscaling-pr branch July 28, 2018 20:35
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants