[core] When reconstruction is enabled, pin objects created by ray.put() #8021

Merged

Conversation

stephanie-wang
Contributor

Why are these changes needed?

When lineage reconstruction is enabled, Ray attempts to recreate, through task re-execution, objects that were stored in plasma and then lost due to a failure. However, this does not work if the task depends on an object created by ray.put, since the caller would also have to be restarted.
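The failure mode can be illustrated with a toy model. This is only a sketch, not Ray's implementation: `ToyStore` and its methods `put`, `submit`, `lose`, and `get` are hypothetical names invented for this example. It assumes that a task output records the function and arguments that produced it, while a put object records nothing.

```python
from typing import Any, Callable, Dict, Tuple


class ToyStore:
    """Hypothetical in-memory store with lineage tracking (not Ray's API)."""

    def __init__(self) -> None:
        self.values: Dict[str, Any] = {}
        # Only task outputs have lineage: object id -> (function, argument ids).
        self.lineage: Dict[str, Tuple[Callable[..., Any], Tuple[str, ...]]] = {}

    def put(self, oid: str, value: Any) -> None:
        # A put object has no task behind it, so no lineage is recorded.
        self.values[oid] = value

    def submit(self, oid: str, fn: Callable[..., Any], *arg_ids: str) -> None:
        # Record how the output was produced, then compute it.
        self.lineage[oid] = (fn, arg_ids)
        self.values[oid] = fn(*[self.get(a) for a in arg_ids])

    def lose(self, oid: str) -> None:
        # Simulate losing the stored value in a failure.
        self.values.pop(oid, None)

    def get(self, oid: str) -> Any:
        if oid in self.values:
            return self.values[oid]
        if oid not in self.lineage:
            raise KeyError(f"{oid} was lost and has no lineage to re-execute")
        # Reconstruct by re-executing the task, recursively fetching arguments.
        fn, arg_ids = self.lineage[oid]
        value = fn(*[self.get(a) for a in arg_ids])
        self.values[oid] = value
        return value


store = ToyStore()
store.put("x", 3)
store.submit("y", lambda v: v * 2, "x")

store.lose("y")
recovered = store.get("y")  # re-executes the task because "x" is still present

store.lose("y")
store.lose("x")
try:
    store.get("y")
    reconstruction_failed = False
except KeyError:
    reconstruction_failed = True  # "x" came from put, so it cannot be rebuilt
```

In this model, losing only the task output is recoverable, but losing the put argument as well leaves nothing to re-execute.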

This PR pins objects that are created by ray.put so that they are not evicted until none of the downstream tasks can be executed again. This guarantees that the object value will be available if lineage reconstruction is needed.
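The pinning rule can be sketched in the same toy style. Again, this is an illustration under stated assumptions, not the code in this PR: `PinningStore`, `add_dependent`, `remove_dependent`, and `evict` are hypothetical names, and the sketch assumes the store is told when a downstream task starts depending on a put object and when that task can no longer be re-executed.

```python
from typing import Any, Dict, Set


class PinningStore:
    """Hypothetical store that refuses to evict pinned put objects."""

    def __init__(self) -> None:
        self.values: Dict[str, Any] = {}
        # Object id -> ids of downstream tasks that may still be re-executed.
        self.dependents: Dict[str, Set[str]] = {}

    def put(self, oid: str, value: Any) -> None:
        self.values[oid] = value
        self.dependents[oid] = set()

    def add_dependent(self, oid: str, task_id: str) -> None:
        # A task that takes the object as an argument pins it.
        self.dependents[oid].add(task_id)

    def remove_dependent(self, oid: str, task_id: str) -> None:
        # Called once the task can no longer be executed again.
        self.dependents[oid].discard(task_id)

    def evict(self, oid: str) -> bool:
        # Eviction succeeds only when no re-executable task depends on the object.
        if self.dependents.get(oid):
            return False
        self.values.pop(oid, None)
        return True


store = PinningStore()
store.put("x", [1, 2, 3])
store.add_dependent("x", "task-1")

evicted_while_pinned = store.evict("x")   # refused: task-1 may be re-executed
store.remove_dependent("x", "task-1")
evicted_after_release = store.evict("x")  # allowed: nothing depends on "x"
```

The design choice illustrated here is that eviction is gated on the set of re-executable dependents rather than on memory pressure alone, which keeps the value available for as long as reconstruction could need it.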

Checks

  • I've run scripts/format.sh to lint the changes in this PR.
  • I've included any doc changes needed for https://docs.ray.io/en/latest/.
  • I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failure rates at https://ray-travis-tracker.herokuapp.com/.
  • Testing Strategy
    • Unit tests
    • Release tests
    • This PR is not tested (please justify below)

@stephanie-wang stephanie-wang requested a review from ericl April 14, 2020 20:35
@AmplabJenkins

Can one of the admins verify this patch?

@ericl ericl self-assigned this Apr 14, 2020
@AmplabJenkins

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/24703/
Test FAILed.

@AmplabJenkins

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/24709/
Test PASSed.

@AmplabJenkins

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/24818/
Test FAILed.

@AmplabJenkins

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/24836/
Test FAILed.

@AmplabJenkins

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/24969/
Test FAILed.

@stephanie-wang stephanie-wang merged commit 1323e17 into ray-project:master Apr 20, 2020
@stephanie-wang stephanie-wang deleted the pin-unreconstructable branch April 20, 2020 20:11
3 participants