Add multinode tests by simulating multiple nodes using Docker. #378
Merged build finished. Test FAILed.
Merged build finished. Test FAILed.
Merged build finished. Test PASSed.
Merged build finished. Test PASSed.
Merged build finished. Test FAILed.
Merged build finished. Test PASSed.
Merged build finished. Test FAILed.
After doing
The |
After this PR is merged, all PRs should be required to pass the Travis tests as well as the Jenkins tests (the multi-node Docker tests in this PR run only on Jenkins).
The script |
Merged build finished. Test PASSed.
d.start_ray(mem_size=args.mem_size, shm_size=args.shm_size,
            num_nodes=args.num_nodes, docker_image=args.docker_image,
            development_mode=args.development_mode)
time.sleep(2)
Any ideas on a better way to know that Ray has started?
Good point. Now that I think about it, it should be possible to just go ahead and run the test script even if Ray hasn't started: if Ray isn't up yet, the call to ray.init should simply retry as needed.
I just tried it out. It works as I described, with the caveat that ray.init only retries for a few seconds and then raises an exception.
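For reference, one way to replace a fixed `time.sleep(2)` with an explicit wait is to retry the connection attempt until a deadline. The sketch below is not code from this PR: `retry_until_ready`, `timeout_s`, and `interval_s` are hypothetical names, and `connect` stands in for whatever callable attempts the connection (for example, a wrapper around `ray.init`).

```python
import time


def retry_until_ready(connect, timeout_s=30.0, interval_s=0.5):
    """Call connect() until it succeeds or timeout_s seconds have elapsed.

    Hypothetical helper: `connect` is any zero-argument callable that raises
    while the cluster is not reachable yet and returns normally once it is.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        try:
            return connect()
        except Exception:
            # Re-raise the last error once the deadline has passed.
            if time.monotonic() >= deadline:
                raise
            time.sleep(interval_s)
```

This makes the timeout explicit in the test driver, so the wait no longer depends on how long the connection call itself is willing to retry.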
Merged build finished. Test PASSed.
Merged build finished. Test PASSed.
For testing we want to be able to simulate a Ray cluster using a number of Docker instances running on a single host. This change adds scripts to boot a cluster matching a specific configuration and to run a test script on that cluster.
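As a rough illustration of the idea (not the scripts added by this PR), simulating several nodes on one host can come down to launching one container per node with the same image and resource limits. The sketch below only builds the `docker run` argument lists. The function names and the `ray-node-N` container names are assumptions, and the command that starts Ray inside each container is left out because it is not shown here. `--memory` and `--shm-size` are standard `docker run` options.

```python
def docker_run_args(name, image, mem_size=None, shm_size=None, command=()):
    """Build the argv for one detached container (hypothetical helper)."""
    args = ["docker", "run", "-d", "--name", name]
    if mem_size:
        args.append("--memory=" + mem_size)
    if shm_size:
        args.append("--shm-size=" + shm_size)
    args.append(image)
    # Optional command to run in the container; empty means the image default.
    args.extend(command)
    return args


def cluster_commands(num_nodes, image, mem_size=None, shm_size=None):
    """Return one `docker run` argv per simulated node (names are assumed)."""
    return [
        docker_run_args("ray-node-%d" % i, image, mem_size, shm_size)
        for i in range(num_nodes)
    ]
```

Each returned list can be passed to `subprocess.check_call` to actually start the container.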
Other requirements: