Contributor

@ericl ericl commented Aug 18, 2018

What do these changes do?

This adds experimental (undocumented) support for launching Ray on existing nodes. You have to provide the head IP and the list of worker IPs.
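
For illustration only, here is a minimal sketch of validating those two inputs before a launch. The key names `head_ip` and `worker_ips`, the dict layout, and the helper `validate_existing_nodes` are assumptions made for this example; the description above does not spell out the config format.

```python
import ipaddress


def validate_existing_nodes(provider):
    """Return (head_ip, worker_ips) from a hypothetical provider dict.

    Raises ValueError if the head IP is missing or any address is malformed.
    """
    head_ip = provider.get("head_ip")
    worker_ips = list(provider.get("worker_ips", []))
    if not head_ip:
        raise ValueError("head_ip is required")
    for ip in [head_ip] + worker_ips:
        # ipaddress.ip_address raises ValueError on malformed input.
        ipaddress.ip_address(ip)
    return head_ip, worker_ips
```

A call such as `validate_existing_nodes({"head_ip": "10.0.0.1", "worker_ips": ["10.0.0.2"]})` would hand back the head address and the worker list unchanged.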

There are also a couple of additional utilities for rsyncing files and for port forwarding.

@richardliaw richardliaw mentioned this pull request Aug 18, 2018
1 task
@ericl
Contributor Author

ericl commented Aug 18, 2018

cc @hartikainen

@AmplabJenkins
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/7575/

@AmplabJenkins
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/7576/

@AmplabJenkins
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/7582/

Contributor

@richardliaw richardliaw left a comment

Overall looks good, would be nice to have some documentation.

Also, are there going to be tests for this?

    required=False,
    type=str,
    help=("Override the configured cluster name."))
def rsync_down(cluster_config_file, source, target, cluster_name):
Contributor

Can you add some descriptors for these functions?

Contributor Author

That's not currently done in this file. Do those show up in ray --help?

@@ -89,11 +89,21 @@ You can use ``ray attach`` to attach to an interactive console on the cluster.
Port-forwarding applications
----------------------------

-To connect to applications running on the cluster (e.g. Jupyter notebook) using a web browser, you can forward the port to your local machine using SSH:
+To connect to applications running on the cluster (e.g. Jupyter notebook) using a web browser, you can use the port-forward option for ``ray exec``:
Contributor

Can you leave a note that the port opened on the local machine is the same as the port forwarded on the remote machine?

Contributor Author

Done.
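
To make the same-port behaviour concrete, here is a small sketch that builds a plain `ssh` local-forward command in which the local and remote ports are identical. It is not the code from this PR; the helper name `ssh_port_forward_args`, the example host, and the example user are hypothetical. Only the standard OpenSSH `-L` and `-N` options are relied on.

```python
def ssh_port_forward_args(host, port, user=None):
    """Build an ssh argument list that forwards local `port` to the same port on `host`."""
    port = int(port)
    if not 0 < port < 65536:
        raise ValueError("port must be in the range 1-65535")
    target = f"{user}@{host}" if user else host
    # -N: do not run a remote command, only forward.
    # -L local_port:localhost:remote_port: here both ports are the same number.
    return ["ssh", "-N", "-L", f"{port}:localhost:{port}", target]
```

With this shape, an application listening on port 8899 on the remote machine would be reachable at `localhost:8899` on the local machine while the command runs.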

head_setup_commands: []
worker_setup_commands: []
setup_commands:
- source activate ray && test -e ray || git clone https://github.com/YOUR_GITHUB/ray.git
Contributor

Why not just a pip install?

Collaborator

These setup commands don't seem to actually install ray, do they?

Contributor Author

This example is for dev only for now...

from ray.autoscaler.tags import TAG_RAY_NODE_TYPE


class ClusterState(object):
Contributor

Can you provide some docstrings for this class?

@AmplabJenkins
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/7590/

@robertnishihara
Collaborator

cc @devin-petersohn @pschafhalter

    workers = json.loads(open(self.save_path).read())
else:
    workers = {}
print("Loaded cluster state", workers)
Collaborator

Should we be using the logging module for new print statements?

Contributor

We can do that separately in another PR (#2628).
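
For reference, a rough sketch of how the state-loading fragment might look with the `logging` module instead of `print`. This is not the code in this PR or in #2628: the function name `load_cluster_state` and the `os.path.exists` check are assumptions, since the quoted fragment does not show the condition guarding the `else` branch.

```python
import json
import logging
import os

logger = logging.getLogger(__name__)


def load_cluster_state(save_path):
    """Load the saved worker map from `save_path`, or return an empty dict."""
    # Assumption: the original branch tests whether the state file exists.
    if os.path.exists(save_path):
        with open(save_path) as f:
            workers = json.load(f)
    else:
        workers = {}
    # Lazy %-style formatting; the message is only built if INFO is enabled.
    logger.info("Loaded cluster state: %s", workers)
    return workers
```

Using a module-level logger leaves it to the caller to decide whether and where the message is shown.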

Manually synchronizing files
----------------------------

To download or upload files to the cluster head node, use ``ray rsync_down`` or ``ray rsync_up``:
Collaborator

Is syncing to/from the head node the primary use case, as opposed to syncing to/from all nodes?

Contributor Author

Yeah, handling other nodes is out of scope here.
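
As a sketch of the head-node-only scope, the helper below builds an `rsync` command between the local machine and a single head node. It is an illustration under assumptions, not necessarily how `ray rsync_down` or `ray rsync_up` are implemented; the name `rsync_args`, the `user` parameter, and the flag choice (`-avz -e ssh`) are hypothetical.

```python
def rsync_args(head_ip, source, target, user, down=True):
    """Build an rsync argument list for copying to or from the head node only."""
    remote_prefix = f"{user}@{head_ip}:"
    if down:
        # Download: the source path lives on the head node.
        src, dst = remote_prefix + source, target
    else:
        # Upload: the target path lives on the head node.
        src, dst = source, remote_prefix + target
    # -a: archive mode, -v: verbose, -z: compress; -e ssh: use ssh as transport.
    return ["rsync", "-avz", "-e", "ssh", src, dst]
```

Syncing to worker nodes would need one such command per worker address, which this sketch deliberately leaves out.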

@robertnishihara
Collaborator

@ericl does this require any special instructions for what IP addresses to use when you have a public/private IP address distinction (like on EC2)?

@ericl
Contributor Author

ericl commented Aug 19, 2018

@robertnishihara I'm assuming all nodes have just one IP for now.

@ericl ericl merged commit 9473da6 into ray-project:master Aug 19, 2018
@ericl
Contributor Author

ericl commented Aug 19, 2018

Merging so we can start testing.
