Description
Describe your feature request
Currently ray attach only allows opening an SSH session on the head node. It could be useful to allow attaching to worker nodes to check what state the execution environment and file system are in (e.g. running conda list, examining config files such as ~/.keras/keras.json).
Technically this also applies to ray exec, but in my experience the use cases are much less convincing.
Existing alternatives
ray list-worker-ips is subpar: it doesn't show the SSH key location needed to connect, and it's tedious to type out a long ssh command every time.
A workaround is using awless ssh with an Amazon instance ID, but this opens a raw SSH session, whereas attach runs screen or tmux. Also, awless does not work with Kubernetes and other autoscaler backends.
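To make the manual workaround concrete, here is a minimal sketch of a helper that assembles the long ssh command from a worker IP and a key path and opens a tmux session rather than a raw shell. The function name, the default user "ubuntu", the session name "ray", and the example IP and key path are all assumptions for illustration; none of this is Ray's actual implementation.

```python
import os
import shlex


def build_attach_command(node_ip, ssh_key_path, ssh_user="ubuntu", session="ray"):
    """Return an ssh argv that opens (or re-attaches to) a tmux session on a node.

    All defaults are hypothetical; adjust them to match your cluster config.
    """
    # `tmux new-session -A -s NAME` attaches to NAME if it already exists,
    # otherwise it creates the session.
    remote_command = f"tmux new-session -A -s {shlex.quote(session)}"
    return [
        "ssh",
        "-t",  # force a TTY so tmux can run interactively
        "-i", os.path.expanduser(ssh_key_path),
        f"{ssh_user}@{node_ip}",
        remote_command,
    ]


if __name__ == "__main__":
    # Example values only: the IP would come from the worker IP listing,
    # and the key path from wherever the autoscaler stored the key.
    cmd = build_attach_command("10.0.0.12", "~/.ssh/example-key.pem")
    print(shlex.join(cmd))
```

The returned list can be passed to subprocess.run to open the session. This still requires knowing the key location up front, which is exactly the information the IP listing does not provide.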
Suggested API
ray attach <cluster name> --ip <node ip>, since ray prints the IP if a node has issues.
Alternative: ray attach <cluster name> --node-id <node id>, since IPs are long and node IDs are very short.
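The two suggested flags could be parsed roughly as follows. This is only a sketch of the proposed interface using the standard-library argparse so that it runs on its own; it is not how the ray CLI is implemented, and the helper names build_parser and describe_target are hypothetical. It assumes the two flags are mutually exclusive and that omitting both keeps the current behaviour of attaching to the head node.

```python
import argparse


def build_parser():
    """Build a parser mirroring the proposed `ray attach` flags (illustrative only)."""
    parser = argparse.ArgumentParser(prog="ray attach")
    parser.add_argument("cluster_name", help="cluster to attach to")
    # Assumption: a node is selected by IP or by node ID, never both.
    target = parser.add_mutually_exclusive_group()
    target.add_argument("--ip", dest="node_ip", default=None,
                        help="IP of the worker node to attach to")
    target.add_argument("--node-id", dest="node_id", default=None,
                        help="ID of the worker node to attach to")
    return parser


def describe_target(args):
    """Return a human-readable description of the node that would be attached to."""
    if args.node_ip is not None:
        return f"worker at {args.node_ip}"
    if args.node_id is not None:
        return f"worker {args.node_id}"
    # Assumption: with neither flag, fall back to today's head-node behaviour.
    return "head node"


if __name__ == "__main__":
    args = build_parser().parse_args(["my-cluster", "--ip", "10.0.0.12"])
    print(describe_target(args))
```

Keeping the head node as the default means existing invocations of ray attach would continue to work unchanged.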