[Doc] [Job] Add notes about where Ray Job entrypoint runs and how to specify it #41319

architkulkarni · 2023-11-21T22:47:14Z

Why are these changes needed?

There is recurring user confusion about where the job entrypoint script runs and how to make it run on a worker node.

This PR adds the missing information to the doc in relevant places in the tutorials, and includes it in the FAQ.

Related issue number

Checks

I've signed off every commit (by using the -s flag, i.e., git commit -s) in this PR.
I've run scripts/format.sh to lint the changes in this PR.
I've included any doc changes needed for https://docs.ray.io/en/master/.
- I've added any new APIs to the API Reference. For example, if I added a
  method in Tune, I've added it in doc/source/tune/api/ under the
  corresponding .rst file.
I've made sure the tests are passing. Note that there might be a few flaky tests; see the recent failures at https://flakey-tests.ray.io/
Testing Strategy
- Unit tests
- Release tests
- This PR is not tested :(

Signed-off-by: Archit Kulkarni <archit@anyscale.com>

GeneDer

doc/source/cluster/running-applications/job-submission/quickstart.rst

kevin85421 · 2023-11-21T23:40:00Z

doc/source/cluster/running-applications/job-submission/quickstart.rst

@@ -111,12 +111,19 @@ Make sure to specify the path to the working directory in the ``--working-dir``
    # Job 'raysubmit_inB2ViQuE29aZRJ5' succeeded
    # ------------------------------------------

-This command will run the script on the Ray Cluster and wait until the job has finished. Note that it also streams the stdout of the job back to the client (``hello world`` in this case). Ray will also make the contents of the directory passed as `--working-dir` available to the Ray job by downloading the directory to all nodes in your cluster.
+This command will run the entrypoint script on the Ray Cluster's head node and wait until the job has finished. Note that it also streams the stdout of the job back to the client (``hello world`` in this case). Ray will also make the contents of the directory passed as `--working-dir` available to the Ray job by downloading the directory to all nodes in your cluster.


It not only streams STDOUT but also STDERR. You can do a simple experiment:

    # example.py
    import sys

    # Print a message to stdout
    print("This is a message to stdout")

    # Print a message to stderr
    print("This is an error message to stderr", file=sys.stderr)

Good point. I just want to convey that it would stream whatever is normally output to the terminal when you run a command in your local terminal. Should I say "streams the stdout and stderr" or "streams the output"?

Maybe the former? A lot of commands/tools do not stream stderr by default.

Updated to "streams the output of the entrypoint script", which should be clear. 61d65ff

Oh sorry, didn't see your message. Updated to "streams the stdout and stderr" ab7950c
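
The stdout/stderr distinction discussed in this thread can be checked locally without a cluster. The following is a minimal sketch, not part of the PR: it runs the same two prints from the experiment above in a child Python process and captures the two streams separately. The names `SCRIPT`, `stdout_text`, and `stderr_text` are illustrative only.

```python
import subprocess
import sys

# The same two prints as in example.py above, passed inline via `-c`
# so that the sketch needs no extra file on disk.
SCRIPT = (
    "import sys\n"
    "print('This is a message to stdout')\n"
    "print('This is an error message to stderr', file=sys.stderr)\n"
)

# capture_output=True collects stdout and stderr into separate buffers.
result = subprocess.run(
    [sys.executable, "-c", SCRIPT],
    capture_output=True,
    text=True,
    check=True,
)

stdout_text = result.stdout.strip()
stderr_text = result.stderr.strip()

print("stdout:", stdout_text)
print("stderr:", stderr_text)
```

A tool that forwards only `result.stdout` would silently drop the second message, which is why spelling out "stdout and stderr" in the doc is less ambiguous than "output".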

kevin85421 · 2023-11-21T23:41:59Z

doc/source/cluster/running-applications/job-submission/quickstart.rst


 .. note::

    The double dash (`--`) separates the arguments for the entrypoint command (e.g. `python script.py --arg1=val1`) from the arguments to `ray job submit`.

+.. note::
+
+    By default the entrypoint script is run on the head node. To override this, specify any of the arguments 


"entrypoint script is run on the head node" => Do you mean the driver process would be running on the head node by default?

Yes, we say "entrypoint script" here to convey that it runs whatever the user specifies as the entrypoint. Typically this is a script that starts a Ray driver process (ray.init()), but it could be any command at all, like echo hello && pip install something. Technically, it doesn't have to involve Ray.

Short answer, yes, the driver is running on the head node by default
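
To make the override in this thread concrete, here is a hedged sketch that only builds the `ray job submit` command line and does not run it. The note in the diff above is truncated in this view, so the flag used here, `--entrypoint-num-cpus`, is an assumption based on the Ray CLI as I understand it and should be checked against the Ray Jobs documentation. The helper `build_submit_command` is hypothetical and not part of Ray.

```python
import shlex


def build_submit_command(entrypoint, working_dir=".", entrypoint_num_cpus=None):
    """Return the argv list for a `ray job submit` call (nothing is executed).

    Hypothetical helper for illustration; `--entrypoint-num-cpus` is assumed
    to be the CLI flag that requests CPUs for the entrypoint command.
    """
    argv = ["ray", "job", "submit", "--working-dir", working_dir]
    if entrypoint_num_cpus is not None:
        # Asking for resources lets Ray schedule the entrypoint on any node
        # that has them, which may or may not be the head node.
        argv += ["--entrypoint-num-cpus", str(entrypoint_num_cpus)]
    # The double dash separates `ray job submit` arguments from the entrypoint.
    argv.append("--")
    argv += shlex.split(entrypoint)
    return argv


default_cmd = build_submit_command("python script.py --arg1=val1")
scheduled_cmd = build_submit_command(
    "python script.py --arg1=val1", entrypoint_num_cpus=1
)

print(shlex.join(default_cmd))
print(shlex.join(scheduled_cmd))
```

Without the resource flag the entrypoint runs on the head node by default, as stated above. With it, placement is left to Ray's scheduler, so the head node is still a possible target if it advertises the requested resources.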

Add notes about where Ray Job entrypoint runs and how to specify it

e3d6d79

Signed-off-by: Archit Kulkarni <archit@anyscale.com>

architkulkarni requested a review from a team November 21, 2023 22:47

architkulkarni assigned scottsun94 and angelinalg Nov 21, 2023

architkulkarni requested review from maxpumperla, pcmoritz, kevin85421 and a team as code owners November 21, 2023 22:47

GeneDer approved these changes Nov 21, 2023

kevin85421 reviewed Nov 21, 2023

architkulkarni and others added 5 commits November 21, 2023 15:58

Update doc/source/cluster/running-applications/job-submission/quickst…

c1282ba

…art.rst Co-authored-by: Kai-Hsun Chen <kaihsun@apache.org> Signed-off-by: Archit Kulkarni <architkulkarni@users.noreply.github.com>

Change stdout to "output"

8e194d0

Signed-off-by: Archit Kulkarni <archit@anyscale.com>

Merge branch 'docs-job-head-node' of https://github.com/architkulkarn…

4ca14cd

…i/ray into docs-job-head-node Signed-off-by: Archit Kulkarni <archit@anyscale.com>

Change the text to "streams the output"

61d65ff

Signed-off-by: Archit Kulkarni <archit@anyscale.com>

Use stdout and stderr

ab7950c

Signed-off-by: Archit Kulkarni <archit@anyscale.com>

kevin85421 approved these changes Nov 22, 2023

architkulkarni merged commit 80a1770 into ray-project:master Nov 22, 2023
2 checks passed

angelinalg mentioned this pull request Nov 22, 2023

[docs] copy edit of Job Submission Getting Started and FAQ #41342

Merged

8 tasks
