tutorials: add a tutorial on submitting jobs to Flux #194
My immediate thought is whether we should cover lots of variants and common options:
so this might become more "job submission and management basics"?
Maybe this could be one of those documents like you proposed @chu11, where it starts simple and adds complexity? If the topic is "job submission", maybe it should try to stay focused on getting jobs into the system?
Good point, perhaps this specific doc should be simpler then, no need to explain getting
This one is under "command tutorials". So maybe
some comments! And I guess this depends on merging the other ssh PR?
tutorials/commands/index.rst
Outdated
with your use case, and then see detailed usage.

- ``flux proxy`` (:ref:`ssh-across-clusters`): "Send commands to a flux instance across clusters using ssh"
- ``job-submit`` (:ref:`job-submit`): "Submit a job in a Flux instance"
Should this be `flux mini submit`? I think here we want to start helping readers make associations between the actual command to be run and the use case.
Also why is this showing up as a new file? Do we just need to merge the other PR? Ping @grondo (but when you have time I know there are issues with Flux atm on a cluster!)
If I understand what you mean, I think the other PR needs to be merged first. This PR just contains the other PR's commits which is why the file is new.
Yep, that is what I was trying to say, poorly.
tutorials/commands/job-submit.rst
Outdated
@@ -0,0 +1,55 @@
.. _job-submit:
Suggested change:
-.. _job-submit:
+.. _flux-mini-submit:
tutorials/commands/job-submit.rst
Outdated
$ flux mini submit --nodes=2 --ntasks=4 --cores-per-task=2 ./my_compute_script.lua 120
ƒM5k8m7m
$ flux mini submit --nodes=1 --ntasks=1 --cores-per-task=2 ./my_other_script.lua 120
When we show them a new command, it would be good below it to say "in the above, we are asking for " so the reader starts to make sense of the options/args too.
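A minimal sketch of the kind of annotation suggested here, using only the options already shown in the diff above. The variable names (`NODES`, `NTASKS`, `CORES_PER_TASK`, `SUBMIT_CMD`) are hypothetical, and the snippet only prints the command so that it can be read and run without a Flux instance.

```shell
#!/bin/sh
# Illustrative only: assemble and print the command from the diff above.
# The variable names are hypothetical and not part of Flux.
NODES=2            # --nodes: number of nodes to request
NTASKS=4           # --ntasks: total number of tasks to start
CORES_PER_TASK=2   # --cores-per-task: cores allocated to each task

# The trailing "120" is an argument passed through to the script itself;
# its meaning is up to the script.
SUBMIT_CMD="flux mini submit --nodes=$NODES --ntasks=$NTASKS --cores-per-task=$CORES_PER_TASK ./my_compute_script.lua 120"

# Print instead of executing, so this runs without Flux installed.
echo "$SUBMIT_CMD"
```

In the tutorial itself, the same information could simply be a sentence under the command: "in the above, we are asking for two nodes, four tasks, and two cores per task".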
tutorials/commands/job-submit.rst
Outdated
$ flux mini submit --nodes=1 --ntasks=1 --cores-per-task=2 ./my_other_script.lua 120
ƒSUEFPDH

A jobID is returned for every job submitted. You can view the status of your
Suggested change:
-A jobID is returned for every job submitted. You can view the status of your
+A jobID (e.g., ``ƒSUEFPDH``) is returned for every job submitted. You can view the status of your

And do we have a term for this in the new terms guide? If yes, let's link!
tutorials/commands/job-submit.rst
Outdated
.. code-block:: sh

  $ flux job info ƒM5k8m7m R
Is there a way to get this in a table (non-JSON)? This was a question I had the other day, and I couldn't figure it out from the command line.
I think actually we should not advertise `flux job info` in a high-level tutorial. It is more of a "plumbing" command. Instead we should focus on `flux jobs`, the main interface users will use to get information about their jobs.
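For reference, a sketch of the `flux jobs` invocations the tutorial could show instead. `flux jobs` prints a tabular listing of jobs rather than JSON. The job ID is the one from the example output above, the variable names are hypothetical, and the commands are printed rather than executed so the snippet runs without Flux.

```shell
#!/bin/sh
# Illustrative only: the variable names are hypothetical.
JOBID="ƒM5k8m7m"   # job ID taken from the example output above

LIST_ACTIVE="flux jobs"        # your pending and running jobs
LIST_ALL="flux jobs -a"        # jobs in all states, including inactive ones
LIST_ONE="flux jobs $JOBID"    # restrict the listing to one job

# Print instead of executing, so this runs without Flux installed.
printf '%s\n' "$LIST_ACTIVE" "$LIST_ALL" "$LIST_ONE"
```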
Oops! Sorry for including it in this tutorial. I'll go ahead and just remove this section then and only include `flux jobs`.
tutorials/commands/job-submit.rst
Outdated
$ flux job info ƒM5k8m7m R
{"version":1,"execution":{"R_lite":[{"rank":"0-1","children":{"core":"0-3"}}]}}

There are a number of keys you can pass to get various information about your job:
I don't understand what the listing below is from, or what it is used for. It would be good to add the context, and then a description of what each of the items below actually means!
This is part of the reason why it is probably best to leave `flux job info` out of user tutorials. An advanced tutorial could perhaps explain this command.
`flux job info` is a kind of plumbing command, not very user friendly, and probably shouldn't be here IMHO.
tutorials/commands/index.rst
Outdated
with your use case, and then see detailed usage.

- ``flux proxy`` (:ref:`ssh-across-clusters`): "Send commands to a flux instance across clusters using ssh"
- ``job-submit`` (:ref:`job-submit`): "Submit a job in a Flux instance"
Since this is a "command tutorial", should the bullet reference the `flux mini submit` command?

Suggested change:
-- ``job-submit`` (:ref:`job-submit`): "Submit a job in a Flux instance"
+- ``flux mini submit`` (:ref:`job-submit`): "Submit a job in a Flux instance"
tutorials/commands/job-submit.rst
Outdated
.. code-block:: sh

  $ flux mini submit --nodes=2 --ntasks=4 --cores-per-task=2 ./my_compute_script.lua 120
I was wondering whether there are any other important options to mention; the only one I could think of is `--queue`. IMO, that's more important than `--cores-per-task`. You could probably just mention that there are many advanced ways to request resources besides `--nodes` and `--ntasks`, and to see the manpage for details.
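A sketch of how the tutorial could show `--queue` next to `--nodes` and `--ntasks` and then point readers at the built-in help. The queue name `foo` and the variable names are placeholders, and whether a given queue exists depends on how the site has configured Flux. The commands are printed rather than executed.

```shell
#!/bin/sh
# Illustrative only: the variable names are hypothetical.
QUEUE="foo"   # placeholder queue name; real names are site-specific

SUBMIT_CMD="flux mini submit --queue=$QUEUE --nodes=2 --ntasks=4 ./my_compute_script.lua 120"
HELP_CMD="flux mini submit --help"   # lists the remaining submission options

# Print instead of executing, so this runs without Flux installed.
echo "$SUBMIT_CMD"
echo "$HELP_CMD"
```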
If you are open to it, I like having (toward the end) a big block of just examples with description, that sort of show all the options that a flux command can provide. E.g., I think I linked this before, but this example comes to mind! https://rse-ops.github.io/knowledge/docs/schedulers/slurm.html#command-quick-reference. Sometimes people's eyes will glaze over the text and they just want to find the right thing to copy paste :) (guilty!)
oh that's a good idea
Force-pushed: 79a3f3a to 637e9f3
Thanks for all the great feedback and suggestions everybody! I've just force pushed some changes to the tutorial based on the suggestions above. To summarize, I've made the following changes:
Thanks again for the feedback. I think I might as well take this out of [WIP], but note that we should probably land #192 before this one, since this PR is built on top of #192.
-------------------------------------
More Examples of Submitting Flux Jobs
-------------------------------------

.. code-block:: sh

  $ flux mini submit --nodes=2 --queue=foo --name=my_special_job ./my_job.lua

This submits a job to the `foo` queue across two nodes, and sets a custom name
to the job.

.. code-block:: sh

  $ flux mini submit --dry-run ./my_cool_job.lua
I like this. I suggest we have one example with the `--output` option, as I imagine many would want that. It'd perhaps be wise to illustrate use of `{{id}}` in the output option too.
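A minimal sketch of such an example, assuming Flux expands the `{{id}}` tag itself so that each job writes its output to a file named after its own job ID. The file name pattern `job-{{id}}.out` and the variable names are illustrative. The command is printed rather than executed.

```shell
#!/bin/sh
# Illustrative only: the variable names and file name pattern are hypothetical.
# Single quotes keep the template literal; Flux, not the shell, expands {{id}}.
OUTPUT_TEMPLATE='job-{{id}}.out'

SUBMIT_CMD="flux mini submit --output=$OUTPUT_TEMPLATE ./my_job.lua"

# Print instead of executing, so this runs without Flux installed.
echo "$SUBMIT_CMD"
```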
Oh that's a good idea, thanks. I'll add an example that includes this!
Just force-pushed a commit that adds an example including `--output` and `{{id}}` when submitting a job.
Force-pushed: 637e9f3 to 1bb28a5
.. code-block:: sh

  $ flux mini submit --nodes=2 --ntasks=4 --cores-per-task=2 ./my_compute_script.lua 120
Minor nit: should we use `.py` everywhere instead of `.lua`, since Python is more popular?
Oh, sure! Good call. Just force-pushed a fix to use `.py` instead of `.lua`.
Force-pushed: 1bb28a5 to 69ab2ca
@cmoussa1 the ssh tutorial is merged! You should be able to rebase locally and then we can finish up review here and get the tutorial in.
LGTM!
Add a small tutorial on submitting jobs to Flux via "flux mini submit". To start, add a couple simple examples of submitting jobs to Flux using different options.
Force-pushed: 69ab2ca to 2ebcf82
This is a small [WIP] PR built on top of #192 that adds a tutorial on how to submit jobs to Flux. It leverages the steps outlined in the job-submit-cli workflow example. I ran the commands outlined in an updated Docker container to make sure they still worked (I could always use another set of eyes to double-check me, though 😉).
Future expansion of this specific tutorial could include a chapter on how to submit jobs to Flux using its job submission API. But for the purposes of getting some short tutorials out there, I've just included the command-line portion.