Add Flux batch job HOWTO section #57

Merged 2 commits on Aug 19, 2020


dongahn
Member

@dongahn dongahn commented Aug 8, 2020

Add doc on how to submit your batch scripts, with
some examples.

Use single-user Flux examples, with the caveat that
the same commands and techniques also work at the system level.

Contributor

@grondo grondo left a comment


LGTM!

Just added a couple inline notes. I thought it might be good to clarify what a "batch job" means in Flux (as compared to a submitted job).

Maybe what we need is more of a glossary page, where we delineate these terms, and to which we can add a reference from these HOWTOs.

@dongahn
Member Author

dongahn commented Aug 10, 2020

From the initial feedback from Xiaohua, there are a few things that I'd like to add to this:

  1. Explain that one needs to drop --pty for non-interactive use and add the rundir option to redirect the content store
  2. The user was confused that they needed to pass either a HERE doc or a separate script to flux start for non-interactive use. I will add this.
  3. Use of flux queue drain
  4. Blocking vs. non-blocking (e.g., don't background flux mini batch; the semantics of flux mini run are foreground, although backgrounding it is supported).

@jameshcorbett
Member

jameshcorbett commented Aug 11, 2020

This looks great, Dong. My only comments are things you've already pointed out:

I think it would be good to point out that there are two ways, blocking and nonblocking, to submit batch jobs and regular jobs: alloc, batch, run, and submit.
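
As a rough sketch of that split (my reading of the four subcommands, not wording taken from the doc under review): alloc and run wait in the foreground, while batch and submit return right after queuing the job. The helper name submission_mode is hypothetical and exists only for this illustration; the snippet prints a cheat sheet and never calls flux itself.

```shell
#!/bin/sh
# Hypothetical helper: classify a "flux mini" subcommand as blocking or
# nonblocking. The mapping is an assumption based on the discussion in this
# thread, not something queried from Flux.
submission_mode() {
    case "$1" in
        alloc|run)    echo blocking ;;     # waits in the foreground
        batch|submit) echo nonblocking ;;  # queues the job and returns
        *)            echo unknown ;;
    esac
}

# Print a small cheat sheet for the four subcommands mentioned above.
for sub in alloc batch run submit; do
    printf 'flux mini %-6s -> %s\n' "$sub" "$(submission_mode "$sub")"
done
```

Read this way, alloc and batch are the blocking and nonblocking ways to start a batch job, and run and submit are the corresponding pair for regular jobs.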

You might also want to provide a full example of using Flux on Slurm with sbatch, which I imagine might look something like:

outer_script.sh:

#!/bin/sh
#SBATCH -N 8
#SBATCH etc

srun ... flux start inner_script.sh

inner_script.sh:

#!/bin/sh
flux mini submit -n5 hostname
flux mini submit -n17 spam --foo --bar

flux queue drain

You submit this to Slurm with sbatch outer_script.sh. When you already have a Flux instance running, this is much simpler: drop outer_script.sh and just execute flux mini batch -n8 -N8 [-c N] inner_script.sh. In fact, I think it may be worth going over that difference.
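
One way to picture that difference is a tiny launcher that only prints the command it would use. This is a sketch under assumptions: launch_cmd is a hypothetical name, and it treats a non-empty FLUX_URI environment variable as the sign that a Flux instance is already available. It runs neither sbatch nor flux, and it leaves out the optional -c N.

```shell
#!/bin/sh
# Hypothetical helper: print how inner_script.sh would be launched.
# Assumption: FLUX_URI is set when we are already inside a Flux instance.
launch_cmd() {
    if [ -n "${FLUX_URI:-}" ]; then
        # Already under Flux: no outer Slurm wrapper is needed.
        echo "flux mini batch -n8 -N8 inner_script.sh"
    else
        # No Flux instance yet: Slurm starts one via outer_script.sh.
        echo "sbatch outer_script.sh"
    fi
}

launch_cmd
```

In both cases inner_script.sh stays the same; only the layer that starts it changes.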

@dongahn dongahn force-pushed the mini-batch branch 5 times, most recently from 83eda2d to e049524 Compare August 15, 2020 03:39
@dongahn
Member Author

dongahn commented Aug 15, 2020

OK. I have addressed all of the reviewers' (great) comments and incorporated the findings from working with users. I squashed all the interim commits. From my perspective, this is good to go in.

@dongahn
Member Author

dongahn commented Aug 19, 2020

OK. I resolved the conflicts and force-pushed. It would be good if we could merge this sooner rather than later. We already have some users who need this doc.

Member

@SteVwonder SteVwonder left a comment


LGTM!

@SteVwonder SteVwonder added the merge-when-passing mark PR for auto-merging by mergify.io bot label Aug 19, 2020
@mergify mergify bot merged commit 7b3c000 into flux-framework:master Aug 19, 2020