Description
Issue #3422 and the resulting fix now strip all SBATCH_* variables out of the env.
In our case we are developing a containerized environment that propagates the submit-time environment to the run-time environment "seamlessly" via S[BATCH|RUN|ALLOC]_CONTAINER (https://dl.acm.org/doi/10.1145/3731599.3767355).
The current workflow is to run reframe -r from within the container environment on a login node; the container setting should then propagate to all the jobs submitted via reframe.
This results in having to do something like the following config:
import os

IMAGE = os.environ["NERSC_IMAGE"]
zen3_a100_ofi = {
    "name": "zen3-a100-ofi",
    "descr": "Submit jobs through the system Slurm scheduler",
    "scheduler": "slurm",
    "launcher": "srun",
    # The container image has to be passed explicitly as an access option
    "access": ["--qos=regular", "--constraint=gpu", f"--container={IMAGE}"],
    "environs": ["builtin", "prgenv-gnu"],
}

The resulting job scripts then set the container explicitly, but their srun invocations do not, which can lead to divergent behavior if one keeps the staged files and inspects them manually.
Ideally Slurm will have a better design for how these options interact, but for now it would be useful to us to have some ability to control which variables are removed, for example a simple exclude list or the ability to override the regular expression used for matching.
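To make the request concrete, here is a rough sketch of the kind of filter we have in mind. It is not ReFrame's actual implementation: the function name filter_submit_env, the parameters strip_pattern and keep, and the default pattern are all hypothetical and only illustrate the two proposed controls.

```python
import re

# Hypothetical default: strip every SBATCH_* variable.
DEFAULT_STRIP_PATTERN = r"SBATCH_.*"


def filter_submit_env(env, strip_pattern=DEFAULT_STRIP_PATTERN, keep=()):
    """Return a copy of env without the variables whose full name matches
    strip_pattern, except for the names listed in keep.

    All names here are illustrative, not an existing ReFrame API.
    """
    pattern = re.compile(strip_pattern)
    keep = frozenset(keep)
    return {
        name: value
        for name, value in env.items()
        if name in keep or not pattern.fullmatch(name)
    }


# Made-up values, for illustration only.
env = {
    "SBATCH_CONTAINER": "example-image",
    "SBATCH_PARTITION": "debug",
    "PATH": "/usr/bin",
}

# Option 1: a simple exclude list.
kept_by_list = filter_submit_env(env, keep=["SBATCH_CONTAINER"])

# Option 2: override the regular expression itself.
kept_by_regex = filter_submit_env(env, strip_pattern=r"SBATCH_(?!CONTAINER$).*")
```

Either option would let SBATCH_CONTAINER survive while the other SBATCH_* variables are still removed. How such a setting would be exposed in the configuration is left open.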