
balance load over nodes #5077

Open
macmanes opened this issue Aug 29, 2024 · 4 comments

Comments
@macmanes commented Aug 29, 2024

Using Toil from within Cactus and a Slurm scheduler.

I have 10 nodes available to me, each with 40 cores and 500 GB of RAM. If I submit 100 jobs, Toil submits them to only 3 of the nodes, and in my particular case this is causing OOM-kill issues. Is there a way to balance the load, i.e., to spread the 100 jobs evenly over the 10 available nodes?

Thanks in advance for any help available.

┆Issue is synchronized with this Jira Story
┆Issue Number: TOIL-1638

@DailyDreaming (Member)

Is there anything particular about the nodes that aren't being assigned? Like, are they partitioned differently, or are they fulfilling a GPU requirement? Or are all nodes created equal? Toil does try to detect the overhead of the machines, but it might not be aware of some intensive background task.

I'll look into some of the options for Slurm scaling. That sounds odd to me.

@macmanes (Author) commented Sep 3, 2024

Yes, all nodes and weights are created equal.

I'm wondering if LLN=YES as an argument to Slurm is what I want:
https://slurm.schedmd.com/slurm.conf.html#OPT_LLN

@adamnovak (Member)

You can use the TOIL_SLURM_ARGS environment variable to add extra command line options to Toil's Slurm calls, but Toil doesn't specifically tell Slurm to pack jobs as tightly as it can into the fewest nodes. I think that might be Slurm's default behavior, since it is designed under the assumption that it is pretty common to want to reserve an entire node for a Slurm job.
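For reference, a minimal sketch of passing extra sbatch options through TOIL_SLURM_ARGS before starting a run; the partition name, time limit, and the Cactus arguments below are placeholders, not recommendations:

```bash
# Extra sbatch options go in TOIL_SLURM_ARGS; Toil appends them to its Slurm calls.
# The partition name and time limit here are placeholders for your site.
export TOIL_SLURM_ARGS="--partition=general --time=24:00:00"

# Then launch the workflow as usual, e.g. a Cactus run (arguments illustrative):
cactus ./jobstore ./seqFile.txt ./out.hal --batchSystem slurm
```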

If your Slurm jobs are getting OOM-killed, are you sure the memory limits assigned to your jobs in Cactus are accurate? If they are too low, I think Slurm should detect that you are trying to go over them and OOM-kill your jobs, even if there is free memory on the node that isn't allocated to any job.
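One way to check whether that is what is happening (a sketch, assuming Slurm accounting is enabled on your cluster; substitute a real job ID):

```bash
# Compare each job's peak memory (MaxRSS) against what it requested (ReqMem);
# OOM-killed jobs show up in the State column.
sacct -j <jobid> --format=JobID,JobName%30,ReqMem,MaxRSS,State,ExitCode
```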

@adamnovak (Member)

It looks like LLN is something a Slurm administrator would need to configure for a whole partition, not an option you can pass to sbatch. Toil's jobs aren't sent to Slurm as an array or as a single Slurm-level batch, so options that spread, e.g., different instances of an array job across different nodes won't help either.
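For completeness, this is roughly what the partition-level setting would look like in slurm.conf if your administrator were willing to turn it on (partition and node names here are made up; check with your admin before changing anything):

```
# slurm.conf (administrator-managed): LLN=YES makes Slurm place new jobs on the
# least-loaded node in the partition instead of packing them onto a few nodes.
PartitionName=general Nodes=node[01-10] LLN=YES Default=YES State=UP
```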

If your Toil jobs are large enough, you can add the --exclusive option to TOIL_SLURM_ARGS, so that each job will request an entire node to itself. But I don't think Cactus will run well like that; it likes to run a lot of small jobs.
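If you do want to try that, the whole-node request would look something like this (a sketch; whether it helps depends on how large your Cactus jobs actually are):

```bash
# Ask Slurm for a whole node per Toil job; expect poor utilization if the
# workflow mostly runs many small jobs, as Cactus does.
export TOIL_SLURM_ARGS="--exclusive"
```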
