
balance load over nodes #5077

Open
macmanes opened this issue Aug 29, 2024 · 4 comments

Comments
@macmanes commented Aug 29, 2024

Using Toil from within Cactus and a Slurm scheduler.

I have 10 nodes available to me, each with 40 cores and 500 GB of RAM. If I submit 100 jobs, Toil submits them to only 3 of the nodes, and in my particular case this is causing OOM-kill issues. Is there a way to balance the load, i.e., to spread the 100 jobs evenly over the 10 available nodes?

Thanks in advance for any help available.

┆Issue is synchronized with this Jira Story
┆Issue Number: TOIL-1638

@DailyDreaming (Member)

Is there anything particular about the nodes that aren't being assigned? Like, are they partitioned differently, or are they fulfilling a GPU requirement? Or are all nodes created equal? Toil does try to detect the overhead of the machines, but it might not be aware of some intensive background task.

I'll look into some of the options for Slurm scaling. That sounds odd to me.

@macmanes (Author) commented Sep 3, 2024

Yes, all nodes and weights are created equal.

I'm wondering if LLN=YES as an argument to Slurm is what I want:
https://slurm.schedmd.com/slurm.conf.html#OPT_LLN

@adamnovak (Member)

You can use the TOIL_SLURM_ARGS environment variable to add extra command line options to Toil's Slurm calls, but Toil doesn't specifically tell Slurm to pack jobs as tightly as it can into the fewest nodes. I think that might be Slurm's default behavior, since it is designed under the assumption that it is pretty common to want to reserve an entire node for a Slurm job.
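For reference, a minimal sketch of passing extra sbatch options through TOIL_SLURM_ARGS before starting a run; the partition name, time limit, and the Cactus arguments below are placeholders, not recommendations:

```bash
# Extra sbatch options go in TOIL_SLURM_ARGS; Toil appends them to its Slurm calls.
# The partition name and time limit here are placeholders for your site.
export TOIL_SLURM_ARGS="--partition=general --time=24:00:00"

# Then launch the workflow as usual, e.g. a Cactus run (arguments illustrative):
cactus ./jobstore ./seqFile.txt ./out.hal --batchSystem slurm
```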

If your Slurm jobs are getting OOM-killed, are you sure the memory limits assigned to your jobs in Cactus are accurate? If they are too low, I think Slurm should detect that you are trying to go over them and OOM-kill your jobs, even if there is free memory on the node that isn't allocated to any job.
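One way to check whether that is what is happening (a sketch, assuming Slurm accounting is enabled on your cluster; substitute a real job ID):

```bash
# Compare each job's peak memory (MaxRSS) against what it requested (ReqMem);
# OOM-killed jobs show up in the State column.
sacct -j <jobid> --format=JobID,JobName%30,ReqMem,MaxRSS,State,ExitCode
```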

@adamnovak (Member)

It looks like LLN is something a Slurm administrator would need to configure for a whole partition, not an option you can pass to sbatch. Toil's jobs aren't sent to Slurm as an array or as a single Slurm-level batch, so options that spread, e.g., different instances of an array job across different nodes won't help either.
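For completeness, this is roughly what the partition-level setting would look like in slurm.conf if your administrator were willing to turn it on (partition and node names here are made up; check with your admin before changing anything):

```
# slurm.conf (administrator-managed): LLN=YES makes Slurm place new jobs on the
# least-loaded node in the partition instead of packing them onto a few nodes.
PartitionName=general Nodes=node[01-10] LLN=YES Default=YES State=UP
```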

If your Toil jobs are large enough, you can add the --exclusive option to TOIL_SLURM_ARGS, so that each job will request an entire node to itself. But I don't think Cactus will run well like that; it likes to run a lot of small jobs.
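If you do want to try that, the whole-node request would look something like this (a sketch; whether it helps depends on how large your Cactus jobs actually are):

```bash
# Ask Slurm for a whole node per Toil job; expect poor utilization if the
# workflow mostly runs many small jobs, as Cactus does.
export TOIL_SLURM_ARGS="--exclusive"
```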
