TOLIT-1928
This pull-request addresses MEMLIMIT errors found on BWAMEM2_MEM and SAMTOOLS_SORMADUP.
For both processes, the jobs were retried 5 times (as per maxRetries) and failed at every attempt despite the resources (CPUs and memory) increasing. This is because, in both cases, the memory is a function of the number of CPUs: as the number of CPUs increases, so does the memory. But the config was such that the memory-per-CPU ratio was not increasing, or not fast enough, so retries could not address the initial MEMLIMIT problem.
The changes I propose are:
In both cases, I'm not changing the baseline, but rather making sure that the second attempt gives enough memory for the job to succeed.
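To illustrate why the retries could not help, here is a minimal Python sketch of a retry schedule in which memory is derived from the CPU count. It is not the pipeline's config: the function name, the CPU step, the GB-per-CPU values and the assumed demand of 1.3 GB per CPU are all hypothetical, chosen only to contrast a flat memory-per-CPU ratio with one that grows at each attempt.

```python
def retry_schedule(base_cpus, cpu_step, gb_per_cpu, gb_per_cpu_step,
                   need_gb_per_cpu, attempts):
    """Return (attempt, cpus, memory_gb, fits) for each attempt.

    Hypothetical model: the request is cpus * ratio, and the job is assumed
    to need need_gb_per_cpu for every CPU it is given. None of these values
    come from the real pipeline config.
    """
    schedule = []
    for attempt in range(1, attempts + 1):
        cpus = base_cpus + cpu_step * (attempt - 1)
        ratio = gb_per_cpu + gb_per_cpu_step * (attempt - 1)
        memory_gb = round(cpus * ratio, 1)
        fits = memory_gb >= cpus * need_gb_per_cpu
        schedule.append((attempt, cpus, memory_gb, fits))
    return schedule


# Flat ratio: CPUs and memory both grow, but memory per CPU never does.
flat = retry_schedule(12, 6, 1.0, 0.0, need_gb_per_cpu=1.3, attempts=3)

# Growing ratio: same baseline, but each retry also raises memory per CPU.
growing = retry_schedule(12, 6, 1.0, 0.5, need_gb_per_cpu=1.3, attempts=3)

for label, schedule in (("flat", flat), ("growing", growing)):
    for attempt, cpus, memory_gb, fits in schedule:
        print(f"{label}: attempt {attempt}, {cpus} CPUs, {memory_gb} GB, fits={fits}")
```

Under these assumptions the flat schedule fails at every attempt, while the growing one keeps the same first attempt and fits from the second attempt onwards, which is the behaviour this pull-request aims for.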
bwamem2_mem
Initial failure:
/lustre/scratch122/tol/share/weskit/data/prod/03ba/03ba5c76-fccd-49da-a0cc-682362128eb1/exc_priority_1_readmapping/.nextflow.log
and
/lustre/scratch122/tol/share/weskit/data/prod/03ba/03ba5c76-fccd-49da-a0cc-682362128eb1/exc_priority_1_readmapping/work/bf/2ab57f0e97ef957747d31766cf5cf2/.command.log
Successful run:
/lustre/scratch123/tol/teams/tolit/users/mm49/nextflow/rm/bwamem2
First attempt
/lustre/scratch123/tol/teams/tolit/users/mm49/nextflow/rm/bwamem2/work/86/e2ed519694856ac74b17ba380a5118/.command.log
(12 CPUs, 15.4 GB RAM) failed because of MEMLIMIT
Second attempt
/lustre/scratch123/tol/teams/tolit/users/mm49/nextflow/rm/bwamem2/work/ce/664d8f5406e98fb33390c3a966ce0d/.command.log
(18 CPUs, 34.1 GB RAM) succeeded (22.8 GB RAM used)
sormadup
Initial failure:
/lustre/scratch122/tol/share/weskit/data/prod/ea13/ea1384a3-41d8-43ee-96d8-915731b14164/exc_priority_1_readmapping/.nextflow.log
and
/lustre/scratch122/tol/share/weskit/data/prod/ea13/ea1384a3-41d8-43ee-96d8-915731b14164/exc_priority_1_readmapping/work/7c/a7e6a393747ba7456d9174450850d3/.command.log
Successful run:
/lustre/scratch123/tol/teams/tolit/users/mm49/nextflow/rm/sormadup
First attempt
/lustre/scratch123/tol/teams/tolit/users/mm49/nextflow/rm/sormadup/work/ed/b2b47555f5ad665760931e4ff92a33/.command.log
(8 CPUs, 14.8 GB RAM) failed because of MEMLIMIT
Second attempt
/lustre/scratch123/tol/teams/tolit/users/mm49/nextflow/rm/sormadup/work/33/3f71dacd11dc58cd1605018dc24625/.command.log
(14 CPUs, 31.4 GB RAM) succeeded (27.2 GB RAM used)
I will make a release (v1.2.1) right after this pull-request is merged.
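As a quick check on the figures reported for the two test runs, the memory-per-CPU ratio of each attempt can be computed directly. This is only arithmetic on the numbers quoted in this description, assuming that "G" means GB; the dictionary layout and function names are mine, not the pipeline's.

```python
# (cpus, allocated_gb) for the first and second attempt of each test run,
# copied from the figures quoted in this pull-request description.
runs = {
    "bwamem2_mem": [(12, 15.4), (18, 34.1)],
    "sormadup": [(8, 14.8), (14, 31.4)],
}

# Peak memory actually used by the successful second attempt, in GB.
used_gb = {"bwamem2_mem": 22.8, "sormadup": 27.2}


def gb_per_cpu(cpus, allocated_gb):
    """Memory allocated per CPU, in GB."""
    return allocated_gb / cpus


def usage_fraction(name):
    """Fraction of the second attempt's allocation that was actually used."""
    _, allocated_gb = runs[name][1]
    return used_gb[name] / allocated_gb


for name, attempts in runs.items():
    ratios = ", ".join(f"{gb_per_cpu(c, m):.2f}" for c, m in attempts)
    print(f"{name}: GB per CPU by attempt = {ratios}; "
          f"second attempt used {usage_fraction(name):.0%} of its allocation")
```

In both runs the ratio rises between the first and second attempt, and the successful attempt's peak usage is above what the first attempt had been given.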
PR checklist
Make sure your code lints (nf-core lint).
Ensure the test suite passes (nextflow run . -profile test,docker --outdir <OUTDIR>).
docs/usage.md is updated.
docs/output.md is updated.
CHANGELOG.md is updated.
README.md is updated (including new tool citations and authors/contributors).