Diagnosing NODE_FAILURE when using NextFlow on Cheaha
#672
Labels
feat: faq (ask.ci)
https://ask.cyberinfrastructure.org/c/locales-data-centers-and-campus-rc/uab/52
What would you like to see added?
If the following conditions are true, then consider that one or more NextFlow tasks may have insufficient memory allocated. Assume `$jobid` is the Slurm Job ID for the relevant NextFlow task.

- `sacct -j $jobid -X -o jobid,state` shows `NODE_FAILURE`.
- `.exitcode` does not exist in the NextFlow task's working directory. Working directory here refers to the NextFlow concept.

In the researcher workflows we have encountered where the above are true, the cause of the error has invariably been an "Out of Memory" (OOM) event.
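
The two checks could be combined into a small helper, roughly as sketched below. This is only an illustration: the function name `likely_oom` and the variable `$workdir` are hypothetical, and the match is on the prefix `NODE_FAIL` on the assumption that Slurm may print a shorter spelling of the state than the one written above.

```shell
#!/usr/bin/env bash
# Sketch only: flags a NextFlow task as a possible OOM casualty when both
# conditions described above hold. `likely_oom` is a hypothetical helper name.

# $1: text output of `sacct -j "$jobid" -X -o jobid,state`
# $2: the NextFlow task's working directory
# Returns 0 when the state looks like a node failure AND .exitcode is missing.
likely_oom() {
    local sacct_output="$1"
    local workdir="$2"

    # Condition 1: the job state reported by sacct is a node failure.
    # Matching the prefix NODE_FAIL also matches NODE_FAILURE (assumption:
    # the exact spelling may differ between Slurm versions).
    if ! printf '%s\n' "$sacct_output" | grep -q 'NODE_FAIL'; then
        return 1
    fi

    # Condition 2: NextFlow never wrote .exitcode into the task directory.
    if [ -e "$workdir/.exitcode" ]; then
        return 1
    fi

    return 0
}

# Example use (set jobid and workdir for the task in question):
#   if likely_oom "$(sacct -j "$jobid" -X -o jobid,state)" "$workdir"; then
#       echo "Job $jobid: possible OOM, consider requesting more memory"
#   fi
```

A positive result is a hint rather than proof. It only reports that both conditions from the list above are met.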