ReblockGVCFs requires less memory for WGS samples when no interval list is provided #1251

Merged: 4 commits into develop from ms_reblock_intervals, Apr 3, 2024

meganshand
Contributor

Description

When a calling interval list is not provided to ReblockGVCFs, ValidateVariants takes a great deal of memory on WGS samples. WGS samples processed with DRAGEN have no calling interval list, but DRAGEN skips some regions of the genome, so the reblocked GVCF does not validate unless the input GVCF itself is used as the interval list.

To avoid the large memory requirement in this case, we first convert the input GVCF into an interval list, merging abutting intervals. GATK might eventually solve this issue internally (see broadinstitute/gatk#8608), but for now this workaround should fix the memory issue.
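
For readers unfamiliar with the idea, here is a minimal Python sketch of what "converting a GVCF into an interval list with abutting intervals merged" means. It is an illustration only, not the code in this PR: the function names (`gvcf_record_span`, `merge_abutting`, `gvcf_to_intervals`) are hypothetical, and it assumes uncompressed, coordinate-sorted GVCF text in which reference blocks carry an `END=` INFO key and other records span `len(REF)` bases.

```python
import re
from typing import Iterable, Iterator, List, Optional, Tuple

# (contig, 1-based start, inclusive end)
Interval = Tuple[str, int, int]

# Matches an END=<int> key anywhere in a semicolon-separated INFO field.
_END_RE = re.compile(r"(?:^|;)END=(\d+)(?:;|$)")


def gvcf_record_span(line: str) -> Interval:
    """Return the genomic span covered by one tab-separated GVCF record."""
    fields = line.rstrip("\n").split("\t")
    contig, pos, ref, info = fields[0], int(fields[1]), fields[3], fields[7]
    match = _END_RE.search(info)
    # Reference blocks declare END; otherwise the record covers the REF allele.
    end = int(match.group(1)) if match else pos + len(ref) - 1
    return contig, pos, end


def merge_abutting(spans: Iterable[Interval]) -> Iterator[Interval]:
    """Merge sorted spans that overlap or abut (next start <= previous end + 1)."""
    current: Optional[Interval] = None
    for contig, start, end in spans:
        if current is not None and contig == current[0] and start <= current[2] + 1:
            current = (contig, current[1], max(current[2], end))
        else:
            if current is not None:
                yield current
            current = (contig, start, end)
    if current is not None:
        yield current


def gvcf_to_intervals(lines: Iterable[str]) -> List[Interval]:
    """Skip header and blank lines, then merge the record spans."""
    records = (
        gvcf_record_span(line)
        for line in lines
        if line.strip() and not line.startswith("#")
    )
    return list(merge_abutting(records))


example = [
    "##fileformat=VCFv4.2",
    "chr1\t100\t.\tA\t<NON_REF>\t.\t.\tEND=199\tGT\t0/0",
    "chr1\t200\t.\tG\tT,<NON_REF>\t50\t.\tDP=30\tGT\t0/1",
    "chr1\t201\t.\tC\t<NON_REF>\t.\t.\tEND=300\tGT\t0/0",
    "chr1\t500\t.\tT\t<NON_REF>\t.\t.\tEND=600\tGT\t0/0",
    "chr2\t1\t.\tA\t<NON_REF>\t.\t.\tEND=50\tGT\t0/0",
]
print(gvcf_to_intervals(example))
```

In this toy input, the three contiguous chr1 records collapse into a single interval, while the region skipped between positions 300 and 500 stays outside the list. That gap is what lets validation ignore regions the caller never emitted.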


Checklist

If you can answer "yes" to the following items, please add a checkmark next to the appropriate checklist item(s) and notify our WARP documentation team by tagging either @ekiernan or @kayleemathews in a comment on this PR.

  • Did you add inputs, outputs, or tasks to a workflow?
  • Did you modify, delete or move: file paths, file names, input names, output names, or task names?
  • If you made a changelog update, did you update the pipeline version number?

Remember to squash merge!

Contributor

@kayleemathews kayleemathews left a comment

Docs are updated!

Contributor

@nikellepetrillo nikellepetrillo left a comment

Looks good! I kicked off the scientific reblock tests from this branch and they succeeded 🎉 https://gotc-jenkins.dsp-techops.broadinstitute.org/job/warp-workflow-tests/71142/console

@meganshand
Contributor Author

Thank you! @jessicaway Do I need one more review before this can be merged?

github-actions bot commented Apr 3, 2024

Remember to squash merge!

@nikellepetrillo nikellepetrillo merged commit a4aa631 into develop Apr 3, 2024
6 of 7 checks passed
@meganshand meganshand deleted the ms_reblock_intervals branch April 3, 2024 19:35