Skip to content

Add cle_annotated_vcf_filter step and strelka_cpu_reserved parameter#206

Merged
dufeiyu merged 9 commits intogenome:toil_compatibilityfrom
dufeiyu:toil_compatibility
Oct 13, 2017
Merged

Add cle_annotated_vcf_filter step and strelka_cpu_reserved parameter#206
dufeiyu merged 9 commits intogenome:toil_compatibilityfrom
dufeiyu:toil_compatibility

Conversation

@dufeiyu
Copy link
Member

@dufeiyu dufeiyu commented Oct 11, 2017

This PR is needed for several changes mentioned in https://jira.gsc.wustl.edu/browse/ITDEV-5037 before IT can release the IDT exome assay. It relies on genome/docker-cle#50
This change has been tested ok in wdl_toil hybrid workflow run. @jasonwalker80 can you review it ?

Copy link
Member

@jasonwalker80 jasonwalker80 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See comment in genome/docker-cle#50. I think renaming the workflow steps and the perl script name to something more descriptive would avoid a lot of confusion,ex. filter_noncoding_indels or something like that.

ramMin: 4000
arguments:
[ { valueFrom: $(runtime.cores), position: 1 },
[ { valueFrom: $(inputs.cpu_num), position: 1 },
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As I understand the issue, the problem is with LSF scheduling the jobs with 8 cores requested. This change will only modify the number passed to the strelka command. Instead of modifying the number of cpus strelka expects, don't you want to modify the number of CPUs requested? See line9 for the resource requirements used when scheduling the LSF job.

Changing the number of CPUs that strelka thinks it has available will result in twice the run time as before.

type: boolean?
hgvs_annotation:
type: boolean?
filter_annotated_vcf:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The name of this workflow step is a bit ambiguous. What is being filtered? There are other steps that are hard or soft filtering.

Copy link
Member

@jasonwalker80 jasonwalker80 Oct 12, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

how about cle_filter to make this more generic, but specific to the CLE assay.

secondaryFiles: [.tbi]
strelka_exome_mode:
type: boolean
strelka_cpu_num:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should this be something like strelka_cpu_reserved or min_strelka_cpus? Do you want to go ahead and parameterize the number of threads/CPU strelka thinks it's using in the command line?

class: CommandLineTool
label: "annotated_vcf_filter"
arguments: [
"/usr/bin/perl", "/usr/bin/annotated_vcf_filter.pl",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

call docm_and_coding_indel_selection.pl

vcf: bgzip/bgzipped_file
out:
[indexed_vcf]
annotated_vcf_filter:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cle_filtered_vcf this is or is not filtered depending on the cle_filter boolean flag.

type: boolean
strelka_cpu_num:
type: int?
default: 4
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

change default to 8 since this value is passed to the strelka runWorkflow.py

@@ -9,7 +9,7 @@ requirements:
coresMin: 8
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

since lookup values and variables may not work in the resourceRequirements section we could hard-code this to 4.

@dufeiyu dufeiyu changed the title Add annotated_vcf_filter step and strelka_cpu_num parameter Add cle_annotated_vcf_filter step and strelka_cpu_reserved parameter Oct 13, 2017
@dufeiyu
Copy link
Member Author

dufeiyu commented Oct 13, 2017

@jasonwalker80 Changes were made per review. wdl-toil test on 10% downsampling HCC1395 is fine. Can you check again ? Thanks.

Copy link
Member

@jasonwalker80 jasonwalker80 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1

@dufeiyu dufeiyu merged commit ead5c70 into genome:toil_compatibility Oct 13, 2017
@dufeiyu dufeiyu deleted the toil_compatibility branch October 13, 2017 15:43
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants