Add cle_annotated_vcf_filter step and strelka_cpu_reserved parameter #206
See the comment in genome/docker-cle#50. I think renaming the workflow steps and the Perl script to something more descriptive, e.g. filter_noncoding_indels or something like that, would avoid a lot of confusion.
strelka/strelka.cwl
Outdated
        ramMin: 4000
    arguments:
    -    [ { valueFrom: $(runtime.cores), position: 1 },
    +    [ { valueFrom: $(inputs.cpu_num), position: 1 },
As I understand the issue, the problem is that LSF schedules these jobs with 8 cores requested. This change only modifies the number passed to the strelka command. Instead of changing the number of CPUs strelka expects, don't you want to change the number of CPUs requested? See line 9 for the resource requirements used when scheduling the LSF job.
Reducing the number of CPUs that strelka thinks it has available will double the run time compared to before.
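To make the suggestion concrete, here is a minimal sketch of what I mean, not the actual strelka.cwl. It assumes `echo` as a stand-in for the real strelka command, and the value 4 is only illustrative. The scheduler request changes, while the command line keeps following whatever the runner actually allocated:

```yaml
cwlVersion: v1.0
class: CommandLineTool
label: "core request sketch (hypothetical, not strelka.cwl)"
# "echo" is a stand-in for the real strelka command
baseCommand: ["echo"]
requirements:
    - class: ResourceRequirement
      # cores requested from the scheduler; 4 is an illustrative value
      coresMin: 4
      ramMin: 4000
arguments:
    # resolves to the number of cores the runner allocated to this job
    - { valueFrom: $(runtime.cores), position: 1 }
inputs: []
outputs: []
```

With this shape, the number passed on the command line always matches the reservation, so the two cannot drift apart.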
detect_variants/detect_variants.cwl
Outdated
        type: boolean?
    hgvs_annotation:
        type: boolean?
    filter_annotated_vcf:
The name of this workflow step is a bit ambiguous. What is being filtered? There are other steps that are hard or soft filtering.
How about cle_filter? That makes the name more generic while keeping it specific to the CLE assay.
detect_variants/detect_variants.cwl
Outdated
        secondaryFiles: [.tbi]
    strelka_exome_mode:
        type: boolean
    strelka_cpu_num:
Should this be something like strelka_cpu_reserved or min_strelka_cpus? Do you want to go ahead and parameterize the number of threads/CPUs strelka thinks it's using on the command line?
    class: CommandLineTool
    label: "annotated_vcf_filter"
    arguments: [
        "/usr/bin/perl", "/usr/bin/annotated_vcf_filter.pl",
Call docm_and_coding_indel_selection.pl here.
detect_variants/detect_variants.cwl
Outdated
            vcf: bgzip/bgzipped_file
        out:
            [indexed_vcf]
    annotated_vcf_filter:
cle_filtered_vcf: this is or is not filtered, depending on the cle_filter boolean flag.
detect_variants/detect_variants.cwl
Outdated
        type: boolean
    strelka_cpu_num:
        type: int?
        default: 4
Change the default to 8, since this value is passed to strelka's runWorkflow.py.
strelka/strelka.cwl
Outdated
    @@ -9,7 +9,7 @@ requirements:
        coresMin: 8
Since lookup values and variables may not work in the ResourceRequirement section, we could hard-code this to 4.
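A rough sketch of how the two numbers could be decoupled, under some assumptions: `echo` stands in for the real strelka command, and the input name `cpu_reserved` is hypothetical. The scheduler request is a literal, and the count handed to the command line comes from an input with its own default:

```yaml
cwlVersion: v1.0
class: CommandLineTool
label: "decoupled core counts sketch (hypothetical, not strelka.cwl)"
# "echo" is a stand-in for the real strelka command
baseCommand: ["echo"]
requirements:
    - class: ResourceRequirement
      # literal value, so no expression has to be evaluated by the runner
      coresMin: 4
      ramMin: 4000
inputs:
    # hypothetical input name; the default of 8 follows the review suggestion above
    cpu_reserved:
        type: int?
        default: 8
arguments:
    # the count the command is told to use, independent of coresMin
    - { valueFrom: $(inputs.cpu_reserved), position: 1 }
outputs: []
```

The trade-off is that nothing ties the two values together any more, so the command can be told to use more cores than the job reserved.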
@jasonwalker80 Changes were made per review. The wdl-toil test on 10% downsampled HCC1395 is fine. Can you check again? Thanks.
This PR is needed for several changes mentioned in https://jira.gsc.wustl.edu/browse/ITDEV-5037 before IT can release the IDT exome assay. It relies on genome/docker-cle#50.
This change has been tested OK in a wdl_toil hybrid workflow run. @jasonwalker80, can you review it?