restore FreeBayes
maxulysse committed Dec 3, 2019
1 parent b0a6557 commit 9ab028d
Showing 10 changed files with 83 additions and 14 deletions.
2 changes: 1 addition & 1 deletion .github/workflows/ci-extra.yml
@@ -27,7 +27,7 @@ jobs:
runs-on: ubuntu-18.04
strategy:
matrix:
tool: [Haplotypecaller, Manta, mpileup, Mutect2, Strelka]
tool: [Haplotypecaller, Freebayes, Manta, mpileup, Mutect2, Strelka]
steps:
- uses: actions/checkout@v1
- name: Install Nextflow
1 change: 0 additions & 1 deletion CHANGELOG.md
@@ -31,7 +31,6 @@ The format is based on [Keep a Changelog](http://keepachangelog.com/en/1.0.0/) a
### `Removed`

- [#46](https://github.com/nf-core/sarek/pull/46) - Remove mention of old `build.nf` script which was included in `main.nf`
- [#XXX](https://github.com/nf-core/sarek/pull/XXX) - Remove `Freebayes`

### `Fixed`

2 changes: 2 additions & 0 deletions bin/scrape_software_versions.py
@@ -9,6 +9,7 @@
'bcftools': ['v_bcftools.txt', r"bcftools (\S+)"],
'BWA': ['v_bwa.txt', r"Version: (\S+)"],
'FastQC': ['v_fastqc.txt', r"FastQC v(\S+)"],
'FreeBayes': ['v_freebayes.txt', r"version: v(\d\.\d\.\d+)"],
'GATK': ['v_gatk.txt', r"Version:(\S+)"],
'htslib': ['v_samtools.txt', r"htslib (\S+)"],
'Manta': ['v_manta.txt', r"([0-9.]+)"],
@@ -32,6 +33,7 @@
results['bcftools'] = '<span style="color:#999999;\">N/A</span>'
results['BWA'] = '<span style="color:#999999;\">N/A</span>'
results['FastQC'] = '<span style="color:#999999;\">N/A</span>'
results['FreeBayes'] = '<span style="color:#999999;\">N/A</span>'
results['GATK'] = '<span style="color:#999999;\">N/A</span>'
results['htslib'] = '<span style="color:#999999;\">N/A</span>'
results['Manta'] = '<span style="color:#999999;\">N/A</span>'
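
As a rough illustration of how the new `FreeBayes` entry is meant to work, the sketch below applies the same regular expression to a sample string. The sample text and the helper name `parse_freebayes_version` are assumptions made for this example; the exact output of `freebayes --version` is not shown in this commit and may differ.

```python
import re

# Pattern copied from the 'FreeBayes' entry added in this commit.
FREEBAYES_PATTERN = r"version: v(\d\.\d\.\d+)"


def parse_freebayes_version(text):
    """Return the captured version string, or None when nothing matches."""
    match = re.search(FREEBAYES_PATTERN, text)
    return match.group(1) if match else None


# Hypothetical content of v_freebayes.txt (assumed, not taken from the tool).
sample = "version: v1.3.1"
print(parse_freebayes_version(sample))
print(parse_freebayes_version("no version here"))
```

Note that the pattern only accepts single-digit major and minor numbers, so a string such as `version: v1.10.2` would not match and the script would presumably keep the `N/A` placeholder.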
1 change: 1 addition & 0 deletions docs/containers.md
@@ -20,6 +20,7 @@ For annotation, the main container can be used, but the cache has to be download
- Contain **[BWA](https://github.com/lh3/bwa)** 0.7.17
- Contain **[Control-FREEC](https://github.com/BoevaLab/FREEC)** 11.5
- Contain **[FastQC](http://www.bioinformatics.babraham.ac.uk/projects/fastqc/)** 0.11.8
- Contain **[FreeBayes](https://github.com/ekg/freebayes)** 1.3.1
- Contain **[GATK4](https://github.com/broadinstitute/gatk)** 4.1.4.0
- Contain **[GeneSplicer](https://ccb.jhu.edu/software/genesplicer/)** 1.0
- Contain **[HTSlib](https://github.com/samtools/htslib)** 1.9
Binary file modified docs/images/sarek_workflow.png
22 changes: 18 additions & 4 deletions docs/images/sarek_workflow.svg
14 changes: 14 additions & 0 deletions docs/output.md
@@ -17,6 +17,7 @@ The pipeline processes data using the following steps:
* `GATK ApplyBQSR`
2. [**Variant calling**](#Variant-Calling)
* SNVs and small indels
* [`FreeBayes`](#FreeBayes)
* [`GATK HaplotypeCaller`](#HaplotypeCaller)
* [`GATK GenotypeGVCFs`](#GenotypeGVCFs)
* [`GATK Mutect2`](#Mutect2)
@@ -99,6 +100,18 @@ All the results regarding variant-calling are collected in this directory.

Recalibrated BAM files can also be used as an input to start the Variant Calling; for more information see [TSV files output information](#TSV-files).

### FreeBayes

[FreeBayes](https://github.com/ekg/freebayes) is a Bayesian genetic variant detector designed to find small polymorphisms, specifically SNPs, indels, MNPs, and complex events smaller than the length of a short-read sequencing alignment.

For further reading and documentation see the [FreeBayes manual](https://github.com/ekg/freebayes/blob/master/README.md#user-manual-and-guide).

For a Tumor/Normal pair only:
**Output directory: `results/VariantCalling/[TUMOR_vs_NORMAL]/FreeBayes`**

* `FreeBayes_[TUMORSAMPLE]_vs_[NORMALSAMPLE].vcf.gz` and `FreeBayes_[TUMORSAMPLE]_vs_[NORMALSAMPLE].vcf.gz.tbi`
* VCF with Tabix index
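
The naming scheme above can be sketched in a few lines of Python. The function name `freebayes_outputs`, the default `results` root, and the sample names `T1` and `N1` are hypothetical and used only for illustration; the sketch assumes the directory placeholder `[TUMOR_vs_NORMAL]` is filled with the same sample names as the file names.

```python
from pathlib import PurePosixPath


def freebayes_outputs(tumor, normal, outdir="results"):
    """Build the documented FreeBayes VCF and Tabix index paths for one pair."""
    pair = f"{tumor}_vs_{normal}"
    directory = PurePosixPath(outdir) / "VariantCalling" / pair / "FreeBayes"
    vcf = directory / f"FreeBayes_{pair}.vcf.gz"
    tbi = directory / f"FreeBayes_{pair}.vcf.gz.tbi"
    return vcf, tbi


# Hypothetical sample names.
vcf_path, tbi_path = freebayes_outputs("T1", "N1")
print(vcf_path)
print(tbi_path)
```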

### HaplotypeCaller

[GATK HaplotypeCaller](https://github.com/broadinstitute/gatk) calls germline SNPs and indels via local re-assembly of haplotypes.
@@ -318,6 +331,7 @@ For a Tumor/Normal pair only:

This directory contains results from the final annotation steps: two tools are used for annotation, [snpEff](http://snpeff.sourceforge.net/) and [VEP](https://www.ensembl.org/info/docs/tools/vep/index.html).
Only a subset of the VCF files are annotated, and only variants that have a PASS filter.
FreeBayes results are not annotated yet, as we are lacking a decent somatic filter.
For HaplotypeCaller the germline variations are annotated for both the tumor and the normal sample.

### snpEff
2 changes: 1 addition & 1 deletion docs/usage.md
@@ -279,7 +279,7 @@ Available: `mapping`, `recalibrate`, `variantcalling` and `annotate`
### `--tools`

Use this to specify the tools to run:
Available: `ASCAT`, `ControlFREEC`, `HaplotypeCaller`, `Manta`, `mpileup`, `Mutect2`, `Strelka`, `TIDDIT`
Available: `ASCAT`, `ControlFREEC`, `FreeBayes`, `HaplotypeCaller`, `Manta`, `mpileup`, `Mutect2`, `Strelka`, `TIDDIT`
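
A minimal sketch of how a `--tools` value could be checked against this list is given below. The comma-separated input and the case-insensitive comparison are assumptions, suggested only by the lowercase names used in `main.nf` (for example `'freebayes' in tools`); `parse_tools` is a hypothetical helper, not part of the pipeline.

```python
# Variant calling tools listed as available after this change.
AVAILABLE_TOOLS = [
    "ASCAT", "ControlFREEC", "FreeBayes", "HaplotypeCaller",
    "Manta", "mpileup", "Mutect2", "Strelka", "TIDDIT",
]


def parse_tools(value):
    """Split a comma-separated tool string, lowercase it, and reject unknown names."""
    known = {name.lower() for name in AVAILABLE_TOOLS}
    requested = [item.strip().lower() for item in value.split(",") if item.strip()]
    unknown = [item for item in requested if item not in known]
    if unknown:
        raise ValueError(f"Unknown tool(s): {', '.join(unknown)}")
    return requested


print(parse_tools("FreeBayes,Mutect2"))
```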

### `--noStrelkaBP`

1 change: 1 addition & 0 deletions environment.yml
@@ -13,6 +13,7 @@ dependencies:
- control-freec=11.5
- ensembl-vep=98.2
- fastqc=0.11.8
- freebayes=1.3.1
- gatk4=4.1.4.0
- genesplicer=1.0
- htslib=1.9
52 changes: 45 additions & 7 deletions main.nf
@@ -50,7 +50,7 @@ def helpMessage() {
Available: Mapping, Recalibrate, VariantCalling, Annotate
Default: Mapping
--tools Specify tools to use for variant calling:
Available: ASCAT, ControlFREEC, HaplotypeCaller
Available: ASCAT, ControlFREEC, FreeBayes, HaplotypeCaller
Manta, mpileup, Mutect2, Strelka, TIDDIT
and/or for annotation:
snpEff, VEP, merge
@@ -348,6 +348,7 @@ process GetSoftwareVersions {
echo "${workflow.nextflow.version}" &> v_nextflow.txt 2>&1 || true
echo "SNPEFF version"\$(snpEff -h 2>&1) > v_snpeff.txt
fastqc --version > v_fastqc.txt 2>&1 || true
freebayes --version > v_freebayes.txt 2>&1 || true
gatk ApplyBQSR --help 2>&1 | grep Version: > v_gatk.txt 2>&1 || true
multiqc --version &> v_multiqc.txt 2>&1 || true
qualimap --version &> v_qualimap.txt 2>&1 || true
@@ -1302,8 +1303,44 @@ intervalPairBam = pairBam.spread(bedIntervals)

bamMpileup = bamMpileup.spread(intMpileup)

// intervals for Mutect2 calls and pileups for Mutect2 filtering
(pairBamMutect2, pairBamPileupSummaries) = intervalPairBam.into(2)
// intervals for Mutect2 calls, FreeBayes and pileups for Mutect2 filtering
(pairBamMutect2, pairBamFreeBayes, pairBamPileupSummaries) = intervalPairBam.into(3)

// STEP FREEBAYES

process FreeBayes {
tag {idSampleTumor + "_vs_" + idSampleNormal + "-" + intervalBed.baseName}
label 'cpus_1'

input:
set idPatient, idSampleNormal, file(bamNormal), file(baiNormal), idSampleTumor, file(bamTumor), file(baiTumor), file(intervalBed) from pairBamFreeBayes
file(fasta) from ch_fasta
file(fastaFai) from ch_fastaFai

output:
set val("FreeBayes"), idPatient, val("${idSampleTumor}_vs_${idSampleNormal}"), file("${intervalBed.baseName}_${idSampleTumor}_vs_${idSampleNormal}.vcf") into vcfFreeBayes

when: 'freebayes' in tools

script:
"""
freebayes \
-f ${fasta} \
--pooled-continuous \
--pooled-discrete \
--genotype-qualities \
--report-genotype-likelihood-max \
--allele-balance-priors-off \
--min-alternate-fraction 0.03 \
--min-repeat-entropy 1 \
--min-alternate-count 2 \
-t ${intervalBed} \
${bamTumor} \
${bamNormal} > ${intervalBed.baseName}_${idSampleTumor}_vs_${idSampleNormal}.vcf
"""
}
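
For readers less familiar with Nextflow templating, the command assembled by the `FreeBayes` process can be sketched in plain Python. The flags and their order are copied from the script block above; the function names and the example file names are hypothetical, and nothing here runs FreeBayes itself.

```python
import shlex
from pathlib import PurePosixPath


def freebayes_command(fasta, interval_bed, bam_tumor, bam_normal):
    """Return the argument list used by the process for one interval and one pair."""
    return [
        "freebayes",
        "-f", fasta,
        "--pooled-continuous",
        "--pooled-discrete",
        "--genotype-qualities",
        "--report-genotype-likelihood-max",
        "--allele-balance-priors-off",
        "--min-alternate-fraction", "0.03",
        "--min-repeat-entropy", "1",
        "--min-alternate-count", "2",
        "-t", interval_bed,
        bam_tumor,
        bam_normal,
    ]


def output_name(interval_bed, tumor, normal):
    """Mirror the per-interval output name: <interval baseName>_<tumor>_vs_<normal>.vcf."""
    return f"{PurePosixPath(interval_bed).stem}_{tumor}_vs_{normal}.vcf"


# Hypothetical file and sample names.
cmd = freebayes_command("genome.fasta", "chr1_1-1000.bed", "T1.bam", "N1.bam")
print(shlex.join(cmd), ">", output_name("chr1_1-1000.bed", "T1", "N1"))
```

Each interval produces its own uncompressed VCF, which is why the per-interval files are grouped and concatenated later in the pipeline.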

vcfFreeBayes = vcfFreeBayes.groupTuple(by:[0,1,2])
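
The `groupTuple(by:[0,1,2])` call collects the per-interval VCFs that share the same caller, patient, and sample pair. A plain-Python analogy, assuming each channel item is a four-element tuple as in the process output, might look like the sketch below; `group_by_key` and the sample records are hypothetical, and this is not Nextflow code.

```python
from collections import defaultdict


def group_by_key(records):
    """Group VCF file names by (caller, patient, sample), keeping arrival order."""
    grouped = defaultdict(list)
    for caller, patient, sample, vcf in records:
        grouped[(caller, patient, sample)].append(vcf)
    return dict(grouped)


# Hypothetical per-interval results for a single tumor/normal pair.
records = [
    ("FreeBayes", "P1", "T1_vs_N1", "chr1_1-1000_T1_vs_N1.vcf"),
    ("FreeBayes", "P1", "T1_vs_N1", "chr2_1-1000_T1_vs_N1.vcf"),
]
for key, vcfs in group_by_key(records).items():
    print(key, vcfs)
```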

// STEP GATK MUTECT2.1 - RAW CALLS

@@ -1392,9 +1429,9 @@ process MergeMutect2Stats {
// we are merging the VCFs that are called separately for different intervals
// so we can have a single sorted VCF containing all the calls for a given caller

// STEP MERGING VCF - GATK HAPLOTYPECALLER & GATK MUTECT2 (UNFILTERED)
// STEP MERGING VCF - FREEBAYES, GATK HAPLOTYPECALLER & GATK MUTECT2 (UNFILTERED)

vcfConcatenateVCFs = mutect2Output.mix(vcfGenotypeGVCFs, gvcfHaplotypeCaller)
vcfConcatenateVCFs = mutect2Output.mix(vcfFreeBayes, vcfGenotypeGVCFs, gvcfHaplotypeCaller)
vcfConcatenateVCFs = vcfConcatenateVCFs.dump(tag:'VCF to merge')

process ConcatVCF {
@@ -1413,7 +1450,7 @@ process ConcatVCF {
// we have this funny *_* pattern to avoid copying the raw calls to publishdir
set variantCaller, idPatient, idSample, file("*_*.vcf.gz"), file("*_*.vcf.gz.tbi") into vcfConcatenated

when: ('haplotypecaller' in tools || 'mutect2' in tools)
when: ('haplotypecaller' in tools || 'mutect2' in tools || 'freebayes' in tools)

script:
if (variantCaller == 'HaplotypeCallerGVCF')
@@ -2109,7 +2146,7 @@ if (step == 'annotate') {

if (tsvPath == []) {
// Sarek, by default, annotates all available vcfs that it can find in the VariantCalling directory
// Excluding g.vcf from HaplotypeCaller
// Excluding vcfs from FreeBayes, and g.vcf from HaplotypeCaller
// Basically it's: VariantCalling/*/{HaplotypeCaller,Manta,Mutect2,Strelka,TIDDIT}/*.vcf.gz
// Without *SmallIndels.vcf.gz from Manta, and *.genome.vcf.gz from Strelka
// The small snippet `vcf.minus(vcf.fileName)[-2]` catches idSample
@@ -2651,6 +2688,7 @@ def defineToolList() {
return [
'ascat',
'controlfreec',
'freebayes',
'haplotypecaller',
'manta',
'merge',