# PCAWG DELLY Workflow
A Dockstore version of the DELLY and COV workflow used by the PCAWG project. This is a cleaned-up version with the PCAWG-specific components removed, designed to be easier to call through Dockstore. See Pancancer.info and the PCAWG ICGC portal for more information about this project and how to access its results. The underlying workflow was written using SeqWare 1.1.1.
The workflow consists of two main tools:
- DELLY: an integrated structural variant prediction method that can detect deletions, tandem duplications, inversions and translocations at single-nucleotide resolution in short-read massively parallel sequencing data. It uses paired-ends and split-reads to sensitively and accurately delineate genomic rearrangements throughout the genome.
- COV: genome-wide coverage analysis and depth-of-coverage plotting, comparing tumor and normal samples.
Email Brian if you have questions. Joachim was the primary author.
- Joachim Weischenfeldt (primary workflow author) joachim.weischenfeldt@embl.de
- Ivica Letunic (Dockerfile) letunic@biobyte.de
- Brian O'Connor briandoconnor@gmail.com
- Solomon Shorser Solomon.Shorser@oicr.on.ca
- Denis Yuen denis.yuen@oicr.on.ca
You need Docker installed in order to perform this build.

    cd delly_docker
    docker build -t quay.io/pancancer/pcawg_delly_workflow:2.0.0 .

Alternatively, you can view the entry on Dockstore to use a pre-built image.
This workflow recommends:
- 16-32 cores
- 4.5 GB of memory per core: ideally 72 GB+ for 16 cores or 144 GB+ for 32 cores. On Amazon we recommend r3.4xlarge or r3.8xlarge.
- 1TB of local disk space (depends on the input genome size)
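The memory guideline above is simple arithmetic, and it can be handy to script it when sizing a machine with a core count other than 16 or 32. The following is a minimal sketch; `required_mem_gb` is a hypothetical helper name and is not part of the workflow.

```shell
#!/bin/sh
# Memory guideline from the list above: 4.5 GB of RAM per core.
# required_mem_gb is a hypothetical helper, not part of the workflow.
required_mem_gb() {
    cores=$1
    # 4.5 GB per core in integer arithmetic: cores * 45 / 10, rounded up.
    echo $(( (cores * 45 + 9) / 10 ))
}

for cores in 16 32; do
    echo "${cores} cores: $(required_mem_gb "$cores") GB or more"
done
```

For 16 and 32 cores this reproduces the 72 GB and 144 GB figures listed above.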
Non-controlled access sample BAM files can be found here:
And the two needed reference files can be found here:
- https://s3.amazonaws.com/pan-cancer-data/pan-cancer-reference/genome.fa.gz
- https://s3.amazonaws.com/pan-cancer-data/pan-cancer-reference/hs37d5_1000GP.gc
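One possible way to fetch the two reference files is sketched below. It assumes `curl` is installed; `ref_filename` and `fetch_references` are hypothetical helper names, and the destination directory is whatever you choose. The files are left exactly as downloaded, since this README does not say whether `genome.fa.gz` must be decompressed first.

```shell
#!/bin/sh
# Reference file URLs, copied from the list above.
REF_URLS="https://s3.amazonaws.com/pan-cancer-data/pan-cancer-reference/genome.fa.gz
https://s3.amazonaws.com/pan-cancer-data/pan-cancer-reference/hs37d5_1000GP.gc"

# Hypothetical helper: local file name for a URL (its last path component).
ref_filename() {
    basename "$1"
}

# Hypothetical helper: download each reference into a directory,
# skipping files that are already present and non-empty.
fetch_references() {
    dest=${1:-.}
    mkdir -p "$dest" || return 1
    for url in $REF_URLS; do
        target="$dest/$(ref_filename "$url")"
        if [ -s "$target" ]; then
            echo "already present: $target"
        else
            # Download to a temporary name so an interrupted transfer
            # is not mistaken for a complete file on the next run.
            curl -fL -o "$target.part" "$url" && mv "$target.part" "$target" || return 1
        fi
    done
}
```

Calling `fetch_references ./reference` would place both files under `./reference`; the paths in your Dockstore.json then need to point there.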
For sample parameters, see Dockstore.json (the same file passed to the launch command below). Make sure you customize it to reflect the local paths where you downloaded and extracted the sample BAM files above.
You can use the Dockstore command line to simplify calling this workflow. If you prefer to call the workflow directly using Docker, see the output from the commands below. For a parameterization using test data, see our sample Dockstore.json hosted on GitHub and the note above.
Usage:

    # fetch CWL
    $> dockstore tool cwl --entry quay.io/pancancer/pcawg_delly_workflow:2.0.0-cwl1.0 > Dockstore.cwl
    # make a runtime JSON template and edit it
    $> dockstore tool convert cwl2json --cwl Dockstore.cwl > Dockstore.json
    # run it locally with the Dockstore CLI
    $> dockstore tool launch --entry quay.io/pancancer/pcawg_delly_workflow:2.0.0-cwl1.0 --json Dockstore.json

- Carefully monitor the CPU, storage, and memory usage of this workflow. You may be able to use a smaller instance than what is recommended above.
- Be mindful of where on the filesystem you run the `dockstore tool launch` command above. The working directory, where large files are downloaded, is created there.
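Because the working directory is created wherever the launch command runs, a quick check of free space in the current directory before launching can save a failed run. The sketch below uses POSIX `df`; `free_kb` and `has_free_kb` are hypothetical helper names, and the 1 TB threshold simply mirrors the disk recommendation in this README.

```shell
#!/bin/sh
# Hypothetical helper: free space, in KB, on the filesystem holding a directory.
free_kb() {
    # -P forces the portable one-line-per-filesystem format;
    # column 4 of the data line is the available space.
    df -Pk "${1:-.}" | awk 'NR == 2 { print $4 }'
}

# Hypothetical helper: succeed only if the directory has at least the given KB free.
has_free_kb() {
    [ "$(free_kb "$1")" -ge "$2" ]
}

# 1 TB expressed in KB, matching the disk recommendation in this README.
ONE_TB_KB=$((1024 * 1024 * 1024))

if has_free_kb . "$ONE_TB_KB"; then
    echo "enough space in $(pwd)"
else
    echo "less than 1 TB free in $(pwd); consider launching from a larger volume" >&2
fi
```

Run it from the directory where you intend to launch the workflow. Since actual usage depends on the input size, treat the threshold as a starting point and adjust it to your data.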
These resources are based on a GitHub mirror of the original Bitbucket repository.