README.md

This directory contains simple Snakemake workflows for running UShER, RIPPLES, matUtils, and Augur.

To use these workflows, place the following in the current working directory:

  1. a fasta file with SARS-CoV-2 genome sequences: [user_fa]
  2. the Snakefile
  3. the conda environment files, usher.yaml and nextstrain.yaml
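A quick pre-flight check can catch a missing input before snakemake starts. The sketch below assumes the three items listed above sit in the current working directory; `check_inputs` is a hypothetical helper, not something shipped with this repository.

```shell
# Hypothetical helper (not part of the repository): verify that the FASTA,
# the Snakefile, and both conda environment files are present.
check_inputs() {
    fasta="$1"
    missing=0
    for f in "$fasta" Snakefile usher.yaml nextstrain.yaml; do
        if [ ! -f "$f" ]; then
            # Report every missing file, not just the first one.
            echo "missing: $f" >&2
            missing=1
        fi
    done
    return "$missing"
}
```

For example, `check_inputs my_samples.fasta` returns a non-zero status and names each missing file on stderr, so it can be chained in front of a snakemake call with `&&`.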

Users can run each workflow as:

UShER: add samples to the latest public MAT

snakemake --use-conda --cores [num threads] --config FASTA="[user_fa]" RUNTYPE="usher"

matUtils: extract subtrees in auspice.us-compatible JSON format

snakemake --use-conda --cores [num threads] --config FASTA="[user_fa]" RUNTYPE="matUtils"

taxodium: view the "big tree" in the taxodium (taxodium.org) tree viewing platform

snakemake --use-conda --cores [num threads] --config FASTA="[user_fa]" RUNTYPE="taxodium" 

translate: output lineage-aware translation for each amino acid substitution

snakemake --use-conda --cores [num threads] --config FASTA="[user_fa]" RUNTYPE="translate" 

RIPPLES: detect recombinants in the ancestry of the user-supplied samples

snakemake --use-conda --cores [num threads] --config FASTA="[user_fa]" RUNTYPE="ripples"

introduce: search for unique introductions within the user-supplied samples

snakemake --use-conda --cores [num threads] --config FASTA="[user_fa]" RUNTYPE="introduce"

systematic: search for possible systematic errors in your added samples by outputting a list of sites whose parsimony score increased

snakemake --use-conda --cores [num threads] --config FASTA="[user_fa]" RUNTYPE="systematic"

outbreak: run extract on the dataset that includes the user-provided samples to identify closely related sequences

snakemake --use-conda --cores [num threads] --config FASTA="[user_fa]" RUNTYPE="outbreak"

augur: run the Augur pipeline to build a clocked tree that includes the user samples for visualization in auspice

snakemake --use-conda --cores [num threads] --config FASTA="[user_fa]" RUNTYPE="augur"

Note that adding "-d [run_dir]" to any of the command lines above writes all output files to the specified directory. In that case, you must either provide the full path to the fasta file or place the fasta file in the specified run directory.
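Because the fasta file needs a full path when "-d" is used, it can help to resolve the path before building the command. The following is a minimal sketch; `abs_path` is a hypothetical helper, and the run directory name `my_run` and the thread count `4` in the commented usage are placeholder values.

```shell
# Hypothetical helper: turn a (possibly relative) file path into an absolute
# one, so it still resolves when output is written to the -d run directory.
abs_path() {
    printf '%s/%s\n' "$(cd "$(dirname "$1")" && pwd)" "$(basename "$1")"
}

# Example usage (placeholder values; substitute your own):
#   FASTA_ABS="$(abs_path data/example.fasta)"
#   snakemake --use-conda --cores 4 -d my_run --config FASTA="$FASTA_ABS" RUNTYPE="usher"
```

The helper only rewrites the path; it does not check that the file exists.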

Quick Tutorial:

To try any of the workflows except introduce, replace [user_fa] in the command line with "data/example.fasta". To try the introduce workflow, use "data/introduce_example.fasta" instead.
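The choice of example file can be sketched as a small lookup. `example_fasta_for` is a hypothetical helper written for illustration; the RUNTYPE values it accepts are the ones listed in this README, and the thread count in the commented usage is a placeholder.

```shell
# Hypothetical helper: print the bundled example FASTA for a given RUNTYPE.
example_fasta_for() {
    case "$1" in
        introduce)
            echo "data/introduce_example.fasta" ;;
        usher|matUtils|taxodium|translate|ripples|systematic|outbreak|augur)
            echo "data/example.fasta" ;;
        *)
            # Anything else is not a RUNTYPE documented in this README.
            echo "unknown RUNTYPE: $1" >&2
            return 1 ;;
    esac
}

# Example usage (placeholder thread count):
#   snakemake --use-conda --cores 4 --config FASTA="$(example_fasta_for ripples)" RUNTYPE="ripples"
```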

Augur workflow

The augur workflow produces a JSON file containing the user-provided sequences that are currently published on NCBI, labeled with their NCBI IDs. The config.json file in the config folder is set up as an example of the workflow, so users should edit it to describe their own dataset.

For further help with the Augur workflow please contact Adriano Schneider.

Further Reading:

More information about each of these utilities can be found here: https://usher-wiki.readthedocs.io/en/latest/