Workflows
Common workflows are described here.
Downloads the files specified in a sample
Inputs: data.csv
Outputs: nanopore-reads, illumina-reads
Options: --config data_csv=<path> (Default is data.csv)
--resources connections=<int>
The connections resource controls how many download jobs run simultaneously. The default is 1. Setting it too high can overload your system.
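As a rough illustration of how these options combine, the sketch below assembles an argument list for the download-data workflow. The `brefito` executable name, the `build_command` helper, and the `samples.csv` path are assumptions made for this example; only the `--config data_csv=` and `--resources connections=` options come from the description above.

```python
import shlex


def build_command(workflow, config=None, resources=None, executable="brefito"):
    """Assemble an argument list; `executable` is an assumed command name."""
    args = [executable, workflow]
    if config:
        args.append("--config")
        args.extend(f"{key}={value}" for key, value in config.items())
    if resources:
        args.append("--resources")
        args.extend(f"{key}={value}" for key, value in resources.items())
    return args


command = build_command(
    "download-data",
    config={"data_csv": "samples.csv"},  # hypothetical path; the default is data.csv
    resources={"connections": 4},        # allow four simultaneous download jobs
)
print(shlex.join(command))
```

The list returned by `build_command` could be handed to `subprocess.run` once the actual executable name for your installation is confirmed.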
Merges the raw reads corresponding to each sample into one file per type of read.
Inputs: data.csv
Outputs: merged-reads
Options: <same options as download-data>
--config breseq_options="<breseq_options>"
Merges the trimmed reads corresponding to each sample into one file per type of read.
Inputs: data.csv
Outputs: merged-reads-trimmed
Options: <same options as download-data>
--config breseq_options="<breseq_options>"
Runs breseq using the reference files and trimmed read files.
Inputs: data.csv
Outputs: breseq-references/data, breseq-references/html, breseq-references/gd
Options: <same options as download-data>
--config BRESEQ_OPTIONS="<breseq_options>"
Options that get passed to breseq
--config BRESEQ_THREADS=<int>
Override the default number of threads for each breseq job.
--config No_DEFAULT_BRESEQ_OPTIONS=<bool>
Don't pass the default option of -x to breseq when using nanopore reads
Runs breseq using the reference files and trimmed read files, then runs `breseq CL-TABULATE` on the aligned reads to create a CSV file. For each mononucleotide repeat in the reference file that meets a minimum length, the CSV file counts how many reads contain each observed number of bases in that repeat.
Inputs: data.csv
Outputs: breseq-references/ssrs
Options: <same options as download-data>
--config ssr_minimum_length=<int>
Minimum length (--minimum-length) parameter passed to `breseq CL-TABULATE`
--config ssr_strict_mode=<bool>
Pass the `--strict` parameter to `breseq CL-TABULATE`.
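To make the minimum-length option concrete, here is a minimal sketch of what "mononucleotide repeats with at least a certain minimum length" means for a reference sequence. It is an illustration only and is not how `breseq CL-TABULATE` is implemented; the `find_mononucleotide_repeats` function and the toy sequence are hypothetical.

```python
import re


def find_mononucleotide_repeats(sequence, minimum_length):
    """Return (start, base, length) for each single-base run of at least minimum_length.

    Start positions are 0-based. Illustrative only; not breseq's algorithm.
    """
    if minimum_length < 1:
        raise ValueError("minimum_length must be at least 1")
    # One base followed by at least (minimum_length - 1) copies of the same base.
    pattern = re.compile(r"([ACGT])\1{%d,}" % (minimum_length - 1))
    return [
        (match.start(), match.group(1), len(match.group(0)))
        for match in pattern.finditer(sequence.upper())
    ]


# Toy reference: a run of 6 T bases and a run of 8 A bases.
reference = "ACG" + "T" * 6 + "GC" + "A" * 8 + "C"

print(find_mononucleotide_repeats(reference, 8))
print(find_mononucleotide_repeats(reference, 5))
```

With a minimum length of 8 only the A run qualifies, while a minimum length of 5 also picks up the T run.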
Runs predict-mutations-breseq and then generates HTML compare tables that summarize similarities and differences between samples. A separate compare table file is created for each set of samples that was compared against the same reference sequences.
Inputs: data.csv
Outputs: breseq-references/compare[_#].html, breseq-references/html, breseq-references/gd
Options: <same options as predict-mutations-breseq>
Runs breseq BAM2COV to create coverage plots tiling the reference genome.
Inputs: breseq-references/data
Outputs: breseq-references/cov
Options: <same options as predict-mutations-breseq>
Uses gdtools from breseq to apply the GenomeDiff files in genome-diffs to the reference genomes, generating updated reference genomes that include those mutations. One GenomeDiff file with the .gd file extension is expected per sample. These files can be copied from a breseq-*/gd directory and then manually edited to curate the mutations they describe.
Note: Currently, this only works with reference files in GFF3 format. It will fail if you provide a GenBank (gbk) or FASTA reference file.
Inputs: data.csv, genome-diffs/*.gd
Outputs: mutants
Options: <same options as download-data>
You can use predict-mutations-breseq-mutants after this command to re-run breseq using the input reads against the hypothesized mutant genome sequences. If their lists of mutations are correct and complete, the output should now show no predicted mutations.
Generates files that can be loaded in IGV to view sequences (FASTA/FAI), reads (BAM/BAI), and annotations (GFF). Maps reads to the provided reference using minimap2 for nanopore reads and bowtie2 for Illumina reads.
Inputs: data.csv
Outputs: align-reads-references/data
Options: <same options as download-data>
Analyzes and plots soft-clipped reads after mapping.
Inputs: align-reads-references/data
Outputs: align-reads-references/soft-clipping
Options: <same options as download-data>
Combines gene annotations from Prokka with IS element annotations from ISEScan into a final GenBank file for each sample.
Inputs: data.csv, references
Outputs: annotated-references
Options: <same options as download-data>
- Autocycler is not available on bioconda. Download the release for your OS from the Autocycler GitHub.
- DO NOT download the binary into the `brefito` folder. This interferes with the execution of Snakemake.
- Add the path to the folder that contains the `autocycler` binary to your $PATH variable.
- Clone the Autocycler repository anywhere on your system. DO NOT clone it into the `brefito` folder.
  git clone https://github.com/rrwick/Autocycler.git
- Add the path to the `scripts/` folder of this repository to your $PATH.
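After completing the setup steps above, it can help to confirm that the binary is actually visible on your $PATH before starting a workflow. The check below is a small sketch using Python's standard library; the `missing_from_path` helper is hypothetical and is not part of brefito or Autocycler.

```python
import shutil


def missing_from_path(tool_names):
    """Return the names that cannot be found on the current PATH."""
    return [name for name in tool_names if shutil.which(name) is None]


missing = missing_from_path(["autocycler"])
if missing:
    print("Not found on PATH:", ", ".join(missing))
else:
    print("All tools found on PATH.")
```

If `autocycler` is reported as missing, recheck the folder you added to $PATH and open a new shell so the change takes effect.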
Uses Autocycler to generate a consensus assembly for each sample.
Inputs: data.csv
Outputs: autocycler/{sample}/output/consensus_assembly.fasta
Options: --config genome_size=<int>
Required. Supplies the estimated genome size in base pairs (e.g., 4600000).
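If you do not already know the genome size, one possible way to arrive at an estimate, assuming a closely related reference genome is available in FASTA format, is to sum the lengths of its sequences. The helper below is a hypothetical sketch and is not part of brefito; the tiny example records are made up for illustration.

```python
def total_sequence_length(fasta_lines):
    """Sum the lengths of all sequence lines, skipping '>' header lines."""
    total = 0
    for line in fasta_lines:
        line = line.strip()
        if line and not line.startswith(">"):
            total += len(line)
    return total


# Made-up FASTA content; with a real file, pass an open file handle instead.
example = [">contig_1", "ACGTACGT", "ACGT", ">contig_2", "GGCC"]
genome_size = total_sequence_length(example)
print(f"--config genome_size={genome_size}")
```

Because the function accepts any iterable of lines, an open file handle for a reference FASTA can be passed in place of the example list.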