-
Notifications
You must be signed in to change notification settings - Fork 43
Assembly with megahit
Francisco Zorrilla edited this page Mar 22, 2021
·
3 revisions
The megahit
assembly rule is implemented in the main Snakefile as follows:
rule megahit:
input:
R1 = rules.qfilter.output.R1,
R2 = rules.qfilter.output.R2
output:
f'{config["path"]["root"]}/{config["folder"]["assemblies"]}/{{IDs}}/contigs.fasta.gz'
benchmark:
f'{config["path"]["root"]}/benchmarks/{{IDs}}.megahit.benchmark.txt'
shell:
"""
set +u;source activate {config[envs][metabagpipes]};set -u;
cd $SCRATCHDIR
echo -n "Copying qfiltered reads to $SCRATCHDIR ... "
cp {input.R1} {input.R2} $SCRATCHDIR
echo "done. "
echo -n "Running megahit ... "
megahit -t {config[cores][megahit]} \
--presets {config[params][assemblyPreset]} \
--verbose \
--min-contig-len {config[params][assemblyMin]} \
-1 $(basename {input.R1}) \
-2 $(basename {input.R2}) \
-o tmp;
echo "done. "
echo "Renaming assembly ... "
mv tmp/final.contigs.fa contigs.fasta
echo "Fixing contig header names: replacing spaces with hyphens ... "
sed -i 's/ /-/g' contigs.fasta
echo "Zipping and moving assembly ... "
gzip contigs.fasta
mkdir -p $(dirname {output})
mv contigs.fasta.gz $(dirname {output})
echo "Done. "
"""
- Quality filter reads with fastp
- Assembly with megahit
- Draft bin sets with CONCOCT, MaxBin2, and MetaBAT2
- Refine & reassemble bins with metaWRAP
- Taxonomic assignment with GTDB-tk
- Relative abundances with bwa
- Reconstruct & evaluate genome-scale metabolic models with CarveMe and memote
- Species metabolic coupling analysis with SMETANA