SMD-Bioinformatics-Lund/nextflow_microwgs

Nextflow pipeline for typing and marker detection of bacteria

Purpose

The pipeline aims to produce data useful for epidemiological and surveillance purposes. In v1 the pipeline has only been tested on MRSA, but it should work for any bacterial species that has a good cgMLST scheme.

Components

QC

Species detection is performed using Kraken2 together with Bracken. The database is a standard Kraken2 database built with: kraken2-build --standard --db $DBNAME

Low levels of intra-species contamination and erroneous mapping are removed by mapping with bwa and filtering out heterozygous bases.
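
As an illustration of the idea only, the sketch below masks positions where the mapped reads disagree. The README does not state how the filter is implemented, so the function name, the per-position count format and the 0.9 threshold are all assumptions made for this example.

```python
def mask_heterozygous(base_counts, min_major_fraction=0.9):
    """Build a consensus string, masking mixed positions with 'N'.

    base_counts: one dict per reference position, mapping base -> read count.
    min_major_fraction: hypothetical cutoff; a position is kept only if the
    most common base accounts for at least this share of the reads.
    """
    consensus = []
    for counts in base_counts:
        depth = sum(counts.values())
        if depth == 0:
            # No reads cover this position, so nothing can be called.
            consensus.append("N")
            continue
        base, top = max(counts.items(), key=lambda kv: kv[1])
        consensus.append(base if top / depth >= min_major_fraction else "N")
    return "".join(consensus)


# Toy pileup: position 2 is a clear mix of C and T and gets masked.
pileup = [{"A": 30}, {"C": 18, "T": 12}, {"G": 29, "A": 1}, {}]
masked = mask_heterozygous(pileup)
```

In a haploid bacterial genome a well-supported second base at one position usually points to a mixed sample or a mismapped read, which is why such positions are dropped rather than called.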

Genome coverage is estimated by mapping reads with bwa mem and measuring depth over a BED file containing the cgMLST loci.
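
A minimal sketch of restricting a coverage estimate to BED intervals follows. It assumes per-base depths are already available in memory; the function names and the toy data are hypothetical and are not taken from the pipeline.

```python
def parse_bed(text):
    """Parse BED text into (chrom, start, end) tuples.

    BED coordinates are 0-based with an exclusive end, which matches
    Python slicing directly.
    """
    intervals = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith(("#", "track", "browser")):
            continue
        chrom, start, end = line.split("\t")[:3]
        intervals.append((chrom, int(start), int(end)))
    return intervals


def mean_depth_over_loci(depth, intervals):
    """Mean per-base depth across all intervals.

    depth maps a contig name to a list of per-base read depths.
    """
    total = 0
    n_bases = 0
    for chrom, start, end in intervals:
        window = depth[chrom][start:end]
        total += sum(window)
        n_bases += len(window)
    return total / n_bases if n_bases else 0.0


# Two toy loci on one contig; the uncovered gap between them is ignored.
bed_text = "ref\t0\t4\tlocusA\nref\t6\t10\tlocusB\n"
depth = {"ref": [10, 12, 11, 9, 0, 0, 30, 28, 32, 30]}
loci = parse_bed(bed_text)
mean_cov = mean_depth_over_loci(depth, loci)
```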

Evenness of coverage is reported as an interquartile range.
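
The interquartile range can be sketched as below. The README does not say which values it is computed over (per-base or per-locus depth) or which quantile method is used, so the input and the "inclusive" method here are assumptions.

```python
import statistics


def coverage_iqr(depths):
    """Interquartile range (Q3 - Q1) of a list of depth values."""
    q1, _median, q3 = statistics.quantiles(depths, n=4, method="inclusive")
    return q3 - q1


# Two toy samples with similar typical depth but very different spread.
even = [28, 29, 30, 30, 31, 32]
uneven = [2, 5, 30, 30, 60, 95]

iqr_even = coverage_iqr(even)
iqr_uneven = coverage_iqr(uneven)
```

A small value means most of the measured depths sit close together. A large value flags a sample where some regions are covered far more deeply than others.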

Epidemiological typing

SPAdes is used for de novo assembly. QUAST is used to extract QC data from the assembly.

The cgMLST reference scheme is forked from cgmlst.net. At the moment this fork is not synced back with new allele numbers. Alleles are extracted with chewBBACA. The number of missing loci is calculated and used as a QC parameter.
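
Counting missing loci from one row of an allele-call table might look like the sketch below. It assumes a tab-separated table with a header of locus names, and treats any cell that is not an allele number (optionally prefixed with INF- for a newly inferred allele) as missing. That rule and the function names are assumptions, not the pipeline's own definition.

```python
def is_called(value):
    """True if a cell holds an allele number, optionally prefixed with INF-."""
    value = value.strip()
    if value.startswith("INF-"):
        value = value[len("INF-"):]
    return value.isdigit()


def count_missing_loci(header_line, sample_line):
    """Return (number of missing loci, list of their names) for one sample."""
    loci = header_line.rstrip("\n").split("\t")[1:]
    calls = sample_line.rstrip("\n").split("\t")[1:]
    if len(loci) != len(calls):
        raise ValueError("header and sample row differ in length")
    missing = [locus for locus, call in zip(loci, calls) if not is_called(call)]
    return len(missing), missing


# Toy row: locus2 and locus4 carry non-numeric status codes.
header = "FILE\tlocus1\tlocus2\tlocus3\tlocus4"
row = "sample1\t12\tLNF\tINF-7\tNIPH"
n_missing, which = count_missing_loci(header, row)
```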

Traditional 7-locus MLST is calculated using mlst.

Virulence and resistance markers

ARIBA is used to detect genetic markers. The database for virulence markers is VFDB.

Report and visualisation

The QC data is aggregated in a web service, CDM (repo coming), and the cgMLST results are visualised using a web service, cgviz, which is combined with GrapeTree for manipulating trees (repo coming).
