A pipeline to remove human DNA from long-read metagenomes and extract microbial reads.

ayoraind/hDNA_removal_and_mapping_stats

A workflow to remove human DNA contaminants from microbial reads.

Usage


=======================================================================
 HUMAN DNA REMOVAL AND MAPPING STATISTICS TAPIR Pipeline version 1.0dev
=======================================================================
 The typical command for running the pipeline is as follows:
        nextflow run main.nf --reads "PathToReadFile(s)" --output_dir "PathToOutputDir" --reference_fasta "PathToRefFasta" 

        Mandatory arguments:
         --reads                        Query fastq.gz file(s) of sequences you wish to supply as input (e.g., "/MIGE/01_DATA/01_FASTQ/T055-8-*.fastq.gz")
         --reference_fasta              fasta file to be used as reference (e.g., /path/to/GRCh38.primary_assembly.genome.fa)
         --output_dir                   Output directory to place output (e.g., "/MIGE/01_DATA/03_ASSEMBLY")
         
        Optional arguments:
         --help                         This usage statement.
         --version                      Version statement

Introduction

This pipeline maps reads against the human genome, removes the reads identified as human DNA contaminants, estimates the proportion of reads that do and do not align to the human genome, and calculates a few additional descriptive statistics. Parts of this pipeline were adapted from the nf-core minimap2 module.
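
To make the "proportion of reads" statistic concrete, here is a minimal Python sketch of how such a figure could be derived from alignments in SAM text format. It is not the pipeline's own code: the function name `summarise_alignments` and the returned keys are hypothetical, and it assumes that a read counts as human when its primary alignment is mapped (SAM flag bit 0x4 unset), ignoring secondary and supplementary records.

```python
def summarise_alignments(sam_lines):
    """Count primary reads that map / do not map to the reference.

    Hypothetical helper for illustration only; not part of the pipeline.
    """
    mapped = unmapped = 0
    for line in sam_lines:
        # Skip blank lines and SAM header lines (they start with "@").
        if not line.strip() or line.startswith("@"):
            continue
        flag = int(line.split("\t")[1])  # second SAM column is the bitwise FLAG
        # 0x100 = secondary, 0x800 = supplementary: skip to count each read once.
        if flag & 0x900:
            continue
        if flag & 0x4:  # 0x4 = read unmapped
            unmapped += 1
        else:
            mapped += 1
    total = mapped + unmapped
    return {
        "total_reads": total,
        "human_reads": mapped,
        "non_human_reads": unmapped,
        "human_fraction": mapped / total if total else 0.0,
    }


if __name__ == "__main__":
    # Toy records with only the first columns filled in (assumed data).
    example = [
        "@HD\tVN:1.6",
        "read1\t0\tchr1\t100",
        "read2\t4\t*\t0",
        "read3\t16\tchr2\t500",
        "read3\t2064\tchr2\t900",  # supplementary record of read3
        "read4\t4\t*\t0",
    ]
    print(summarise_alignments(example))
```

In this sketch the reads counted as non-human are the ones a decontamination step would keep as microbial reads.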

Sample command

An example of a command to run this pipeline is:

nextflow run main.nf --reads "Sample_files/*.fastq.gz" --output_dir "test2" --reference_fasta "PathToRefFasta"
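
When the command is assembled from a script, the glob passed to `--reads` needs to stay quoted so that the shell does not expand it before the pipeline sees it. The following Python sketch shows one way to build the command string safely; the helper name `build_command` is hypothetical, and only the three mandatory flags from the usage statement are used.

```python
import shlex


def build_command(reads, output_dir, reference_fasta, script="main.nf"):
    """Return a shell-safe command string for running the pipeline.

    Hypothetical helper for illustration; the flags come from the usage text.
    """
    args = [
        "nextflow", "run", script,
        "--reads", reads,
        "--output_dir", output_dir,
        "--reference_fasta", reference_fasta,
    ]
    # shlex.join quotes any argument containing shell metacharacters such as "*".
    return shlex.join(args)


if __name__ == "__main__":
    print(build_command("Sample_files/*.fastq.gz", "test2", "PathToRefFasta"))
```

The reference path is left as the placeholder `PathToRefFasta` from the sample command and must be replaced with a real FASTA path.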

Note

This is an ongoing project at the Microbial Genome Analysis Group, Institute for Infection Prevention and Hospital Epidemiology, Universitätsklinikum Freiburg. The project is funded by the BMBF, Germany, and is led by Dr. Sandra Reuter.

Authors and acknowledgment

The TAPIR (Track Acquisition of Pathogens In Real-time) team.
