📖Full documentation on GitHub Pages📖
utia-gc/ngs is a Nextflow pipeline for base NGS analysis.
While utia-gc/ngs can be run on any platform supported by Nextflow, it is developed for use in HPC environments, specifically ISAAC Next Generation at the University of Tennessee, Knoxville.
Warning
Running utia-gc/ngs directly is usually not a good idea. This pipeline is designed as a starting point for pipelines dedicated to specific analyses and is generally not meant to be run itself. It will rarely have scheduled or tagged releases, so it cannot be relied on for reproducible analyses. In nearly all cases, users should either check the forked repos for a pipeline built from utia-gc/ngs that better suits their needs, or fork the repo to create their own versioned releases of a pipeline built on utia-gc/ngs.
flowchart LR
%% list all the input files
samplesheet>"Samplesheet"]
adapter_fasta>"
Adapter
Fasta
"]
genome_fasta>"
Genome
FASTA
"]
annotations_gtf>"
Annotations
GTF
"]
%% list all the internal Nextflow channels
raw_reads[("
Raw
reads
")]
prealign_reads[("
Prealign
reads
")]
trim_log[("
Trim
log
")]
individual_alignments[("
Individual
alignments
")]
merged_alignments[("
Merged
alignments
")]
%% list all the Nextflow processes
fastp{"fastp"}
cutadapt{"cutadapt"}
fastqc{"FastQC"}
seq_depth{"
Sequencing
Depth
"}
bwa_mem2{"bwa-mem2"}
STAR{"STAR"}
samtools_sort{"
samtools
sort
index
"}
gatk_MergeSamFiles{"
gatk
MergeSamFiles
"}
gatk_MarkDuplicates{"
gatk
MarkDuplicates
"}
samtools_idxstats{"
samtools
idxstats
"}
samtools_flagstat{"
samtools
flagstat
"}
samtools_stats{"
samtools
stats
"}
%% list all subgraphs for Nextflow subworkflows/workflows with options
subgraph inputs["Input Files"]
samplesheet
adapter_fasta
genome_fasta
annotations_gtf
end
subgraph trim_reads["Trim Reads"]
fastp
cutadapt
end
subgraph map_reads["Map Reads"]
bwa_mem2
STAR
end
subgraph publish_reports["Publish Reports"]
reads_mqc
alignments_mqc
full_mqc
end
subgraph publish_data["Publish Data"]
alignments
end
%% list all the published reports files
reads_mqc((("
Reads
MultiQC
")))
alignments_mqc((("
Alignments
MultiQC
")))
full_mqc((("
Full MultiQC
")))
%% list all the published data files
alignments[["
Alignments
"]]
%% reads processing workflow
samplesheet --> raw_reads
adapter_fasta --- fastp
raw_reads --- trim_reads --> prealign_reads
%% reads QC workflow
raw_reads --- fastqc --x reads_mqc
prealign_reads --- fastqc --x reads_mqc
trim_reads --> trim_log --x reads_mqc
raw_reads --- seq_depth --x reads_mqc
prealign_reads --- seq_depth --x reads_mqc
%% reads mapping workflow
genome_fasta --- map_reads
annotations_gtf --- map_reads
prealign_reads --- map_reads
%% alignments processing workflow
map_reads --- samtools_sort --> individual_alignments
individual_alignments --- gatk_MergeSamFiles --- gatk_MarkDuplicates --> merged_alignments
merged_alignments --x alignments
%% alignments QC workflow
individual_alignments --- samtools_idxstats --x alignments_mqc
individual_alignments --- samtools_flagstat --x alignments_mqc
merged_alignments --- samtools_stats --x alignments_mqc
%% Full MultiQC
reads_mqc --x full_mqc
alignments_mqc --x full_mqc
- Any POSIX-compatible system (e.g. Linux, OS X) with internet access
  - On Windows, run under Windows Subsystem for Linux (WSL). WSL2 is highly recommended.
- Nextflow version >= 21.04
  - See Nextflow Get started for prerequisites and instructions on installing and updating Nextflow.
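As a quick way to check the Nextflow version requirement above, here is a minimal sketch. It assumes that `sort -V` (version sort, as in GNU coreutils) is available and that `nextflow -v` prints a line of the form `nextflow version X.Y.Z.BUILD`. The helper name `version_ge` is made up for this example and is not part of Nextflow or utia-gc/ngs.

```shell
#!/bin/sh
# Minimum version taken from the requirement listed above.
REQUIRED="21.04"

# version_ge A B: succeeds when version A >= version B.
# Assumption: `sort -V` (version sort) is available, as in GNU coreutils.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

if command -v nextflow >/dev/null 2>&1; then
    # Assumption: `nextflow -v` prints "nextflow version X.Y.Z.BUILD",
    # so the third whitespace-separated field is the version string.
    installed="$(nextflow -v | awk '{print $3}')"
    if version_ge "$installed" "$REQUIRED"; then
        echo "Nextflow $installed satisfies >= $REQUIRED"
    else
        echo "Nextflow $installed is older than $REQUIRED; consider updating"
    fi
else
    echo "nextflow not found on PATH"
fi
```

If the check reports an older version, the Nextflow Get started guide referenced above covers updating.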
- Download or update utia-gc/ngs:
  nextflow pull utia-gc/ngs
- Show project info:
  nextflow info utia-gc/ngs
- Check that utia-gc/ngs works on your system:
  - The -profile nf_test option uses preconfigured test parameters to run utia-gc/ngs in full on a small test dataset stored in a remote GitHub repository.
  - Because these test files are stored in a remote repository, internet access is required to run the test.
  - For more information, see the profiles section of the nextflow config file.
  nextflow run utia-gc/ngs \
      -revision main \
      -profile nf_test
Important
In accordance with best practices for reproducible analysis, always use the -revision option in nextflow run to specify a tagged and/or released version of the pipeline.
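One way to make pinning a habit is a small wrapper that refuses to build a run command without a revision. The sketch below only prints the command rather than executing it. The function name `pinned_run_cmd`, the pipeline name `your-org/your-pipeline`, and the tag `v1.0.0` are placeholders for illustration, not real releases.

```shell
#!/bin/sh
# pinned_run_cmd PIPELINE REVISION
# Prints (does not execute) a `nextflow run` command pinned to REVISION.
# Fails if no revision is given, so an unpinned run cannot slip through.
pinned_run_cmd() {
    pipeline="$1"
    revision="$2"
    if [ -z "$revision" ]; then
        echo "error: a revision (tag or commit) is required" >&2
        return 1
    fi
    printf 'nextflow run %s -revision %s\n' "$pipeline" "$revision"
}

# Placeholder pipeline and tag; substitute a tag that exists in your fork.
pinned_run_cmd "your-org/your-pipeline" "v1.0.0"
```

Recording the printed command alongside the results is one simple way to keep track of which pipeline version produced them.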
TODO