- trim: Filter, trim, and rename the reads in a FASTQ file according to the barcodes used
- dedup: Remove PCR duplicates from a sorted BAM file
Download the binary from releases
wget "https://github.com/Avsecz/nimnexus/blob/master/nimnexus?raw=true" -O nimnexus && chmod u+x nimnexus
Put it into a folder specified in $PATH.
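A minimal sketch of that step, assuming `~/.local/bin` as the install directory (a hypothetical choice; any directory already listed in `$PATH` works just as well):

```shell
# Assumed install location; replace with any directory you prefer.
install_dir="${HOME}/.local/bin"
mkdir -p "$install_dir"

# Move the downloaded binary only if it is present in the current directory.
if [ -f nimnexus ]; then
    mv nimnexus "$install_dir/"
fi

# Add the directory to PATH for the current shell session if it is not there yet.
case ":$PATH:" in
    *":$install_dir:"*) ;;
    *) export PATH="$install_dir:$PATH" ;;
esac
```

To make the change permanent, the `export` line would have to go into your shell's startup file.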
$ nimnexus trim --help
nimnexus version:0.1.0
Trim the fastq reads
Usage: nimnexus trim [options] <barcode>
<barcode> Barcode sequences (comma-separated) that follow random barcode
-t --trim <int> Pre-trim all reads by this length before processing [default: 0]
-k --keep <int> Minimum number of bases required after barcode to keep read [default: 18]
-r --randombarcode <int> Number of bases at the start of each read used for random barcode [default: 5]
zcat input.fastq.gz | nimnexus trim -t 1 CTGA,TGAC,GACT,ACTG | gzip -c > output.fastq.gz
# Using pigz to (de-)compress in parallel
pigz -cd input.fastq.gz | nimnexus trim -t 1 CTGA,TGAC,GACT,ACTG | pigz -c > output.fastq.gz
zcat tests/data/mesc_pbx_raw_sample.fastq.gz | ./nimnexus trim -t 1 CTGA,TGAC,GACT,ACTG > /tmp/output.fastq
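To make the options concrete, the sketch below splits a single made-up read by position. It assumes the layout implied by the help text (pre-trimmed bases, then the random barcode, then one of the fixed barcodes, then the remaining sequence) and is not how nimnexus itself is implemented. The read and all variable names are hypothetical.

```shell
# Made-up read: 1 pre-trim base + 5 random bases + fixed barcode CTGA + 22 remaining bases.
read_seq="NACGTACTGATTTGCAAGGCTAGCTAGGATCC"
pretrim=1       # corresponds to -t
random_len=5    # corresponds to -r
fixed="CTGA"    # one of the comma-separated <barcode> values
min_keep=18     # corresponds to -k

# Drop the pre-trimmed bases from the start of the read.
trimmed=$(printf '%s' "$read_seq" | cut -c $((pretrim + 1))-)

# Split the rest into random barcode, fixed barcode, and remaining sequence.
fixed_len=${#fixed}
random_bc=$(printf '%s' "$trimmed" | cut -c 1-"$random_len")
fixed_bc=$(printf '%s' "$trimmed" | cut -c $((random_len + 1))-$((random_len + fixed_len)))
insert=$(printf '%s' "$trimmed" | cut -c $((random_len + fixed_len + 1))-)

# Assumed rule: keep the read if the fixed barcode matches and enough bases remain.
if [ "$fixed_bc" = "$fixed" ] && [ "${#insert}" -ge "$min_keep" ]; then
    verdict="keep"
else
    verdict="drop"
fi
echo "random=$random_bc fixed=$fixed_bc insert_len=${#insert} verdict=$verdict"
```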
$ nimnexus dedup --help
nimnexus version:0.1.0
Remove duplicate reads from the sorted bam file
Usage: nimnexus dedup [options] <BAM>
<BAM> sorted BAM file
-t --threads <int> number of BAM decompression threads [default: 2]
nimnexus dedup -t 10 file.bam | samtools view -b > file.dedup.bam
Note: nimnexus dedup writes its output in SAM format to stdout, so samtools view -b is used to convert SAM to BAM.
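Because dedup expects a sorted BAM file, an unsorted alignment has to be sorted first. A possible sketch, assuming coordinate sorting (the `samtools sort` default) is what is required; the file names are hypothetical, and the guard only skips the commands when the tools or the input file are missing:

```shell
in_bam="file.bam"                         # hypothetical input file
sorted_bam="${in_bam%.bam}.sorted.bam"    # file.sorted.bam
dedup_bam="${in_bam%.bam}.dedup.bam"      # file.dedup.bam

if command -v samtools >/dev/null 2>&1 &&
   command -v nimnexus >/dev/null 2>&1 &&
   [ -f "$in_bam" ]; then
    # Sort by coordinate, then de-duplicate and convert the SAM output back to BAM.
    samtools sort -o "$sorted_bam" "$in_bam"
    nimnexus dedup -t 10 "$sorted_bam" | samtools view -b -o "$dedup_bam" -
fi
```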
- The original scripts, written in R, were implemented by Melanie Weilert and Jeff Johnston. Repository: mlweilert/chipnexus-processing-scripts. Matching scripts:
  - nimnexus trim <-> scripts/preprocess_fastq
  - nimnexus dedup <-> scripts/process_bam.r
- The Nim package template was taken from Brent Pedersen's bpbio Nim package: https://github.com/brentp/bpbio