This is not a NanoSim bug as such, but it is very relevant for Nanopore data. Samtools limits read headers to 254 characters (https://github.com/samtools/samtools/issues/10810). NanoSim (v3.2.2) does not seem to check that the call to minimap2 in "read_analysis.py" completes without error. So when the input read file contains read headers longer than 254 characters, samtools fails silently, NanoSim continues running, and it later throws this error:
./output/test_genome_alnm.bam
Traceback (most recent call last):
  File "/home/lshas17/miniforge3/envs/nanosimENV/bin/read_analysis.py", line 896, in <module>
    main()
  File "/home/lshas17/miniforge3/envs/nanosimENV/bin/read_analysis.py", line 606, in main
    alnm_ext, unaligned_length, strandness, unaligned_base_qualities = align_genome(in_fasta, prefix, aligner,
  File "/home/lshas17/miniforge3/envs/nanosimENV/bin/read_analysis.py", line 199, in align_genome
    unaligned_length, strandness, unaligned_base_quals = get_primary_sam.primary_and_unaligned(g_alnm, prefix, quantification, fastq=fastq)
  File "/home/lshas17/miniforge3/envs/nanosimENV/bin/get_primary_sam.py", line 188, in primary_and_unaligned
    strandness = float(pos_strand) / num_aligned
ZeroDivisionError: float division by zero
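Adding a return-code check to the calls to minimap2 (for example, on line 171 of "read_analysis.py") would let NanoSim stop at the failing step and help users fix the problem in their data, instead of surfacing later as an unrelated ZeroDivisionError.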
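A minimal sketch of what such a check could look like. I have not checked how NanoSim actually launches minimap2 and samtools, so the helper name `run_checked`, its `step_name` parameter, and the argument-list calling style are assumptions for illustration only, not NanoSim's real code:

```python
import subprocess


def run_checked(cmd, step_name):
    """Run an external command and fail loudly if it exits with an error.

    `run_checked` and `step_name` are hypothetical names, not part of NanoSim.
    `cmd` is an argument list such as ["minimap2", "-ax", "map-ont", ref, reads].
    """
    # Capture stderr so the tool's own error message can be shown to the user.
    result = subprocess.run(cmd, stderr=subprocess.PIPE, text=True)
    if result.returncode != 0:
        raise RuntimeError(
            f"{step_name} failed with exit code {result.returncode}:\n"
            f"{result.stderr.strip()}"
        )
    return result
```

If the aligner and samtools are chained in a single shell pipeline, the shell reports only the exit status of the last command unless `pipefail` is enabled, so each step would need to be checked separately (or run as separate processes) for a check like this to catch the failure.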