Unrecognized sequencing technology #9

@pdimens

Hello,

I'm trying to use LEVIATHAN and need the barcode index from LRez. I introduced a step in my workflow that uses samtools view to filter reads by mapping quality, and since then LRez no longer seems to recognize the BX:Z tags (it did previously). These are haplotagging data, where barcodes have the form AXXCXXBXXDXX.

$ LRez index bam -p -b 2A_3_221221_15x.bam -o test.bci
determineSequencingTechnology: Unrecognized sequencing technology. Please make sure your barcodes originate from a compatible technology or are reported as nucleotides in the BX:Z tag.

Unless I'm mistaken, my BAM file is formatted normally:

$ samtools view -h 2A_3_221221_15x.bam | head -18
@HD     VN:1.6  SO:coordinate
@SQ     SN:2L   LN:23513712
@SQ     SN:2R   LN:25286936
@SQ     SN:3L   LN:28110227
@SQ     SN:3R   LN:32079331
@SQ     SN:4    LN:1348131
@SQ     SN:X    LN:23542271
@SQ     SN:Y    LN:3667352
@RG     ID:2A_3_221221_15x      SM:2A_3_221221_15x
@PG     ID:bwa  PN:bwa  CL:bwa mem -C -t 6 -M -R @RG\tID:2A_3_221221_15x\tSM:2A_3_221221_15x Assembly/Assembly/dmel.trunc.fa Trimming/2A_3_221221_15x.R1.fq.gz Trimming/2A_3_221221_15x.R2.fq.gz  VN:0.7.17-r1188
@PG     ID:samtools     PN:samtools     CL:samtools view -h -F 4 -q 30 -t Assembly/Assembly/dmel.trunc.fa.fai -T Assembly/Assembly/dmel.trunc.fa -   PP:bwa  VN:1.17
@PG     ID:samtools.1   PN:samtools     CL:samtools sort -T Alignments/bwa/2A_3_221221_15x --reference Assembly/Assembly/dmel.trunc.fa -O bam -m 4G -o Alignments/bwa/2A_3_221221_15x.sort.bam -  PP:samtools     VN:1.17
@PG     ID:sambamba     CL:markdup -t 4 -l 4 Alignments/bwa/2A_3_221221_15x.sort.bam Alignments/bwa/2A_3_221221_15x.bam      PP:samtools.1   VN:1.0
@PG     ID:samtools.2   PN:samtools     PP:sambamba     VN:1.17 CL:samtools view -h 2A_3_221221_15x.bam
A00470:481:HNYFWDRX2:1:2177:6090:16266  1123    2L      4831    40      80M     =       5088407      CCAACATATTGTGCTAATGAGTGCCTCTCGTTCTCTGTCTTATATTACCGCAAACCCAAAAAGACAATACACGACAGAGA    FF,FFFF::FFFFF:FFFF:FFF:FFFFFF:FFFFFFFFFFF:FFF:FFFFF,FFFF,F,FFFFFF:FFFFFFFFFFFFF NM:i:0  MD:Z:80      MC:Z:150M       AS:i:80 XS:i:80 RG:Z:2A_3_221221_15x    BX:Z:A95C26B84D96 1:N:0:TATCAGTA+TTACTACT
A00470:481:HNYFWDRX2:1:2207:9426:24267  1123    2L      4831    40      80M     =       5112407      CCAACATATTGTGCTAATGAGTGCCTCTCGTTCTCTGTCTTATATTACCGCAAACCCAAAAAGACAATACACGACAGAGA    FF:F,F,FFF,FFFF,FFFFF::FFFFFFF,FFFF:FFFFFFF:FFFFF:FFF:F:FF,FFF,FF:F,FF,:FFFF,:,F NM:i:0  MD:Z:80      MC:Z:22S126M    AS:i:80 XS:i:80 RG:Z:2A_3_221221_15x    BX:Z:A95C26B84D96 1:N:0:TATCAGTA+TTACTACT
A00470:481:HNYFWDRX2:1:2254:21902:6699  1123    2L      4831    40      80M     =       5088407      CCAACATATTGTGCTAATGAGTGCCTCTCGTTCTCTGTCTTATATTACCGCAAACCCAAAAAGACAATACACGACAGAGA    FFF,F:FFF,FFFFF,FFFFFF:FFFFFFF:FFFFFFFFFF:FFFFFF:FFFFFFFF,FFFF,F::,FFFFF,:FF,FFF NM:i:0  MD:Z:80      MC:Z:150M       AS:i:80 XS:i:80 RG:Z:2A_3_221221_15x    BX:Z:A95C26B84D96 1:N:0:TATCAGTA+TTACTACT
A00470:481:HNYFWDRX2:1:2273:24334:6433  1123    2L      4831    40      80M     =       5088407      CCAACATATTGTGCTAATGAGTGCCTCTCGTTCTCTGTCTTATATTACCGCAAACCCAAAAAGACAATACACGACAGAGA    FFFFFF,F:FFFFFFF:FF:F:F,FFFFFFFFFFFF:FFFFFFFFFFFFFFF:FFFFFFFFFFF:FF,FFFF:FFFFFFF NM:i:0  MD:Z:80      MC:Z:150M       AS:i:80 XS:i:80 RG:Z:2A_3_221221_15x    BX:Z:A95C26B84D96 1:N:0:TATCAGTA+TTACTACT
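As a sanity check on what the BX values actually look like (for example, whether anything trails the barcode within the same field), a minimal sketch like the following could tally their shapes. The function name `bx_shapes` and the four shape categories are my own assumptions for illustration, not anything LRez defines, and I don't know which patterns LRez actually tests for.

```shell
# Tally the shape of every BX:Z value in SAM text read from stdin.
# Hypothetical helper; the categories below are illustrative assumptions:
#   haplotag    AxxCxxBxxDxx with two digits per segment
#   nucleotide  ACGTN only, with an optional numeric suffix such as -1
#   missing     no BX:Z field (or an empty value)
#   other       anything else, such as extra text after the barcode
bx_shapes() {
    awk -F '\t' '
        /^@/ { next }                      # skip header lines
        {
            bx = ""
            for (i = 12; i <= NF; i++)     # optional fields start at column 12
                if ($i ~ /^BX:Z:/) { bx = substr($i, 6); break }
            if (bx == "")
                shape = "missing"
            else if (bx ~ /^A[0-9][0-9]C[0-9][0-9]B[0-9][0-9]D[0-9][0-9]$/)
                shape = "haplotag"
            else if (bx ~ /^[ACGTN]+(-[0-9]+)?$/)
                shape = "nucleotide"
            else
                shape = "other"
            count[shape]++
        }
        END { for (s in count) print s "\t" count[s] }
    ' | sort
}
```

It would be run as `samtools view 2A_3_221221_15x.bam | bx_shapes`, ideally on both the filtered and the unfiltered BAM so the two tallies can be compared.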

Do you have any insight into whether this is a mistake on my end or a bug in LRez?
The workflow is recorded in the @PG header lines, but for clarity the steps are:

  1. map with bwa mem
  2. filter with samtools view
  3. sort and convert to bam with samtools sort
  4. mark duplicates with sambamba markdup
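For reproduction, the four steps above correspond roughly to the sketch below. The commands, flags, and paths are copied from the @PG header lines. The wrapper function name `map_filter_sort_markdup` is hypothetical, and the comment about what `-C` contributes is my assumption about how the BX tag reaches the SAM records.

```shell
# Hypothetical wrapper around the four workflow steps.
# Commands and flags are taken from the @PG lines of the BAM header.
map_filter_sort_markdup() {
    sample=$1
    ref=Assembly/Assembly/dmel.trunc.fa

    # 1. map with bwa mem; -C appends the FASTQ comment to each SAM record
    #    (assumed to be where the BX:Z tag comes from)
    bwa mem -C -t 6 -M -R "@RG\tID:${sample}\tSM:${sample}" "$ref" \
        "Trimming/${sample}.R1.fq.gz" "Trimming/${sample}.R2.fq.gz" |
    # 2. drop unmapped reads (-F 4) and reads with MAPQ below 30 (-q 30)
    samtools view -h -F 4 -q 30 -t "${ref}.fai" -T "$ref" - |
    # 3. sort and write BAM
    samtools sort -T "Alignments/bwa/${sample}" --reference "$ref" -O bam -m 4G \
        -o "Alignments/bwa/${sample}.sort.bam" - &&
    # 4. mark duplicates
    sambamba markdup -t 4 -l 4 "Alignments/bwa/${sample}.sort.bam" \
        "Alignments/bwa/${sample}.bam"
}
```

Calling `map_filter_sort_markdup 2A_3_221221_15x` from the project root should reproduce the file used in the LRez command above. Removing the `-q 30` filter in step 2 gives the earlier version of the workflow for comparison.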
