Uncaught OutOfIndexError when an unmapped read is positioned beyond the chromosome's length. #13

Closed
MaelLefeuvre opened this issue Jul 19, 2023 · 0 comments · Fixed by #14
Labels
bug Something isn't working

Bug description

Applying pmd-mask to a file containing reads whose alignment extends past the length given in the LN field of the corresponding @SQ header line causes a runtime panic.

First encountered by @J-Sauvage while applying pmd-mask to sample MX210 (Furtwangler, 2020).

A description of these "spurious" BAM records can be found in the SAM format specification (page 10):

If POS plus the sum of lengths of M/=/X/D/N operations in CIGAR exceeds the length specified in
the LN field of the @SQ header line (if exists) with an SN equal to RNAME, the alignment should
be unmapped, unless the reference sequence is circular (see below)
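
The rule quoted above can be sketched as a small check. This is only an illustration, not code from pmd-mask: the function names `reference_span` and `overhangs_reference` are hypothetical, the CIGAR is parsed from its text form, and the sketch assumes a linear (non-circular) reference. It compares the last covered base (`POS + span - 1`, 1-based) against LN, which is one reasonable reading of the sentence.

```rust
/// Sum the lengths of the reference-consuming CIGAR operations (M, =, X, D, N).
/// Returns None if the CIGAR string is malformed (e.g. "*", "M", or "12").
/// Hypothetical helper, not part of pmd-mask.
fn reference_span(cigar: &str) -> Option<u64> {
    let mut span = 0u64;
    let mut len = 0u64;
    let mut has_digits = false;
    for c in cigar.chars() {
        if let Some(d) = c.to_digit(10) {
            // Accumulate the decimal length that precedes each operation.
            len = len.checked_mul(10)?.checked_add(u64::from(d))?;
            has_digits = true;
        } else {
            if !has_digits {
                return None; // an operation without a length
            }
            match c {
                'M' | '=' | 'X' | 'D' | 'N' => span = span.checked_add(len)?,
                'I' | 'S' | 'H' | 'P' => {} // these do not consume the reference
                _ => return None,
            }
            len = 0;
            has_digits = false;
        }
    }
    // Trailing digits without an operation are malformed.
    if has_digits { None } else { Some(span) }
}

/// True when the last reference base covered by the alignment lies past `ln`.
/// `pos` is the 1-based POS field; `ln` is the LN value of the matching @SQ line.
/// Assumption: the reference is linear (circular references are exempt in the spec).
fn overhangs_reference(pos: u64, cigar: &str, ln: u64) -> Option<bool> {
    let span = reference_span(cigar)?;
    let end = pos.checked_add(span)?.saturating_sub(1);
    Some(span > 0 && end > ln)
}
```

For the record reported in this issue (POS 59373566, CIGAR 36M, LN 59373566), the last covered base would be 59373601, so `overhangs_reference` returns `Some(true)`.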

Minimal reproducible example:

Run pmd-mask on a file containing the following line, with GRCh37 as a reference:

SRR11179329.7101785     4       Y       59373566        37      36M     *       0       0       GGATCACAGGTCTATCACCCTATTAACCACTCACGG    AAFFFJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJ    X0:i:1  X1:i:0  MD:Z:0C35       PG:Z:MarkDuplicates     RG:Z:MX210.sort XG:i:0  NM:i:1  XM:i:1  XN:i:1  XO:i:0  XT:A:U

Note that the expected length of chromosome Y on GRCh37 is exactly 59373566.

Output with TRACE log-level is:

[2023-07-19T16:04:52 TRACE pmd_mask] ---- Inspecting record: Y + 59373565
[2023-07-19T16:04:52 TRACE pmd_mask] Relevant thresholds: (5p: 2bp) (3p: 3bp)
[2023-07-19T16:04:52 TRACE pmd_mask] CIGAR              : 36M
[2023-07-19T16:04:52 TRACE pmd_mask] Reference: N
[2023-07-19T16:04:52 TRACE pmd_mask] Sequence : GGATCACAGGTCTATCACCCTATTAACCACTCACGG
thread 'main' panicked at 'index out of bounds: the len is 1 but the index is 34', /data/mlefeuvre/dev/aDNA-pipeline/workflow/scripts/pmd-mask/src/lib.rs:25:12
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
Aborted (core dumped)
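
The trace shows a reference slice of length 1 (`N`) being indexed at position 34. The sketch below shows one way such an access could return an error instead of panicking, using the standard `slice::get`. It is not the fix merged for this issue and does not reproduce the code at `lib.rs:25`; `reference_base_at` and `ReferenceTooShort` are hypothetical names introduced for illustration.

```rust
use std::fmt;

/// Hypothetical error type: the fetched reference slice is shorter than the
/// index requested for it.
#[derive(Debug, PartialEq, Eq)]
struct ReferenceTooShort {
    index: usize,
    len: usize,
}

impl fmt::Display for ReferenceTooShort {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(
            f,
            "reference slice has length {} but index {} was requested",
            self.len, self.index
        )
    }
}

impl std::error::Error for ReferenceTooShort {}

/// Bounds-checked lookup: `slice::get` returns None where `reference[index]`
/// would panic, so the caller can skip or report the record.
fn reference_base_at(reference: &[u8], index: usize) -> Result<u8, ReferenceTooShort> {
    reference.get(index).copied().ok_or(ReferenceTooShort {
        index,
        len: reference.len(),
    })
}
```

With the values from the trace, `reference_base_at(b"N", 34)` yields an `Err` carrying `index: 34` and `len: 1`, which a caller could log before moving on to the next record.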
@MaelLefeuvre MaelLefeuvre added the bug Something isn't working label Jul 19, 2023
MaelLefeuvre added a commit that referenced this issue Jul 19, 2023
@MaelLefeuvre MaelLefeuvre linked a pull request Aug 8, 2023 that will close this issue
@MaelLefeuvre MaelLefeuvre self-assigned this Aug 10, 2023