ValueError: More than one match to b'ASC.' / b'XSC.' #11

Open
moldovannorbert opened this issue Apr 4, 2022 · 1 comment

moldovannorbert commented Apr 4, 2022

When running xenomapper2 on BAM files mapped with bwa mem, I get the following errors from pylazybam:

xenomapper2 v2.0rc1 --primary=/scratch-shared/fmlab/nmoldovan/tmp/FM_seq_020/trimmed/7_xenomapper/1_mapping/LP0051_08_L001_primary.bam --secondary=/scratch-shared/fmlab/nmoldovan/tmp/FM_seq_020/trimmed/7_xenoma$

Traceback (most recent call last):
  File "/gpfs/home1/norbertm/projects/code/core_pipeline_310322/workflow/rules/7_xenomapper/.snakemake/conda/39b62994/bin/xenomapper2", line 8, in <module>
    sys.exit(main())
  File "/gpfs/home1/norbertm/projects/code/core_pipeline_310322/workflow/rules/7_xenomapper/.snakemake/conda/39b62994/lib/python3.10/site-packages/xenomapper2/cli.py", line 193, in main
    pair_counts, counts, writer = xenomap(primary_bam,
  File "/gpfs/home1/norbertm/projects/code/core_pipeline_310322/workflow/rules/7_xenomapper/.snakemake/conda/39b62994/lib/python3.10/site-packages/xenomapper2/xenomapper2.py", line 884, in xenomap
    forward_state, reverse_state = xenomap_states(primary_aligns,
  File "/gpfs/home1/norbertm/projects/code/core_pipeline_310322/workflow/rules/7_xenomapper/.snakemake/conda/39b62994/lib/python3.10/site-packages/xenomapper2/xenomapper2.py", line 685, in xenomap_states
    prim_f_AS, prim_f_XS = score_function(prim_f_aligns,
  File "/gpfs/home1/norbertm/projects/code/core_pipeline_310322/workflow/rules/7_xenomapper/.snakemake/conda/39b62994/lib/python3.10/site-packages/xenomapper2/xenomapper2.py", line 314, in get_bamprimary_AS_XS
    AS = AS_function(bamprimary[0])
  File "/gpfs/home1/norbertm/projects/code/core_pipeline_310322/workflow/rules/7_xenomapper/.snakemake/conda/39b62994/lib/python3.10/site-packages/pylazybam/tags.py", line 55, in get_AS
    raise ValueError(
ValueError: More than one match to b'ASC.' was found in b'I\x01\x00\x00\x04\x00\x00\x00ASC\x01'<V\x17\x01\x00S\x00\x97\x00\x00\x00\x04\x00\x00\x00!SC\x01I\xff\xff\xffA01685:16:HGHWVDSX3:1:1258:25373:19100\x00p$

or

xenomapper2 v2.0rc1 --primary=/scratch-shared/fmlab/nmoldovan/tmp/FM_seq_020/trimmed/7_xenomapper/1_mapping/LP0051_07_L001_primary.bam --secondary=/scratch-shared/fmlab/nmoldovan/tmp/FM_seq_020/trimmed/7_xenoma$

Traceback (most recent call last):
  File "/gpfs/home1/norbertm/projects/code/core_pipeline_310322/workflow/rules/7_xenomapper/.snakemake/conda/39b62994/bin/xenomapper2", line 8, in <module>
    sys.exit(main())
  File "/gpfs/home1/norbertm/projects/code/core_pipeline_310322/workflow/rules/7_xenomapper/.snakemake/conda/39b62994/lib/python3.10/site-packages/xenomapper2/cli.py", line 193, in main
    pair_counts, counts, writer = xenomap(primary_bam,
  File "/gpfs/home1/norbertm/projects/code/core_pipeline_310322/workflow/rules/7_xenomapper/.snakemake/conda/39b62994/lib/python3.10/site-packages/xenomapper2/xenomapper2.py", line 884, in xenomap
    forward_state, reverse_state = xenomap_states(primary_aligns,
  File "/gpfs/home1/norbertm/projects/code/core_pipeline_310322/workflow/rules/7_xenomapper/.snakemake/conda/39b62994/lib/python3.10/site-packages/xenomapper2/xenomapper2.py", line 685, in xenomap_states
    prim_f_AS, prim_f_XS = score_function(prim_f_aligns,
  File "/gpfs/home1/norbertm/projects/code/core_pipeline_310322/workflow/rules/7_xenomapper/.snakemake/conda/39b62994/lib/python3.10/site-packages/xenomapper2/xenomapper2.py", line 315, in get_bamprimary_AS_XS
    XS = XS_function(bamprimary[0])
  File "/gpfs/home1/norbertm/projects/code/core_pipeline_310322/workflow/rules/7_xenomapper/.snakemake/conda/39b62994/lib/python3.10/site-packages/pylazybam/tags.py", line 108, in get_XS
    raise ValueError(
ValueError: More than one match to b'XSC.' was found in b'N\x01\x00\x00\x07\x00\x00\x00XSC\x02'<V\x1b\x01\x00S\x00\x97\x00\x00\x00\x07\x00\x00\x00QSC\x02b\xff\xff\xffA01685:16:HGHWVDSX3:1:1161:21965:27946\x00p$

I am mapping and sorting reads in the previous step by:

bwa mem {params.ref} {input}
-M
-t {threads}
2> {log} |
samtools sort
-n
-@ {threads}
-o {output} 2>> {log}

Intriguingly this pipeline worked with another batch of files already, but now it is throwing the above error.

Edit: I forgot to add that this is 150 bp paired-end sequencing of genomic data.

genomematt self-assigned this Apr 5, 2022
genomematt added the bug (Something isn't working) label Apr 5, 2022

genomematt (Owner) commented

I have had a quick look and can't see the obvious candidate problem: some other information in the string being picked up by the regex search.
If you use samtools view, are these the first reads in the file?
The easiest way to proceed, if you don't mind, is to send me the files so I can reproduce the error. If you can use samtools view -b input.bam "chrom:start-end" > output.bam with an appropriate region to make a small file that reproduces the error, that would be even better.
If you can't do that, could you send me the full SAM record for that specific read? You can extract it with samtools view file.bam | grep -m1 "A01685:16:HGHWVDSX3:1:1161:21965:27946" -
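
One way to test the hypothesis above is to decode the bytes printed in the first error message. The following is a minimal sketch, not pylazybam's code: it assumes the record follows the fixed-field layout given in the SAM/BAM specification, and that the tag lookup is a regular-expression search over the whole raw record, as the error text suggests. RECORD_PREFIX is copied from the traceback (the bare ' written as \x27); TAG_PATTERN, decode_fixed_fields and tag_region_offset are hypothetical names introduced only for illustration.

```python
import re
import struct

# First 36 bytes of the record in the first error message: the 4-byte block
# size plus the 32-byte fixed-length section described in the BAM spec.
RECORD_PREFIX = (
    b"I\x01\x00\x00"       # block_size
    b"\x04\x00\x00\x00"    # refID
    b"ASC\x01"             # pos
    b"\x27<V\x17"          # l_read_name, mapq, bin
    b"\x01\x00S\x00"       # n_cigar_op, flag
    b"\x97\x00\x00\x00"    # l_seq
    b"\x04\x00\x00\x00"    # next_refID
    b"!SC\x01"             # next_pos
    b"I\xff\xff\xff"       # tlen
)

FIELDS = ("block_size", "refID", "pos", "l_read_name", "mapq", "bin",
          "n_cigar_op", "flag", "l_seq", "next_refID", "next_pos", "tlen")

# Assumption: stand-in for the pattern named in the error message.
TAG_PATTERN = re.compile(rb"ASC.", re.DOTALL)


def decode_fixed_fields(prefix: bytes) -> dict:
    """Unpack the little-endian fixed-length fields of one BAM record."""
    values = struct.unpack("<IiiBBHHHIiii", prefix[:36])
    return dict(zip(FIELDS, values))


def tag_region_offset(fields: dict) -> int:
    """Offset of the optional-tag section from the start of the record."""
    return (36                              # block_size + fixed fields
            + fields["l_read_name"]         # NUL-terminated read name
            + 4 * fields["n_cigar_op"]      # one uint32 per CIGAR operation
            + (fields["l_seq"] + 1) // 2    # 4-bit packed bases
            + fields["l_seq"])              # one quality byte per base


fields = decode_fixed_fields(RECORD_PREFIX)
hits = [m.start() for m in TAG_PATTERN.finditer(RECORD_PREFIX)]

print(fields["pos"], fields["flag"], fields["l_seq"])  # 21189441 83 151
print(hits)                       # [8]: the match lies inside the pos field
print(tag_region_offset(fields))  # 306
```

Under those assumptions, the only hit in the fixed-length section is the 4-byte pos field, whose little-endian bytes happen to spell ASC\x01; a search restricted to the bytes from tag_region_offset onward would not see it. The second error message shows XSC\x02 at the same offset.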
