This repository has been archived by the owner on Jan 13, 2022. It is now read-only.

"reference seqeunces contain ambiguous bases not found in the provided alphabet and will be skipped." #120

Open
swZhang1 opened this issue Nov 27, 2021 · 1 comment

@swZhang1

Hello,
Something seems to go wrong with the 'prepare_mapped_reads.py' script, although it runs to completion without interruption.
Specifically, my fast5 input and reference_reads.fastq contain about 800k reads, but over 780k of them were filtered out because their reference sequences contain the base N. The log output is below:

log:
	Running prepare_mapping using flip-flop remapping
	Converting references to labels using canonical alphabet ACGT and no modified bases
	* 782465 reference seqeunces contain ambiguous bases not found in the provided alphabet and will be skipped.
	* 44132 reads mapped successfully
	* 1079868 reads failed to produce remapping results due to: No fasta reference found.

So only 44132 reads were successfully processed and saved to the mapped_reads.fast5 file.

How can I deal with this?

Looking forward to your advice

Kerry, 20211127
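
One way to see what triggers the "ambiguous bases" message is to scan the reference file for characters outside the canonical alphabet ACGT reported in the log. The following is a minimal sketch, not part of taiyaki: the helper names `parse_fasta` and `non_canonical_bases` are hypothetical, and it assumes the references are in plain FASTA format and that record names are the first whitespace-delimited token of each header line.

```python
from collections import Counter


def parse_fasta(lines):
    """Yield (name, sequence) pairs from an iterable of FASTA lines."""
    name, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            # Assumption: the record name is the first token of the header.
            tokens = line[1:].split()
            name, chunks = (tokens[0] if tokens else ""), []
        elif name is not None:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def non_canonical_bases(lines, alphabet="ACGT"):
    """Map each record name to a count of its characters outside `alphabet`.

    Characters are compared exactly as written, so lowercase bases and
    U (as in RNA sequences) are reported along with N.
    """
    allowed = set(alphabet)
    report = {}
    for name, seq in parse_fasta(lines):
        extra = Counter(base for base in seq if base not in allowed)
        if extra:
            report[name] = dict(extra)
    return report


# Example use on a real file (path taken from the command in this issue):
# with open("read_references.fasta") as handle:
#     report = non_canonical_bases(handle)
# print(len(report), "references contain characters outside ACGT")
```

If most of the flagged records contain only a handful of N characters, the references themselves are likely the cause. If they contain other symbols, such as U, the mismatch may instead be with the alphabet.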

@swZhang1

swZhang1 commented Nov 27, 2021

BTW, the command line I ran was:

taiyaki-master/bin/prepare_mapped_reads.py --jobs 1 fast5/ read_params.tsv mapped_reads.hdf5 taiyaki_walkthrough/pretrained/r941_rna_minion.checkpoint2 read_references.fasta
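
The "No fasta reference found" line in the log suggests that many read IDs have no matching record in read_references.fasta. A rough way to check this is to compare the IDs in the per-read parameter file with the FASTA header names. This is a sketch under assumptions, not taiyaki's own logic: the function names are hypothetical, the `UUID` column name in read_params.tsv is assumed, and matching on the first token of each FASTA header is also an assumption.

```python
import csv


def fasta_ids(lines):
    """Collect the first token of every FASTA header line."""
    ids = set()
    for line in lines:
        if line.startswith(">"):
            tokens = line[1:].split()
            if tokens:
                ids.add(tokens[0])
    return ids


def tsv_ids(lines, column="UUID"):
    """Collect read IDs from a tab-separated file with a header row.

    Assumption: the read ID column is named `column`.
    """
    reader = csv.DictReader(lines, delimiter="\t")
    return {row[column] for row in reader}


def compare_ids(read_ids, ref_ids):
    """Summarise how many reads have a reference and which ones do not."""
    return {
        "with_reference": len(read_ids & ref_ids),
        "missing_reference": sorted(read_ids - ref_ids),
    }


# Example use with the files named in the command above:
# with open("read_params.tsv") as tsv, open("read_references.fasta") as fa:
#     summary = compare_ids(tsv_ids(tsv), fasta_ids(fa))
# print(summary["with_reference"], len(summary["missing_reference"]))
```

A large `missing_reference` list would point to a naming mismatch between the reads and the reference headers rather than a problem with the sequences.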
