Show a warning if special fasta headers format is violated

In a large dataset of automatically downloaded sequences there can be names including "|" symbol. 
I concatenate class and train/test labels also automatically.
So, when I try to analyze this file, there are uninformative error messages like:

* ValueError: could not convert string to float: 'P42577.2'
* ValueError: invalid literal for int() with base 10: '6LPD'

which are caused by incorrect fasta headers:

* P42577.2_sp\|P42577.2\|FRIS_LYMST\|0\|training
* 6LPD_pdb\|6LPD\|F\|1\|training

A simple check when importing the file could show a warning to the user.



Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Show a warning if special fasta headers format is violated #2

Metadata

Assignees

Labels

Projects

Milestone

Relationships

Development

Show a warning if special fasta headers format is violated #2

Description

Metadata

Metadata

Assignees

Labels

Projects

Milestone

Relationships

Development

Issue actions