Multiple sequences in a single file #37

Open
@Pigrenok

Description

Hello!

It is not clear from the specification whether a GFF3 file must be sorted by seqid when multiple seqids are present in the file.

I received a file where this is not the case: first there are lines of type gene for multiple seqids, and then multiple mRNA lines for the same set of seqids, whose parents are the genes described above.

The reader I use (the scikit-bio read function) treats each occurrence of a seqid as a new name. If a specific sequence ID is provided, it reads only the first record (I presume because it encounters a different seqid after that).
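
The two possible behaviours can be sketched in plain Python. This is only an illustration under assumptions, not scikit-bio's actual code: the seqids, the feature lines, and the helper names `group_contiguous` and `group_by_seqid` are all made up for the example. The first helper starts a new group every time the seqid changes (which seems to match what I observe), while the second collects all lines of a seqid regardless of their order in the file.

```python
from collections import defaultdict
from itertools import groupby

# Hypothetical, abbreviated GFF3 feature lines (tab-separated), laid out like
# the file described above: all gene lines first, then all mRNA lines.
GFF3_LINES = [
    "chr1\tsrc\tgene\t1\t100\t.\t+\t.\tID=gene1",
    "chr2\tsrc\tgene\t1\t200\t.\t+\t.\tID=gene2",
    "chr1\tsrc\tmRNA\t1\t100\t.\t+\t.\tID=mrna1;Parent=gene1",
    "chr2\tsrc\tmRNA\t1\t200\t.\t+\t.\tID=mrna2;Parent=gene2",
]


def seqid(line):
    """Return the first column (seqid) of a GFF3 feature line."""
    return line.split("\t", 1)[0]


def group_contiguous(lines):
    """One group per uninterrupted run of lines sharing a seqid."""
    return [(key, list(run)) for key, run in groupby(lines, key=seqid)]


def group_by_seqid(lines):
    """One group per seqid, no matter where its lines appear in the file."""
    groups = defaultdict(list)
    for line in lines:
        groups[seqid(line)].append(line)
    return dict(groups)


contiguous = group_contiguous(GFF3_LINES)
merged = group_by_seqid(GFF3_LINES)

# Four separate groups for the interleaved file: chr1, chr2, chr1, chr2
print([key for key, _ in contiguous])
# Two groups, each holding both the gene and the mRNA line
print({key: len(lines) for key, lines in merged.items()})
```

With a file sorted by seqid both helpers give the same groups; they differ only for interleaved files like mine, which is why it matters whether the specification requires sorting.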

So my problem is that, because this is not specified, I cannot tell whether the reader's behaviour is incorrect, or whether the reader is being strict and correct and the file itself is formatted incorrectly.

Thank you very much for any clarification.
