Questions about strands #78

@bschilder

@Zhihan-Leo-Liu had some questions about how GVL handles strand information.

I believe this also came up with @Al-Murphy recently, and I couldn't really answer it myself.

The GVL docs say the following about strands, but I don't really understand what it means:

How can I get multiple tracks/stranded data?
If you provide multiple tracks to gvl.write(), all of them can be returned simultaneously from the resulting Dataset and placed along the track axis, sorted by name. By default, a Dataset sets all tracks to active when opened. i.e. tracks have shape (batch, tracks, [ploidy], length).
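
To make the shape convention in that quote concrete, here is a minimal NumPy sketch. It is not GVL code and does not call gvl.write() or a Dataset. The track names `rna_plus` and `rna_minus` are hypothetical, and the optional ploidy axis is left out. It only illustrates what "placed along the track axis, sorted by name" would look like if each strand were stored as its own track.

```python
import numpy as np

batch, length = 2, 5

# Hypothetical per-strand tracks, each of shape (batch, length).
# These names are made up for illustration; they are not GVL defaults.
tracks = {
    "rna_plus": np.arange(batch * length, dtype=np.float32).reshape(batch, length),
    "rna_minus": np.zeros((batch, length), dtype=np.float32),
}

# Sorting by name decides the order along the track axis.
track_names = sorted(tracks)

# Stack along a new axis 1 to get shape (batch, tracks, length).
stacked = np.stack([tracks[name] for name in track_names], axis=1)

# Pick out one strand's track by looking up its position in the sorted names.
minus = stacked[:, track_names.index("rna_minus")]

print(track_names, stacked.shape, minus.shape)
```

Under this reading, stranded data would simply be two named tracks, and the strand of a given slice is recovered from its position in the sorted name list. Whether GVL also reverses tracks for negative-strand regions is not something this sketch answers.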

In my own tests of the splicing functionality, I noticed that when I take the reverse complement of transcripts on the negative strand ("-"), using the reverse_complement function in Biopython, the number of stop codons per sequence decreases. That is generally a good sign, since it suggests the reverse-complemented sequences are being read in the transcript's coding orientation.
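
For anyone who wants to reproduce that check, here is a minimal pure-Python sketch of the idea. It assumes plain A/C/G/T sequences and counts stop codons in a single reading frame. The helper names (`reverse_complement`, `count_stop_codons`, `orient_to_strand`) and the toy sequence are made up for illustration and are not part of GVL. For unambiguous DNA, the local `reverse_complement` should give the same result as Biopython's function of the same name.

```python
# Complement table for unambiguous DNA bases (upper and lower case).
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")
STOP_CODONS = {"TAA", "TAG", "TGA"}


def reverse_complement(seq: str) -> str:
    """Complement every base, then reverse the sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def count_stop_codons(seq: str, frame: int = 0) -> int:
    """Count stop codons in one reading frame (0, 1, or 2)."""
    seq = seq.upper()
    return sum(seq[i:i + 3] in STOP_CODONS for i in range(frame, len(seq) - 2, 3))


def orient_to_strand(seq: str, strand: str) -> str:
    """Return the sequence in transcript orientation for the given strand."""
    return reverse_complement(seq) if strand == "-" else seq


# Toy negative-strand transcript as it appears on the reference (+) strand.
reference_plus = "TTAGGCTGATAGTAACAT"
oriented = orient_to_strand(reference_plus, "-")

print(count_stop_codons(reference_plus))  # stops when read in the wrong orientation
print(count_stop_codons(oriented))        # stops after reverse complementing
```

In this toy case the in-frame stop count drops from 3 to 1 (only the terminal stop remains) once the negative-strand sequence is reverse complemented, which is the same pattern as the observation above.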

@Zhihan-Leo-Liu @Al-Murphy please direct any more specific questions you have to @d-laub here or via Slack.

Labels: question (Further information is requested)
