support different lengths for each type of sequence

A common setup for sequence-to-function modeling is to use a larger window of genetic context as input to predict a smaller window of functional data. To reduce storage requirements and improve performance, GVL could write and/or decompress only the data that is needed, rather than using a single length for all sequence types. I'm not sure what the best API would look like here. Users could conceivably even want sequences that are not centered on each other, in which case each type of sequence would need its own set of regions, paired one-to-one with the regions of all the others. This could look like passing a BED file for each reader, where every BED file has the same number of regions. Downstream, this would require expanding the definition of "output sequence length" described in `gvl.Dataset`.
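To make the pairing idea concrete, here is a rough sketch in plain Python. It does not use or reflect any existing GVL API: `Region`, `parse_bed`, `pair_regions`, and `centered_window` are all hypothetical names made up for illustration, as are the reader names `"seq"` and `"track"`. It assumes each reader gets its own BED text, that row `i` of every BED describes the same example, and that BED coordinates are 0-based and half-open.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Region:
    """Hypothetical container for one BED interval (0-based, half-open)."""
    chrom: str
    start: int
    end: int

    @property
    def length(self) -> int:
        return self.end - self.start


def parse_bed(text: str) -> List[Region]:
    """Parse the first three columns of BED-formatted text."""
    regions = []
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and the usual BED header/comment lines.
        if not line or line.startswith(("#", "track", "browser")):
            continue
        chrom, start, end = line.split()[:3]
        regions.append(Region(chrom, int(start), int(end)))
    return regions


def pair_regions(per_reader: Dict[str, List[Region]]) -> List[Dict[str, Region]]:
    """Pair row i of every reader's regions; all readers need the same count."""
    counts = {name: len(regions) for name, regions in per_reader.items()}
    if len(set(counts.values())) > 1:
        raise ValueError(f"region counts differ across readers: {counts}")
    names = list(per_reader)
    return [dict(zip(names, group)) for group in zip(*per_reader.values())]


def centered_window(region: Region, length: int) -> Region:
    """Window of `length` sharing the region's center (the simple, centered case).

    No clipping is done, so the start can be negative near a contig edge.
    """
    center = (region.start + region.end) // 2
    start = center - length // 2
    return Region(region.chrom, start, start + length)


# Two readers with different lengths per example: 1000 bp of input
# sequence and 200 bp of functional data.
seq_bed = "chr1\t0\t1000\nchr1\t2000\t3000\n"
track_bed = "chr1\t400\t600\nchr1\t2400\t2600\n"

pairs = pair_regions({"seq": parse_bed(seq_bed), "track": parse_bed(track_bed)})
lengths = {name: region.length for name, region in pairs[0].items()}
print(lengths)  # {'seq': 1000, 'track': 200}
```

In the centered case, the per-reader BED files could be derived from one set of regions with something like `centered_window`, so separate files would only be strictly necessary for the non-centered case. Under this scheme "output sequence length" would become a per-reader value rather than a single number.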

support different lengths for each type of sequence #19
