
torch nested tensor for handling of ragged data #36

@clessig

Description

Is your feature request related to a problem? Please describe.

The WeatherGenerator works with input streams whose data comes in various shapes and forms. This is currently handled with lists of lists of ... of tensors, which is complex and error-prone, e.g. when moving data to the GPU. torch nested tensors address this case (the feature originates from LLMs, where sentences have varying lengths, similar to varlen attention). Last year, this feature was riddled with bugs and ultimately unusable. However, the implementation has likely matured, and it would be worth revisiting the usability of nested tensors.
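For illustration, a minimal sketch of what the nested-tensor approach could look like. The stream shapes (three tensors sharing a feature dimension of 4) are hypothetical and not taken from the WeatherGenerator code; the nested tensor API is still marked as a prototype in PyTorch, so behaviour may differ between versions.

```python
import torch

# Hypothetical ragged input: three "streams" with different numbers of
# rows but a shared feature dimension of 4 (shapes are assumptions).
streams = [torch.randn(n, 4) for n in (3, 5, 2)]

# Pack the list into a single nested tensor.
nt = torch.nested.nested_tensor(streams)

# A single call moves all constituents, instead of recursing over lists.
device = "cuda" if torch.cuda.is_available() else "cpu"
nt = nt.to(device)

# Recover the individual tensors, or pad to a dense batch where needed.
parts = nt.unbind()
padded = torch.nested.to_padded_tensor(nt, 0.0)
```

Whether the operations needed by the model (attention, indexing, reductions) are supported on nested tensors would still have to be checked case by case.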

CC @frazane

Describe the solution you'd like

No response

Describe alternatives you've considered

The alternative is lists of lists of ... of tensors, as currently implemented. These can be wrapped in classes, but the additional code introduces maintenance overhead.
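To make the overhead concrete, here is a rough sketch of such a wrapper. `RaggedBatch` and `map_nested` are hypothetical names invented for this example, not existing WeatherGenerator classes; every operation beyond `.to()` would need a similar hand-written method.

```python
import torch


def map_nested(obj, fn):
    """Apply fn to every tensor in an arbitrarily nested list."""
    if isinstance(obj, torch.Tensor):
        return fn(obj)
    return [map_nested(item, fn) for item in obj]


class RaggedBatch:
    """Hypothetical wrapper around lists of lists of ... of tensors."""

    def __init__(self, data):
        self.data = data

    def to(self, device):
        # Recurse through the nesting and move each tensor individually.
        return RaggedBatch(map_nested(self.data, lambda t: t.to(device)))


batch = RaggedBatch([[torch.zeros(3, 4), torch.zeros(5, 4)], [torch.zeros(2, 4)]])
moved = batch.to("cpu")
```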

Additional context

No response

Organisation

ECMWF


Metadata

Assignees

No one assigned

    Labels

    enhancement (New feature or request)

    Projects

    • Status

      Todo

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests
