Saving the groups generated from groupby operation #5674

Closed as not planned
@digital-idiot

Problem
Groupby is an expensive operation, so I would like to store my dataset on disk as the groups produced by a groupby operation. My use case is about the groups themselves: I want to take advantage of lazy loading, loading only selected groups into memory and processing them.

Preferred Solution

  • An additional parameter, used when writing the dataset to disk, that specifies the groups to cache
    Or
  • A separate function that writes the dataset to a file as a collection of groups.
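
To make the second option concrete, here is a minimal, library-agnostic sketch of the idea using only the Python standard library. It is not the library's API and not a proposed implementation: `write_groups`, `read_group`, the `station` key and the sample records are all hypothetical names chosen for illustration, and `shelve` is only a stand-in for a real grouped file format. The grouping is done once, every group goes into a single keyed store, and a later read loads only the requested group.

```python
import shelve
import tempfile
from collections import defaultdict
from pathlib import Path


def write_groups(records, key, path):
    """Group `records` by `key` once and store every group in one shelf."""
    groups = defaultdict(list)
    for record in records:
        # shelve keys must be strings, so the group label is converted.
        groups[str(record[key])].append(record)
    with shelve.open(str(path), flag="n") as shelf:
        for name, members in groups.items():
            shelf[name] = members
    return sorted(groups)


def read_group(path, name):
    """Load a single group by name; the other groups stay on disk."""
    with shelve.open(str(path), flag="r") as shelf:
        return shelf[name]


# Hypothetical sample data: three records falling into two groups.
records = [
    {"station": "A", "value": 1.0},
    {"station": "B", "value": 2.5},
    {"station": "A", "value": 3.0},
]
store = Path(tempfile.mkdtemp()) / "groups"
names = write_groups(records, "station", store)
group_a = read_group(store, "A")  # only group "A" is read into memory
```

Because all groups share one store, this avoids the one-file-per-group layout, and the expensive grouping step does not have to be repeated on later reads.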

Alternatives considered
Treating each group as a separate dataset and writing each one to its own file. This is not suitable when the number of groups is large and each group is relatively small.

Additional context
It would also be great if the groupby operation natively supported grouping by multiple coordinates.
