
[Enhancement]: Performance benchmarking #729

Open
@tomvothecoder

Description


Is your feature request related to a problem?

We should run performance benchmarks with xCDAT across different workflows and data types/sizes.

Examples include (WIP):

  • High resolution data
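
A minimal timing harness for such benchmarks might look like the sketch below. The array is a synthetic stand-in for high-resolution data and the sizes are placeholders, not from this issue; a real benchmark would use xCDAT/Xarray operations on actual datasets.

```python
import time
import numpy as np

def benchmark(fn, *args, repeats=3):
    """Run fn repeatedly and return (best_seconds, result)."""
    best = float("inf")
    result = None
    for _ in range(repeats):
        start = time.perf_counter()
        result = fn(*args)
        best = min(best, time.perf_counter() - start)
    return best, result

# Synthetic stand-in for gridded data with shape (time, lat, lon).
data = np.random.rand(120, 180, 360)  # ~60 MB of float64

# Time a simple reduction over the time axis as a placeholder workload.
elapsed, time_mean = benchmark(lambda d: d.mean(axis=0), data)
print(f"mean over time axis: {elapsed:.4f}s, result shape {time_mean.shape}")
```

Taking the best of several repeats reduces noise from caching and scheduler jitter, which matters when comparing chunking strategies.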

Describe the solution you'd like

No response

Describe alternatives you've considered

No response

Additional context

Performance is a two-part problem:

  1. Users need to chunk their datasets appropriately for optimal performance, based on the shape of the data.
    • This is often the bottleneck for most users; working out the optimal chunk sizes is not trivial.
    • Setting up a cluster is better for performance monitoring, but it is another barrier to entry.
    • Is there a general way to chunk across different datasets of varying sizes/dimensions?
  2. Is xCDAT optimized to work on these chunks? We use Xarray APIs for core operations (e.g., grouping), which operate in parallel on Dask Arrays.
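
For point 1, one rough general-purpose heuristic is to target a fixed number of bytes per chunk (Dask's best-practices docs suggest chunks on the order of 100 MB) and split only along the leading (e.g., time) dimension. The sketch below is an illustration under that assumption, not an xCDAT API:

```python
import math

def suggest_chunks(shape, itemsize=8, target_bytes=128 * 2**20):
    """Suggest chunk sizes by splitting the leading dimension so each
    chunk is at most ~target_bytes.

    shape: full array shape, e.g. (time, lat, lon)
    itemsize: bytes per element (8 for float64)
    target_bytes: desired chunk size in bytes (~128 MB here)
    """
    # Bytes in one slice along the leading dimension.
    trailing = math.prod(shape[1:]) * itemsize
    # How many leading-dim slices fit in the target, clamped to [1, shape[0]].
    lead = max(1, min(shape[0], target_bytes // trailing))
    return (lead,) + shape[1:]

# Example: 10 years of daily 0.25-degree data, chunked only along time.
print(suggest_chunks((3650, 721, 1440)))  # → (16, 721, 1440)
```

A heuristic like this ignores access patterns (e.g., climatology groupbys that regroup along time), so it is only a starting point; benchmarking across representative workflows is what this issue is about.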
