Open
Description
In GitLab by @Huite on Jun 16, 2023, 16:59
Talking to Janneke just now, a script was taking an unnecessarily large amount of memory to run. This would be a nice case to demonstrate in a user guide piece of documentation.
In particular, the story here is that its best to move to the smallest amount of data, as soon as possible:
- The upper active cells were determined on all times rather than on the first or last timestep
- This can be used to reduce the number of layers before computing a mean (or other reduction) in time
We can show the different approach and illustrate (e.g. through memory usage, runtime). E.g. compute the mean first, then taking the upper active layer, show what happen if you chunk over time, or not; what bad chunking decisions do, etc.
This unidata post is also a nice resource:
https://www.unidata.ucar.edu/blogs/developer/entry/chunking_data_why_it_matters