Optimization needed for regrid (including esmpy) and level selection #775

@valeriupredoi

This is an amalgamation of several issues that were opened at different times and initially attributed to other problems. Those issues have been closed and squashed into this one: #724, #774 and draft PR #773. In short:

  • regridding large 3D datasets (such as ocean data) takes a long time; this is not due to a poor implementation but is the limit of any serial run without parallelization. @bouweandela has implemented lazy regridding that will be available in iris (well done man!), but that will not make things much faster unless we parallelize the regridding, at least over vertical levels;
  • level selection currently consumes a lot of memory: it is not lazy and loads the full data object into memory several times, once for every level.
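
To make the memory point concrete, here is a minimal sketch of level selection that reads only the two source levels bracketing each target level instead of the full data object. It is plain NumPy, not the current preprocessor code or an iris API; `select_levels` and the `load_level` callable are hypothetical names, and it assumes a 1D, strictly increasing level coordinate with linear interpolation.

```python
import numpy as np


def select_levels(load_level, src_levels, target_levels):
    """Linearly interpolate to target_levels, reading two source slabs per target.

    load_level(k) is a hypothetical callable that returns the 2D field at
    source level index k; src_levels must be 1D and strictly increasing.
    """
    src_levels = np.asarray(src_levels, dtype=float)
    out = []
    for target in target_levels:
        if not src_levels[0] <= target <= src_levels[-1]:
            raise ValueError(f"target level {target} is outside the source range")
        # Index of the upper bracketing level, clipped so that upper - 1 is valid.
        upper = int(np.clip(np.searchsorted(src_levels, target), 1, len(src_levels) - 1))
        lower = upper - 1
        weight = (target - src_levels[lower]) / (src_levels[upper] - src_levels[lower])
        # Only two 2D slabs are held in memory for this target level.
        out.append((1.0 - weight) * load_level(lower) + weight * load_level(upper))
    return np.stack(out)


# Illustrative usage with an in-memory array standing in for data on disk.
data = np.arange(4, dtype=float)[:, None, None] * np.ones((4, 2, 3))
selected = select_levels(lambda k: data[k], [0.0, 10.0, 20.0, 30.0], [5.0, 30.0])
print(selected.shape)
```

In a real setup `load_level` would read one level from a lazy array or from file, so peak memory would scale with a couple of horizontal slabs rather than with the whole 3D field.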

I think we need to start thinking and talking about this. I have already tried to parallelize the esmpy 3D regridding, but it is difficult because of the amount of complex state that needs to be passed to the parallel process. Anyway, I will look at the level selection very soon; for now this is a placeholder to let @rswamina @omeuriot @ledm and others working on ocean data know that things will improve, just bear with us 🍺
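
As a starting point for that discussion, this is roughly what "parallelize over vertical levels" could look like. It is only a sketch: `regrid_level` is a hypothetical stand-in (a nearest-neighbour lookup through precomputed index arrays), not the esmpy regridder, and a thread pool is used here precisely to avoid passing state to another process. Whether esmpy objects can safely be shared across threads is an open assumption that this sketch does not settle.

```python
from concurrent.futures import ThreadPoolExecutor

import numpy as np


def regrid_level(field, row_idx, col_idx):
    """Stand-in 2D regridder: nearest-neighbour lookup via precomputed indices."""
    return field[np.ix_(row_idx, col_idx)]


def regrid_by_level(cube, row_idx, col_idx, max_workers=4):
    """Regrid each vertical level of a (level, y, x) array independently."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() preserves input order, so levels come back in sequence.
        levels = list(pool.map(lambda f: regrid_level(f, row_idx, col_idx), cube))
    return np.stack(levels)


# Illustrative usage: 2 levels on a 4x4 grid, subsampled to 2x2.
cube = np.arange(2 * 4 * 4).reshape(2, 4, 4)
result = regrid_by_level(cube, row_idx=[0, 2], col_idx=[1, 3])
print(result.shape)
```

The horizontal mapping (here the index arrays) is computed once and reused for every level, so each worker only needs a 2D slab plus the shared mapping. Any speed-up from threads depends on the per-level work releasing the GIL, which would need to be measured for the real regridder.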

Labels

enhancement (New feature or request)
preprocessor (Related to the preprocessor)