BUG: interpolating Dask Array with NumPy Arrays completely blows up the chunk size for multiple dimensions #9907

Closed as not planned
@phofl

What happened?

Interpolating rechunks to -1 along the interpolation axis; doing this for many dimensions at once blows up the chunk sizes :(

This seems to happen when the arrays are passed into blockwise. We might want to rechunk the coordinates to the proper chunk size, but I'm not sure.

What did you expect to happen?

Chunk sizes should stay consistent, presumably by rechunking the other dimensions appropriately.
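One way to explore this idea is to interpolate one dimension at a time and rechunk in between, so that only the axis currently being interpolated is a single chunk. The sketch below is an illustration on a small stand-in array, not a confirmed fix: the array sizes, the target coordinates, the intermediate chunk size of 50, and the helper `max_chunk_nbytes` are all assumptions introduced here, and whether the chunk sizes actually stay bounded may depend on the xarray and Dask versions in use. It relies on linear interpolation on a rectilinear grid being separable across dimensions.

```python
import itertools
import math

import dask.array as da
import numpy as np
import xarray as xr


def max_chunk_nbytes(arr):
    """Size in bytes of the largest chunk of a Dask-backed DataArray (hypothetical helper)."""
    largest = max(math.prod(block) for block in itertools.product(*arr.chunks))
    return largest * arr.dtype.itemsize


# Small stand-in for the arrays in this report (sizes are illustrative only).
small = xr.DataArray(
    da.random.random((1, 400, 300), chunks=(1, 50, -1)),
    dims=["band", "y", "x"],
    coords={"x": np.linspace(0.0, 1.0, 300), "y": np.linspace(0.0, 1.0, 400)},
    name="bla",
)
new_x = np.linspace(0.1, 0.9, 200)
new_y = np.linspace(0.1, 0.9, 250)

# Step 1: interpolate along x only; y keeps its existing chunking.
step1 = small.interp(x=new_x, method="linear")

# Step 2: make y a single chunk and split x instead, then interpolate along y.
step2 = step1.chunk({"y": -1, "x": 50}).interp(y=new_y, method="linear")

print("largest input chunk (bytes): ", max_chunk_nbytes(small))
print("largest result chunk (bytes):", max_chunk_nbytes(step2))
```

Comparing the two printed values against a single call `small.interp(x=new_x, y=new_y)` shows whether splitting the interpolation changes the resulting chunk sizes in a given environment.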

@dcherian would your current work in this area impact this?

Minimal Complete Verifiable Example

import dask.array as da
import numpy as np
import xarray as xr

arr = xr.DataArray(
    da.random.random((1, 75902, 45910), chunks=(1, "auto", -1)),
    dims=["band", "y", "x"],
    coords={"x": np.linspace(-73.58, -62.11, 45910), "y": np.linspace(-36.08, -55.05, 75902)},
    name="bla",
)

arr2 = xr.DataArray(
    da.random.random((1, 75902, 45910), chunks=(1, "auto", -1)),
    dims=["band", "y", "x"],
    coords={"x": np.linspace(-73.58, -62.11, 45910), "y": np.linspace(-36.08, -55.05, 75902)},
    name="bla",
)

x = arr2.interp(
    x=arr.coords["x"],
    y=arr.coords["y"],
    method="linear",
)
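To put numbers on the blow-up, the chunk sizes for the shapes in the example above can be estimated with plain arithmetic. This is a back-of-the-envelope sketch, assuming float64 data and a 128 MiB target for `chunks="auto"` (believed to be Dask's default `array.chunk-size`; the exact chunking Dask picks may differ slightly). The helper `chunk_nbytes` is hypothetical and only exists for this illustration.

```python
import math

# Shape and dtype taken from the example above (float64 -> 8 bytes per element).
shape = (1, 75902, 45910)
itemsize = 8

# Assumption: 128 MiB target for chunks="auto"; a different Dask
# configuration changes the numbers below.
target_nbytes = 128 * 2**20


def chunk_nbytes(chunk_shape, itemsize=itemsize):
    """Bytes held by one chunk of the given shape."""
    return math.prod(chunk_shape) * itemsize


# chunks=(1, "auto", -1): x is a single chunk, so "auto" picks roughly as
# many y rows as fit into the target size.
rows_per_chunk = target_nbytes // (shape[2] * itemsize)
before = (1, rows_per_chunk, shape[2])

# If both interpolation axes (y and x) end up as a single chunk:
after = (1, shape[1], shape[2])

print(f"before: {before} -> {chunk_nbytes(before) / 2**20:.0f} MiB")
print(f"after:  {after} -> {chunk_nbytes(after) / 2**30:.1f} GiB")
print(f"growth: ~{chunk_nbytes(after) / chunk_nbytes(before):.0f}x")
```

Under these assumptions a single chunk grows from about 128 MiB to about 26 GiB, which is the whole array, because `band` has length 1 and both remaining dimensions are interpolated.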

MVCE confirmation

  • Minimal example — the example is as focused as reasonably possible to demonstrate the underlying issue in xarray.
  • Complete example — the example is self-contained, including all data and the text of any traceback.
  • Verifiable example — the example copy & pastes into an IPython prompt or Binder notebook, returning the result.
  • New issue — a search of GitHub Issues suggests this is not a duplicate.
  • Recent environment — the issue occurs with the latest version of xarray and its dependencies.

Relevant log output

No response

Anything else we need to know?

No response

Environment

INSTALLED VERSIONS

commit: None
python: 3.11.10 | packaged by conda-forge | (main, Oct 16 2024, 01:26:25) [Clang 17.0.6 ]
python-bits: 64
OS: Darwin
OS-release: 23.4.0
machine: arm64
processor: arm
byteorder: little
LC_ALL: None
LANG: None
LOCALE: (None, 'UTF-8')
libhdf5: 1.14.3
libnetcdf: None

xarray: 2024.10.1.dev51+g864b35a1
pandas: 2.2.3
numpy: 2.1.3
scipy: 1.14.1
netCDF4: None
pydap: None
h5netcdf: 1.4.1
h5py: 3.12.1
zarr: 3.0.0b3.dev6+g7c2ebe2
cftime: None
nc_time_axis: None
iris: None
bottleneck: 1.4.2
dask: 2024.12.1+0.g2c0ac83fc.dirty
distributed: 2024.12.1
matplotlib: None
cartopy: None
seaborn: None
numbagg: None
fsspec: 2024.10.0
cupy: None
pint: None
sparse: None
flox: None
numpy_groupies: None
setuptools: 75.3.0
pip: 24.3.1
conda: None
pytest: 8.3.3
mypy: None
IPython: 8.29.0
sphinx: None
