Slow initialization of dataset.interp #4739
Comments
We don't support lazy index variables yet (#1603) so you can't interpolate to a dask variable.
This may be true. I think we could convert Lines 641 to 643 in bf0fe2c
OTOH I found some easier optimizations. See #4740
Now implemented. Runtime has dropped from 5.3s to 2.3s (!)
What happened:
When interpolating a dataset with >2000 dask variables, a lot of time is spent in da.unifying_chunks, because da.unifying_chunks forces all variables and coordinates to dask arrays. xarray, on the other hand, forces coordinates to pd.Index even if the coordinates were dask arrays when the dataset was first created.
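The coordinate behaviour described above can be illustrated with a small sketch. This is not the reporter's original example. The dataset, the variable name "a" and the coordinate name "x" are made up for illustration, and the exact behaviour may differ between xarray versions.

```python
import dask
import dask.array as da
import pandas as pd
import xarray as xr

# Hypothetical toy data: one dask-backed variable "a" along a
# dimension "x" whose coordinate is also passed in as a dask array.
x = da.arange(10, chunks=5)
ds = xr.Dataset(
    {"a": ("x", da.ones(10, chunks=5))},
    coords={"x": ("x", x)},
)

# The data variable keeps its lazy dask array.
var_is_lazy = dask.is_dask_collection(ds["a"].data)

# The dimension coordinate becomes an index variable backed by a
# pandas index, so its data is no longer a dask collection.
coord_is_lazy = dask.is_dask_collection(ds["x"].data)
coord_index = ds.indexes["x"]

print("variable lazy:", var_is_lazy)
print("coordinate lazy:", coord_is_lazy)
print("coordinate index is a pd.Index:", isinstance(coord_index, pd.Index))
```

Because the coordinate is held as an in-memory index, any step that needs it as a dask array has to convert it again, which seems to be the repeated cost this report describes.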
What you expected to happen:
If the coords of the dataset were initialized as dask arrays, they should stay lazy.
Minimal Complete Verifiable Example:
Anything else we need to know?:
Some thoughts:
missing.interp_func. But some time could be saved if we could convert them to dask arrays in xr.Dataset.interp before the variable loop starts.
Environment:
Output of xr.show_versions()
INSTALLED VERSIONS
commit: None
python: 3.8.5 (default, Sep 3 2020, 21:29:08) [MSC v.1916 64 bit (AMD64)]
python-bits: 64
OS: Windows
OS-release: 10
libhdf5: 1.10.4
libnetcdf: None
xarray: 0.16.2
pandas: 1.1.5
numpy: 1.17.5
scipy: 1.4.1
netCDF4: None
pydap: None
h5netcdf: None
h5py: 2.10.0
Nio: None
zarr: None
cftime: None
nc_time_axis: None
PseudoNetCDF: None
rasterio: None
cfgrib: None
iris: None
bottleneck: 1.3.2
dask: 2020.12.0
distributed: 2020.12.0
matplotlib: 3.3.2
cartopy: None
seaborn: 0.11.1
numbagg: None
pint: None
setuptools: 51.0.0.post20201207
pip: 20.3.3
conda: 4.9.2
pytest: 6.2.1
IPython: 7.19.0
sphinx: 3.4.0