Error in average_over_dims: NotImplementedError: Computing the mean of an array containing cftime.datetime objects is not yet implemented on dask arrays. #185

Open
agstephens opened this issue Sep 6, 2021 · 4 comments
@agstephens
Collaborator

  • clisops version: 0.6.5
  • Python version: 3.7+
  • Operating System: All

Description

Several tests fail because calling ds.mean() on an array containing cftime.datetime objects is not yet implemented for dask arrays.

Googling suggests that one solution is to force the data to be loaded, so it is no longer a delayed dask array, e.g.:

ds = ds.load()

What I Did

These tests demonstrate the error:

============================================================== short test summary info ==============================================================
FAILED tests/ops/test_average.py::test_average_lat_xarray - NotImplementedError: Computing the mean of an array containing cftime.datetime objects...
FAILED tests/ops/test_average.py::test_average_lon_xarray - NotImplementedError: Computing the mean of an array containing cftime.datetime objects...
FAILED tests/ops/test_average.py::test_average_lat_nc - NotImplementedError: Computing the mean of an array containing cftime.datetime objects is ...
FAILED tests/ops/test_average.py::test_average_lon_nc - NotImplementedError: Computing the mean of an array containing cftime.datetime objects is ...
FAILED tests/ops/test_xarray_mean.py::test_xarray_da_mean_keep_attrs_true - NotImplementedError: Computing the mean of an array containing cftime....
======================================== 5 failed, 231 passed, 20 skipped, 330 warnings in 287.45s (0:04:47) ========================================
@agstephens agstephens self-assigned this Sep 6, 2021
@agstephens
Collaborator Author

Since the average_over_dims operation is not currently used in our production systems (e.g. the rook WPS), the quick fix is to load() the xr.Dataset before processing. This does not change the functionality. The only risk is that it will try to load a dataset of any size from disk, which could cause memory issues. That is not a problem at present because we don't use the functionality.
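
For illustration only, here is a minimal sketch of how the load() quick fix could be guarded against the memory risk mentioned above. The helper name `load_within_budget` and the `max_bytes` default are hypothetical and are not part of clisops; the sketch relies only on `Dataset.nbytes` and `Dataset.load()` from xarray.

```python
import numpy as np
import xarray as xr


def load_within_budget(ds, max_bytes=2 * 1024**3):
    """Load ds into memory only if its reported size fits the budget.

    Hypothetical helper: the name and the 2 GiB default are assumptions.
    """
    # nbytes reports the in-memory size of the variables, also for lazy arrays.
    if ds.nbytes > max_bytes:
        raise ValueError(
            f"Dataset needs about {ds.nbytes} bytes, over the {max_bytes}-byte budget"
        )
    return ds.load()


# Small in-memory stand-in for a real dataset.
demo = xr.Dataset({"tas": (("time", "lat"), np.arange(6.0).reshape(2, 3))})
lat_mean = load_within_budget(demo).mean(dim="lat", keep_attrs=True)
```

A guard like this fails early instead of exhausting memory, at the cost of rejecting large datasets outright.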

agstephens added a commit that referenced this issue Sep 6, 2021
- Error was: NotImplementedError: Computing the mean of an array containing
  cftime.datetime objects is not yet implemented on dask arrays
- Solution was to force loading of the Dataset, with `ds.load().mean()`
- See: #185
@ellesmith88 ellesmith88 mentioned this issue Sep 28, 2021
5 tasks
@agstephens
Collaborator Author

We can undo this in future when the xarray/pandas versions are brought into line.

@aulemahal
Collaborator

I opened an issue upstream: pydata/xarray#5897. I believe the current bug has nothing to do with pandas or a version mismatch; rather, it comes from a PR that introduced bugs on the xarray side.

All in all, this only happens in the test suite because of the time_bnds variable that is present on some datasets. I would suggest either removing the variable, waiting for xarray, or introducing a workaround directly in average_over_dims to skip this faulty variable (which I can send a PR for). Anything but loading data unexpectedly.
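
As a rough sketch of the "skip the faulty variable" option, and not the actual clisops code or the proposed PR: the hypothetical helper `mean_skipping_vars` below drops the named variables before averaging, so nothing has to be loaded, and puts back those that do not span an averaged dimension. The function name, the `skip` parameter, and the demo data (plain floats standing in for cftime bounds) are all assumptions.

```python
import numpy as np
import xarray as xr


def mean_skipping_vars(ds, dims, skip=("time_bnds",)):
    """Average ds over dims while leaving the variables named in skip out.

    Hypothetical helper: name and signature are assumptions.
    """
    dims = [dims] if isinstance(dims, str) else list(dims)
    skipped = [name for name in skip if name in ds.data_vars]
    # Drop the problematic variables so mean() never touches them.
    averaged = ds.drop_vars(skipped).mean(dim=dims, keep_attrs=True)
    # Restore skipped variables that do not depend on an averaged dimension.
    for name in skipped:
        if not set(ds[name].dims) & set(dims):
            averaged[name] = ds[name]
    return averaged


# Demo data: floats stand in for the cftime bounds of a real dataset.
demo = xr.Dataset(
    {
        "tas": (("time", "lat"), np.arange(6.0).reshape(2, 3)),
        "time_bnds": (("time", "bnds"), np.array([[0.0, 1.0], [1.0, 2.0]])),
    }
)
over_lat = mean_skipping_vars(demo, "lat")    # time_bnds is kept unchanged
over_time = mean_skipping_vars(demo, "time")  # time_bnds is left out
```

Skipping by name keeps the data lazy; whether the bounds should instead be recomputed when averaging over time is left open here.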

@ellesmith88
Collaborator

Thanks @aulemahal, I've just read this through. We aren't using average_over_dims and we think it will be removed or refactored in the future, so a workaround can be introduced or the tests can be skipped.

@aulemahal aulemahal mentioned this issue Oct 26, 2021
5 tasks