Rolling mean with bool performs sum #8864

Open
5 tasks done
chandley564 opened this issue Mar 21, 2024 · 2 comments

Comments

@chandley564

What happened?

Taking a rolling mean of a DataArray with dtype=bool doesn't behave as I would expect. Rather than converting to int and taking the rolling mean, the result is equivalent to converting to int and then taking a rolling sum.

What did you expect to happen?

No response

Minimal Complete Verifiable Example

import numpy as np
from xarray import DataArray

int_raster = DataArray(
    data=[0, 1, 1, 0, 1, 0],
    dims=("x"),
)

expected_rolling_mean = DataArray(
    data=[np.nan, 2 / 3, 2 / 3, 2 / 3, 1 / 3, np.nan],
    dims=("x"),
)

bool_raster = int_raster.astype(bool)

int_rolling_mean = int_raster.rolling(x=3, center=True).mean()
bool_rolling_mean = bool_raster.rolling(x=3, center=True).mean()
rolling_sum = int_raster.rolling(x=3, center=True).sum()

print("Expected: \n", expected_rolling_mean, "\n")
print("With int dtype: \n", int_rolling_mean, "\n")
print("With bool dtype: \n", bool_rolling_mean, "\n")
print("Rolling sum: \n", rolling_sum)

MVCE confirmation

  • Minimal example — the example is as focused as reasonably possible to demonstrate the underlying issue in xarray.
  • Complete example — the example is self-contained, including all data and the text of any traceback.
  • Verifiable example — the example copy & pastes into an IPython prompt or Binder notebook, returning the result.
  • New issue — a search of GitHub Issues suggests this is not a duplicate.
  • Recent environment — the issue occurs with the latest version of xarray and its dependencies.

Relevant log output

Expected: 
 <xarray.DataArray (x: 6)> Size: 48B
array([       nan, 0.66666667, 0.66666667, 0.66666667, 0.33333333,
              nan])
Dimensions without coordinates: x 

With int dtype: 
 <xarray.DataArray (x: 6)> Size: 48B
array([       nan, 0.66666667, 0.66666667, 0.66666667, 0.33333333,
              nan])
Dimensions without coordinates: x 

With bool dtype: 
 <xarray.DataArray (x: 6)> Size: 48B
array([nan,  2.,  2.,  2.,  1., nan])
Dimensions without coordinates: x 

Rolling sum: 
 <xarray.DataArray (x: 6)> Size: 48B
array([nan,  2.,  2.,  2.,  1., nan])
Dimensions without coordinates: x
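
To make the difference between the two outputs concrete, here is a minimal pure-Python sketch of a centered rolling reduction over the same six values. The helper `centered_rolling` is hypothetical and is not part of xarray; it assumes an odd window and returns NaN wherever a full window is not available, which mirrors the NaN edges in the output above.

```python
import math


def centered_rolling(values, window, reduce):
    """Apply `reduce` to each full centered window (odd `window` assumed).

    Positions without a full window get NaN. Hypothetical helper for
    illustration only; this is not how xarray implements rolling.
    """
    half = window // 2
    out = []
    for i in range(len(values)):
        lo, hi = i - half, i + half + 1
        if lo < 0 or hi > len(values):
            out.append(math.nan)
        else:
            out.append(reduce(values[lo:hi]))
    return out


# Same data as the issue's bool_raster; Python treats True/False as 1/0 in sum().
bools = [False, True, True, False, True, False]

rolling_sum = centered_rolling(bools, 3, lambda w: float(sum(w)))
rolling_mean = centered_rolling(bools, 3, lambda w: sum(w) / len(w))

print("sum: ", rolling_sum)
print("mean:", rolling_mean)
```

The sum values match what the issue reports under "With bool dtype", while the mean values match the "Expected" output.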

Anything else we need to know?

No response

Environment

INSTALLED VERSIONS

commit: None
python: 3.11.5 (tags/v3.11.5:cce6ba9, Aug 24 2023, 14:38:34) [MSC v.1936 64 bit (AMD64)]
python-bits: 64
OS: Windows
OS-release: 10
machine: AMD64
processor: Intel64 Family 6 Model 154 Stepping 3, GenuineIntel
byteorder: little
LC_ALL: None
LANG: None
LOCALE: ('English_New Zealand', '1252')
libhdf5: None
libnetcdf: None

xarray: 2024.2.0
pandas: 2.1.4
numpy: 1.26.2
scipy: 1.12.0
netCDF4: None
pydap: None
h5netcdf: None
h5py: None
Nio: None
zarr: None
cftime: None
nc_time_axis: None
iris: None
bottleneck: None
dask: 2024.3.1
distributed: None
matplotlib: 3.8.2
cartopy: None
seaborn: None
numbagg: None
fsspec: 2024.3.0
cupy: None
pint: None
sparse: None
flox: None
numpy_groupies: None
setuptools: 65.5.0
pip: 23.2.1
conda: None
pytest: 7.4.3
mypy: None
IPython: 8.18.1
sphinx: 6.2.1

@chandley564 added the bug and needs triage labels on Mar 21, 2024

welcome bot commented Mar 21, 2024

Thanks for opening your first issue here at xarray! Be sure to follow the issue template!
If you have an idea for a solution, we would really welcome a Pull Request with proposed changes.
See the Contributing Guide for more.
It may take us a while to respond here, but we really value your contribution. Contributors like you help make xarray better.
Thank you!

@max-sixty
Collaborator

FWIW this seems to be correct under numbagg or bottleneck; so it's an issue with the naive xarray routines. We could just raise an error there.

Expected:
 <xarray.DataArray (x: 6)> Size: 48B
array([       nan, 0.66666667, 0.66666667, 0.66666667, 0.33333333,
              nan])
Dimensions without coordinates: x

With int dtype:
 <xarray.DataArray (x: 6)> Size: 48B
array([       nan, 0.66666667, 0.66666667, 0.66666667, 0.33333333,
              nan])
Dimensions without coordinates: x

With bool dtype:
 <xarray.DataArray (x: 6)> Size: 48B
array([       nan, 0.66666667, 0.66666667, 0.66666667, 0.33333333,
              nan])
Dimensions without coordinates: x

Rolling sum:
 <xarray.DataArray (x: 6)> Size: 48B
array([nan,  2.,  2.,  2.,  1., nan])
Dimensions without coordinates: x
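
For anyone hitting this without numbagg or bottleneck installed, one possible stopgap is to cast the boolean array to an integer dtype before rolling. This is a sketch rather than a fix: it relies only on the observation in the original report that the int-dtype path returns the expected mean, and it reuses the variable names from the example above.

```python
import numpy as np
from xarray import DataArray

bool_raster = DataArray(
    data=[False, True, True, False, True, False],
    dims="x",
)

# Cast to int first; the report shows the int dtype yields the expected rolling mean.
workaround_mean = bool_raster.astype(int).rolling(x=3, center=True).mean()

expected = np.array([np.nan, 2 / 3, 2 / 3, 2 / 3, 1 / 3, np.nan])
print(workaround_mean.values)
print(np.allclose(workaround_mean.values, expected, equal_nan=True))
```

The cast makes a copy of the data, so on very large arrays it may be worth checking memory use before adopting this.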

@max-sixty removed the needs triage label on Oct 18, 2024
3 participants