DataArray.resample() appears to ignore the skipna parameter when taking an annual mean #7545

Open
4 tasks done
arfriedman opened this issue Feb 21, 2023 · 3 comments

Comments

@arfriedman
Contributor

arfriedman commented Feb 21, 2023

What happened?

The skipna parameter appears to have no effect when resampling a monthly DataArray to annual frequency.
I'm not sure whether this is related to #6772.

What did you expect to happen?

I expected da_yrA to be NaN, since it is computed with skipna=False and the input contains NaN values.

Minimal Complete Verifiable Example

import numpy as np
import pandas as pd
import xarray as xr

data = np.arange(12)
month_index = pd.date_range("2022-01","2022-12-31", freq="M")
df = pd.DataFrame(index=month_index, data=data)
df.index.name = "time"
df.columns.name = "value"
df.loc["2022-01":"2022-02"] = np.nan
da = xr.DataArray(df)

# these unexpectedly produce the same result
da_yrA = (da).resample(time="A-Dec", skipna=False).mean()
da_yrB = (da).resample(time="A-Dec", skipna=True).mean()

print(da_yrA)
print(da_yrB)

MVCE confirmation

  • Minimal example — the example is as focused as reasonably possible to demonstrate the underlying issue in xarray.
  • Complete example — the example is self-contained, including all data and the text of any traceback.
  • Verifiable example — the example copy & pastes into an IPython prompt or Binder notebook, returning the result.
  • New issue — a search of GitHub Issues suggests this is not a duplicate.

Relevant log output

No response

Anything else we need to know?

No response

Environment

scipy: 1.10.0
netCDF4: 1.6.2
pydap: None
h5netcdf: None
h5py: None
Nio: None
zarr: None
cftime: 1.6.2
nc_time_axis: 1.4.1
PseudoNetCDF: None
rasterio: None
cfgrib: None
iris: None
bottleneck: None
dask: 2023.2.0
distributed: 2023.2.0
matplotlib: 3.7.0
cartopy: 0.21.1
seaborn: None
numbagg: None
fsspec: 2023.1.0
cupy: None
pint: None
sparse: 0.13.0
flox: None
numpy_groupies: None
setuptools: 67.3.2
pip: 23.0.1
conda: None
pytest: None
mypy: None
IPython: 8.10.0
sphinx: None
@arfriedman added the "bug" and "needs triage" labels on Feb 21, 2023
@keewis
Collaborator

keewis commented Feb 21, 2023

It seems that skipna has been ignored by resample since #2541 (more than four years ago). While that might have been unintentional at the time, I don't think resample should have that option, so maybe we should just raise a warning like we do for keep_attrs and eventually remove both.

Does this do what you expect?

In [2]: da.resample(time="A-Dec").mean(skipna=True)
Out[2]: 
<xarray.DataArray (time: 1, value: 1)>
array([[6.5]])
Coordinates:
  * value    (value) int64 0
  * time     (time) datetime64[ns] 2022-12-31

In [3]: da.resample(time="A-Dec").mean(skipna=False)
Out[3]: 
<xarray.DataArray (time: 1, value: 1)>
array([[nan]])
Coordinates:
  * value    (value) int64 0
  * time     (time) datetime64[ns] 2022-12-31

Edit: I don't think this is related to #6772, but I'll also have a look at what's wrong there
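
As a sanity check on the two outputs above, the arithmetic can be reproduced without xarray. The sketch below is plain Python; the helper `mean` and the variable names are made up for illustration and are not part of xarray. It uses the same values as the example in this issue: 0 to 11 with the first two months set to NaN.

```python
import math


def mean(values, skipna=True):
    """Arithmetic mean; NaNs are dropped when skipna is True, otherwise they propagate."""
    if skipna:
        values = [v for v in values if not math.isnan(v)]
    if not values:
        return math.nan
    return sum(values) / len(values)


# Same data as the example in this issue: 0..11 with January and February set to NaN.
monthly = [math.nan, math.nan] + [float(v) for v in range(2, 12)]

annual_skip = mean(monthly, skipna=True)   # mean of 2..11
annual_keep = mean(monthly, skipna=False)  # a single NaN makes the whole sum NaN

print(annual_skip)  # 6.5
print(annual_keep)  # nan
```

This matches the behaviour seen when skipna is passed to .mean() rather than to .resample(): 6.5 when the NaN months are skipped, NaN when they are kept.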

@keewis removed the "bug" and "needs triage" labels on Feb 21, 2023
@arfriedman
Contributor Author

Thanks @keewis -- your suggestion does what I expect.

@dcherian
Contributor

I think we should raise and let the user know to provide skipna, keep_attrs etc. in the actual reduction method, not the groupby.
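
A rough sketch of what such a check could look like, purely for illustration: the function name `check_resample_kwargs`, the constant `REDUCTION_ONLY_KWARGS`, and the choice of TypeError are all assumptions made here, not xarray's actual implementation or API.

```python
# Hypothetical names throughout; this is not xarray's implementation.
REDUCTION_ONLY_KWARGS = ("skipna", "keep_attrs")


def check_resample_kwargs(**kwargs):
    """Raise if options that belong to the reduction are passed to a resample-style call."""
    misplaced = [name for name in REDUCTION_ONLY_KWARGS if name in kwargs]
    if misplaced:
        names = ", ".join(misplaced)
        raise TypeError(
            f"{names} must be passed to the reduction method, "
            "e.g. .resample(...).mean(skipna=False), not to resample()."
        )
    return kwargs


# Accepted: only the resampling indexer is given.
print(check_resample_kwargs(time="A-Dec"))

# Rejected: skipna belongs on the reduction, so the check raises.
try:
    check_resample_kwargs(time="A-Dec", skipna=False)
except TypeError as err:
    print(err)
```

Raising instead of silently ignoring the argument would have surfaced the problem in the original example immediately, rather than returning the same result for both calls.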
