Bottleneck and dask objects ignore min_periods on rolling #4922
Comments
Maybe the padding is breaking down for length-1 arrays? I would look at Lines 461 to 463 in c9b9eec
|
@dcherian, to add to the complexity here, it's even weirder than originally reported. See my test cases below. This might alter how this bug is approached.

import xarray as xr

def _rolling(ds):
    return ds.rolling(time=6, center=False, min_periods=1).mean()

# Length 3 array to test that min_periods is called in, despite asking
# for 6 time-steps of smoothing
ds = xr.DataArray([1, 2, 3], dims='time')
ds['time'] = xr.cftime_range(start='2021-01-01', freq='D', periods=3)

1. With
|
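For reference, here is a minimal pure-Python sketch of the semantics the test case above is probing: a trailing (center=False) rolling mean where min_periods decides whether a partially filled window yields a value or NaN. The function name rolling_mean is hypothetical and this is not xarray's or bottleneck's implementation, only an illustration of the behavior the report expects.

```python
import math


def rolling_mean(values, window, min_periods=None):
    """Trailing rolling mean over a 1-D sequence (illustrative sketch only).

    A position yields NaN when its window holds fewer than `min_periods`
    non-NaN values. `min_periods` defaults to `window`.
    """
    if min_periods is None:
        min_periods = window
    out = []
    for i in range(len(values)):
        # The window ends at position i and is truncated at the array start.
        start = max(0, i - window + 1)
        valid = [v for v in values[start:i + 1] if not math.isnan(v)]
        if len(valid) >= max(min_periods, 1):
            out.append(sum(valid) / len(valid))
        else:
            out.append(math.nan)
    return out


# Window (6) longer than the data (3), but min_periods=1 allows partial windows.
short = rolling_mean([1.0, 2.0, 3.0], window=6, min_periods=1)
# Length-1 array: the single value is returned unchanged.
single = rolling_mean([5.0], window=6, min_periods=1)
# Without min_periods, no window is ever full, so every position is NaN.
strict = rolling_mean([1.0, 2.0, 3.0], window=6)
```

Under this reading, min_periods=1 makes a window longer than the axis well defined, which is why the length-1 case is expected to return the original value.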
I feel like this should not work, i.e. rolling window length (6) > size along axis (3). So the bottleneck error seems right. The chunk size error in the last example should go away with #4977 |
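One way to reconcile the two views, sketched in plain Python: a trailing window longer than the axis covers exactly the same elements as a window clamped to the axis length, so a wrapper could shrink the window before calling a routine that rejects oversized windows. The helper names clamp_rolling_args and trailing_indices are hypothetical, and the constraint encoded here (min_count <= window <= axis length) is an assumption about what bottleneck's moving-window functions accept, not something confirmed in this thread.

```python
def clamp_rolling_args(window, min_periods, axis_len):
    """Hypothetical helper: shrink the window to fit the axis.

    Returns (window, min_periods), or None when no position can ever
    collect `min_periods` values (the whole result would be NaN).
    """
    if min_periods is None:
        min_periods = window
    if min_periods > axis_len:
        return None
    return min(window, axis_len), min_periods


def trailing_indices(i, window):
    """Indices covered by a trailing window that ends at position i."""
    return list(range(max(0, i - window + 1), i + 1))


n = 3
clamped = clamp_rolling_args(window=6, min_periods=1, axis_len=n)
# The clamped window sees the same elements as the original at every position.
same_elements = all(
    trailing_indices(i, 6) == trailing_indices(i, clamped[0]) for i in range(n)
)
```

If that assumption about the accepted arguments holds, clamping would not change any output value; it would only avoid the oversized-window error.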
This is normally the case, but with Thanks for the pointer on #4977! |
I encountered the same problem with bottleneck.move_rank():

from bottleneck import move_rank

vol_list = [3, 5, 10, 20, 30, 60, 90]
for i in vol_list:
    # Only apply windows that fit within the available rows
    if len(last_df) >= i:
        last_df['vol_big_than_today_count_' + str(i)] = move_rank(last_df['vol'], i)
|
What happened:
When bottleneck is installed in an environment, it seems to ignore the min_periods kwarg on ds.rolling(...).

What you expected to happen:
When using ds.rolling(..., min_periods=1), it should be able to handle an array of length 1. Without bottleneck installed, it returns the original value of a length 1 array. With bottleneck installed, the error is:

Minimal Complete Verifiable Example:
With bottleneck installed to environment:

Without bottleneck installed to environment:

Anything else we need to know?:
In an applied case, this came up while working on .groupby('time.dayofyear').map(_rolling), where we map a rolling mean function over a defined N days with min_periods=1. Some climatological days (like leap years) will not have the N day requirement, so the min_periods catch handles that, but with bottleneck installed it breaks due to the above issue.

Environment:
Output of xr.show_versions()
INSTALLED VERSIONS
------------------
commit: None
python: 3.8.6 | packaged by conda-forge | (default, Jan 25 2021, 23:22:12) [Clang 11.0.1 ]
python-bits: 64
OS: Darwin
OS-release: 19.6.0
machine: x86_64
processor: i386
byteorder: little
LC_ALL: en_US.UTF-8
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8
libhdf5: 1.10.6
libnetcdf: 4.7.4

xarray: 0.16.2
pandas: 1.2.1
numpy: 1.19.5
scipy: 1.6.0
netCDF4: 1.5.5.1
pydap: None
h5netcdf: 0.8.1
h5py: 3.1.0
Nio: None
zarr: 2.6.1
cftime: 1.3.1
nc_time_axis: 1.2.0
PseudoNetCDF: None
rasterio: 1.1.8
cfgrib: None
iris: None
bottleneck: 1.3.2
dask: 2021.01.1
distributed: 2021.01.1
matplotlib: 3.3.3
cartopy: 0.18.0
seaborn: 0.11.1
numbagg: None
pint: 0.16.1
setuptools: 49.6.0.post20210108
pip: 21.0
conda: None
pytest: 6.2.2
IPython: 7.18.1
sphinx: 3.4.3