Downsampling with resample generates longer time array than expected #3773

Open
@ivanhigueram

MCVE Code Sample

import numpy as np
import pandas as pd
import xarray as xr
from functools import reduce

# Create time dimension array for all climatological winters
# (1 December to 1 March, sampled every 12 hours)
indexes = [pd.date_range(f'{year}-12-01', f'{year+1}-03-01', freq='12H')
           for year in range(1980, 2020)]
index_union = reduce(pd.Index.union, indexes)

# Create Dataset with a single variable along the time dimension
ds = xr.Dataset({'var': ('time', np.arange(len(index_union)))},
                coords={'time': index_union})

Problem Description

We can check which months appear in the time dimension of ds:

>>> pd.DatetimeIndex(ds.time.values).month.unique()
Int64Index([12, 1, 2, 3], dtype='int64')

Now, if we downsample the Dataset to a one-week period:

>>> pd.DatetimeIndex(ds.resample(time='1W').mean().time.values).month.unique()
Int64Index([12, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11], dtype='int64')

Instead of only the weeks that fall within the months of the original Dataset, the result also contains the months in between, filled with missing values in the var array. One workaround is to call dropna() on the result, but that is a slow approach.
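For reference, the dropna() workaround mentioned above could look roughly like the sketch below. It uses only two winters to stay small, and it spells the frequency as '12h' (the alias newer pandas versions prefer for '12H'); the names weekly and weekly_winter are made up for this example.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Two winters only, to keep the sketch small (the report uses 1980-2019).
indexes = [pd.date_range(f"{year}-12-01", f"{year + 1}-03-01", freq="12h")
           for year in (1980, 1981)]
time = indexes[0].union(indexes[1])
ds = xr.Dataset({"var": ("time", np.arange(len(time)))},
                coords={"time": time})

# resample builds one bin per week over the whole time span, so the weeks
# between March and December are present in the output but hold NaN.
weekly = ds.resample(time="1W").mean()

# Drop the weeks that received no samples.
weekly_winter = weekly.dropna(dim="time", how="all")

months_all = sorted(set(weekly.time.dt.month.values.tolist()))
months_winter = sorted(set(weekly_winter.time.dt.month.values.tolist()))
print(months_all)
print(months_winter)
```

This gives the expected months, at the cost of first materializing every empty week and then scanning for NaN.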

Expected Output

We would expect something like this:

>>> pd.DatetimeIndex(ds.resample(time='1W').mean().time.values).month.unique()
Int64Index([12, 1, 2, 3], dtype='int64')
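A possible way to get this result without creating the empty weeks in the first place is to label each sample with the end of its week and group on that label. This is only a sketch of an alternative, not something xarray's resample does: the week coordinate is a name invented here, the Monday-to-Sunday labelling is assumed to be an acceptable stand-in for the '1W' bins, and only two winters are used to keep it short.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Two winters only, to keep the sketch small (the report uses 1980-2019).
indexes = [pd.date_range(f"{year}-12-01", f"{year + 1}-03-01", freq="12h")
           for year in (1980, 1981)]
time = indexes[0].union(indexes[1])
ds = xr.Dataset({"var": ("time", np.arange(len(time)))},
                coords={"time": time})

# Label each sample with the Sunday that closes its Monday-to-Sunday week.
# "week" is a hypothetical coordinate name chosen for this example.
week_end = time.to_period("W").end_time.normalize()
ds = ds.assign_coords(week=("time", week_end))

# groupby only creates groups for labels that actually occur, so no
# all-NaN weeks appear between March and December.
weekly = ds.groupby("week").mean()

months = sorted(set(weekly.week.dt.month.values.tolist()))
print(months)
```

Note that the result is indexed by week rather than time, so downstream code would need to account for the renamed dimension.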

Output of xr.show_versions()

INSTALLED VERSIONS
------------------
commit: None
python: 3.7.6 (default, Jan 8 2020, 19:59:22) [GCC 7.3.0]
python-bits: 64
OS: Linux
OS-release: 3.10.0-957.12.2.el7.x86_64
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: en_US.UTF-8
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8
libhdf5: 1.10.5
libnetcdf: 4.7.3

xarray: 0.14.1
pandas: 0.25.3
numpy: 1.17.5
scipy: 1.4.1
netCDF4: 1.5.3
pydap: None
h5netcdf: None
h5py: None
Nio: None
zarr: None
cftime: 1.0.4.2
nc_time_axis: None
PseudoNetCDF: None
rasterio: None
cfgrib: 0.9.7.7
iris: 2.3.0
bottleneck: None
dask: 2.9.2
distributed: 2.9.3
matplotlib: 3.1.1
cartopy: 0.17.0
seaborn: 0.9.0
numbagg: None
setuptools: 44.0.0.post20200106
pip: 19.3.1
conda: None
pytest: None
IPython: 7.11.1
sphinx: None
