New Resample-Syntax leading to cancellation of dimensions #2356

Closed
@rpnaut

Example

Starting with the dataset located here: https://swiftbrowser.dkrz.de/public/dkrz_c0725fe8741c474b97f291aac57f268f/GregorMoeller/,
I want to calculate monthly sums of precipitation for each gridpoint in the daily data:

In [39]: data = array.open_dataset("eObs_gridded_0.22deg_rot_v14.0.TOT_PREC.1950-2016.nc_CutParamTimeUnitCor_FinalEvalGrid")
In [40]: data
Out[13]: 
<xarray.Dataset>
Dimensions:       (rlat: 136, rlon: 144, time: 153)
Coordinates:
  * rlon          (rlon) float32 -22.6 -22.38 -22.16 -21.94 -21.72 -21.5 ...
  * rlat          (rlat) float32 -12.54 -12.32 -12.1 -11.88 -11.66 -11.44 ...
  * time          (time) datetime64[ns] 2006-05-01T12:00:00 ...
Data variables:
    rotated_pole  int32 ...
    TOT_PREC      (time, rlat, rlon) float32 ...
Attributes:
    CDI:                       Climate Data Interface version 1.8.0 (http://m...
    Conventions:               CF-1.6
    history:                   Thu Jun 14 12:34:59 2018: cdo -O -s -P 4 remap...
    CDO:                       Climate Data Operators version 1.8.0 (http://m...
    cdo_openmp_thread_number:  4

In [41]: datamonth = data["TOT_PREC"].resample(time="M").sum()
In [42]: datamonth
Out[42]: 
<xarray.DataArray 'TOT_PREC' (time: 5)>
array([ 551833.25   ,  465640.09375,  328445.90625,  836892.1875 ,  503601.5    ], dtype=float32)
Coordinates:
  time     (time) datetime64[ns] 2006-05-31 2006-06-30 2006-07-31 ...

Problem description

The problem is that the dimensions 'rlon' and 'rlat' and the corresponding coordinates have not survived the resample process. Only the time is present in the result.

Expected Output

I expect the spatial dimensions to still be present in the output of the monthly sums. Surprisingly, this is the case when using the old syntax:

In [41]: datamonth = data["TOT_PREC"].resample(dim="time",freq="M",how="sum")
/usr/bin/ipython3:1: FutureWarning: 
.resample() has been modified to defer calculations. Instead of passing 'dim' and how="sum", instead consider using .resample(time="M").sum('time') 
  #!/usr/bin/env python3

In [42]: datamonth
Out[42]: 
<xarray.DataArray 'TOT_PREC' (time: 5, rlat: 136, rlon: 144)>
array([[[  0.      ,   0.      , ...,   0.      ,   0.      ],
        [  0.      ,   0.      , ...,   0.      ,   0.      ],
        ..., 
        [  0.      ,   0.      , ...,  44.900028,  41.400024],
        [  0.      ,   0.      , ...,  49.10001 ,  46.5     ]]], dtype=float32)
Coordinates:
  * time     (time) datetime64[ns] 2006-05-31 2006-06-30 2006-07-31 ...
  * rlon     (rlon) float32 -22.6 -22.38 -22.16 -21.94 -21.72 -21.5 -21.28 ...
  * rlat     (rlat) float32 -12.54 -12.32 -12.1 -11.88 -11.66 -11.44 -11.22 ...

What is wrong here?
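
The FutureWarning above suggests `.resample(time="M").sum('time')`, whereas the new-syntax call in this report used `.sum()` with no dimension. A minimal sketch of the difference, assuming that the missing dimension argument is what collapsed `rlat` and `rlon` in the xarray version used here (not confirmed in this report). The data is a synthetic stand-in for the E-OBS file, and `"MS"` (month start) is used in place of `"M"` only because it is accepted across pandas versions:

```python
import numpy as np
import pandas as pd
import xarray as xr

# Synthetic stand-in for the dataset: 153 daily steps on a tiny rotated grid.
# Shapes and values are made up for illustration.
time = pd.date_range("2006-05-01 12:00", periods=153, freq="D")
tot_prec = xr.DataArray(
    np.ones((time.size, 4, 5), dtype="float32"),
    coords={"time": time, "rlat": np.arange(4.0), "rlon": np.arange(5.0)},
    dims=("time", "rlat", "rlon"),
    name="TOT_PREC",
)

# Name the dimension to reduce over explicitly, as the FutureWarning suggests,
# so that only "time" is summed within each monthly bin.
datamonth = tot_prec.resample(time="MS").sum(dim="time")

print(datamonth.dims)
print(dict(datamonth.sizes))
```

With the reduction dimension given explicitly, the spatial dimensions are expected to remain in the result alongside the five monthly time steps.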

I would also like to ask why the new syntax does not take use cases with highly complex scripting into account. I do not like hardcoding a dimension name in my programs, i.e. time=${freq} instead of dim=${dim}; freq=${freq}.
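
One possible way to avoid a hardcoded dimension name with the new syntax is ordinary Python keyword unpacking, since the dimension is passed as a keyword argument. A minimal sketch, where `resample_sum` is a hypothetical helper (not part of xarray) and the data is synthetic:

```python
import numpy as np
import pandas as pd
import xarray as xr


def resample_sum(da, dim, freq):
    """Hypothetical helper: sum `da` over bins of `freq` along `dim`."""
    # {dim: freq} is unpacked into a keyword argument, so this is equivalent
    # to writing da.resample(time=freq) when dim == "time".
    return da.resample(**{dim: freq}).sum(dim=dim)


# Synthetic daily data covering January and February 2006 (59 days).
time = pd.date_range("2006-01-01", periods=59, freq="D")
da = xr.DataArray(
    np.ones((time.size, 3), dtype="float32"),
    coords={"time": time, "rlat": np.arange(3.0)},
    dims=("time", "rlat"),
)

# Dimension name and frequency both come from variables, as in a script.
dim_name, freq = "time", "MS"
monthly = resample_sum(da, dim_name, freq)
print(dict(monthly.sizes))
```

Here `"MS"` (month start) is again used instead of `"M"` only for compatibility across pandas versions.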
