Closed
Description
Currently, when xarray.coding.times.encode_cf_datetime() is called, it always casts the input to a NumPy array. This is not what I would expect when the input is a dask array. Could we make this operation lazy when the input is a dask array?
Lines 352 to 354 in 01462d6
In [46]: import numpy as np
In [47]: import xarray as xr
In [48]: import pandas as pd
In [49]: times = pd.date_range("2000-01-01", "2001-01-01", periods=11)
In [50]: time_bounds = np.vstack((times[:-1], times[1:])).T
In [51]: arr = xr.DataArray(time_bounds).chunk()
In [52]: arr
Out[52]:
<xarray.DataArray (dim_0: 10, dim_1: 2)>
dask.array<xarray-<this-array>, shape=(10, 2), dtype=datetime64[ns], chunksize=(10, 2), chunktype=numpy.ndarray>
Dimensions without coordinates: dim_0, dim_1
In [53]: xr.coding.times.encode_cf_datetime(arr)
Out[53]:
(array([[     0,  52704],
        [ 52704, 105408],
        [105408, 158112],
        [158112, 210816],
        [210816, 263520],
        [263520, 316224],
        [316224, 368928],
        [368928, 421632],
        [421632, 474336],
        [474336, 527040]]),
 'minutes since 2000-01-01 00:00:00',
 'proleptic_gregorian')
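For illustration, here is a rough sketch of what a lazy version could look like. It is not xarray's implementation: lazy_encode_minutes and _encode_block are hypothetical names, and the sketch assumes the reference date and units are fixed up front, since choosing them from the data would require the values in memory. It also returns floats rather than the integers shown above.

```python
import dask.array as da
import numpy as np


def _encode_block(block, reference, unit):
    """Hypothetical helper: encode one in-memory block of datetime64 values."""
    # Offset from the reference date, expressed as a float count of `unit`.
    return (block - reference) / np.timedelta64(1, unit)


def lazy_encode_minutes(times, reference="2000-01-01"):
    """Hypothetical sketch: encode datetimes as minutes since `reference`.

    Dask arrays stay lazy; anything else is encoded eagerly with NumPy.
    """
    ref = np.datetime64(reference, "ns")
    if isinstance(times, da.Array):
        # map_blocks defers the work until .compute() is called.
        return times.map_blocks(
            _encode_block, reference=ref, unit="m", dtype=np.float64
        )
    return _encode_block(np.asarray(times), ref, "m")


# Four daily timestamps, split into two dask chunks.
times = np.arange("2000-01-01", "2000-01-05", dtype="datetime64[D]").astype(
    "datetime64[ns]"
)
lazy = lazy_encode_minutes(da.from_array(times, chunks=2))
```

With this, lazy is still a dask array, and the encoded values only materialize on lazy.compute().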
Cc @jhamman