Description
The discussion of the use of the indexes property in #5102 got me thinking about this StackOverflow answer. For a while I have thought that my answer there isn't very satisfying, not only because it relies on this somewhat obscure indexes property, but also because it only works on dimension coordinates -- i.e. something that would be backed by an index.
Describe the solution you'd like
It would be better if we could do this conversion with astype, e.g. da.astype("datetime64[ns]"). This would allow conversion to datetime64 values for all cftime.datetime DataArrays -- dask-backed or NumPy-backed, 1D or ND -- through a fairly standard and well-known method. To my surprise, while you do not get the nice calendar-switching warning that CFTimeIndex.to_datetimeindex provides, this actually already kind of seems to work (?!):
In [1]: import xarray as xr
In [2]: times = xr.cftime_range("2000", periods=6, calendar="noleap")
In [3]: da = xr.DataArray(times.values.reshape((2, 3)), dims=["a", "b"])
In [4]: da.astype("datetime64[ns]")
Out[4]:
<xarray.DataArray (a: 2, b: 3)>
array([['2000-01-01T00:00:00.000000000', '2000-01-02T00:00:00.000000000',
        '2000-01-03T00:00:00.000000000'],
       ['2000-01-04T00:00:00.000000000', '2000-01-05T00:00:00.000000000',
        '2000-01-06T00:00:00.000000000']], dtype='datetime64[ns]')
Dimensions without coordinates: a, b
NumPy obviously does not officially support this -- nor would I expect it to -- so I would be wary of simply documenting this behavior as is. Would it be reasonable for us to modify xarray.core.duck_array_ops.astype to explicitly implement this conversion ourselves for cftime.datetime arrays? This way we could ensure this was always supported, and we could include appropriate errors for out-of-bounds times (the NumPy method currently overflows in that case) and warnings for switching from non-standard calendars.
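
To make the proposal a bit more concrete, here is a rough, standard-library-only sketch of the checks I have in mind. It is not xarray code: the names to_epoch_ns, cftime_to_epoch_ns, and fake_cftime are hypothetical, and fake_cftime is just a stand-in that exposes the same date fields and calendar attribute as a cftime.datetime object. I am also assuming that "standard", "gregorian", and "proleptic_gregorian" are the calendars that should not trigger a warning.

```python
import warnings
from datetime import datetime
from types import SimpleNamespace

# datetime64[ns] stores int64 nanoseconds since 1970-01-01; the minimum
# int64 value is reserved for NaT, so it is excluded from the valid range.
NS_MIN = -(2**63) + 1
NS_MAX = 2**63 - 1
EPOCH = datetime(1970, 1, 1)

# Assumption: calendars treated as safe to convert without a warning.
STANDARD_CALENDARS = {"standard", "gregorian", "proleptic_gregorian"}


def to_epoch_ns(t):
    """Convert one cftime-like object to integer nanoseconds since the epoch.

    The date fields are reinterpreted as-is in the proleptic Gregorian
    calendar; out-of-bounds values raise instead of silently overflowing.
    """
    try:
        same_fields = datetime(
            t.year, t.month, t.day, t.hour, t.minute, t.second, t.microsecond
        )
    except ValueError as err:
        # e.g. February 30 in a 360_day calendar, or a year outside 1..9999
        raise ValueError(f"{t!r} has no proleptic Gregorian equivalent") from err
    delta = same_fields - EPOCH
    ns = (delta.days * 86_400 + delta.seconds) * 10**9 + delta.microseconds * 1_000
    if not NS_MIN <= ns <= NS_MAX:
        raise OverflowError(
            f"{same_fields.isoformat()} is outside the datetime64[ns] range"
        )
    return ns


def cftime_to_epoch_ns(times):
    """Convert a flat sequence of cftime-like objects, warning once if any
    of them use a non-standard calendar."""
    nonstandard = {t.calendar for t in times} - STANDARD_CALENDARS
    if nonstandard:
        warnings.warn(
            f"Converting from non-standard calendar(s) {sorted(nonstandard)}; "
            "time differences may not be preserved.",
            RuntimeWarning,
            stacklevel=2,
        )
    return [to_epoch_ns(t) for t in times]


def fake_cftime(year, month, day, calendar="noleap"):
    """Hypothetical stand-in with the attributes read from cftime.datetime."""
    return SimpleNamespace(
        year=year, month=month, day=day, hour=0, minute=0, second=0,
        microsecond=0, calendar=calendar,
    )
```

In an actual implementation the integers would presumably be placed in an int64 array, viewed as datetime64[ns], and reshaped to the original shape so that ND and dask-backed inputs are covered; the sketch only handles a flat sequence to keep the error and warning logic in focus.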