Description
Working with Argo data, I have difficulties decoding time-related variables:
More specifically, a date variable may contain FillValue entries, which are set to NaN when the netCDF file is opened. Decoding then raises an error.
Sure, I can open the netCDF file with the decode_times=False option, but the issue is not whether the data can be decoded at all; it seems to me to be about how FillValue should be handled in a time axis.
I understand that in most gridded datasets the time axis/dimension/coordinate is complete and does not contain missing values, which may explain why nobody has reported this before.
Here is a simple way to reproduce the error:
import numpy as np
import xarray as xr

attrs = {'units': 'days since 1950-01-01 00:00:00 UTC'} # Classic Argo data Julian Day units
# OK !
jd = [24658.46875, 24658.46366898, 24658.47256944] # Sample of Julian date from Argo data
ds = xr.Dataset({'time': ('time', jd, attrs)})
print(xr.decode_cf(ds))
<xarray.Dataset>
Dimensions:  (time: 3)
Coordinates:
  * time     (time) datetime64[ns] 2017-07-06T11:15:00 ...
Data variables:
    *empty*
But then:
# Not OK with a NaN
jd = [24658.46875, 24658.46366898, 24658.47256944, np.nan] # Another sample of Julian date from Argo data
ds = xr.Dataset({'time': ('time', jd, attrs)})
print(xr.decode_cf(ds))
ValueError: unable to decode time units 'days since 1950-01-01 00:00:00 UTC' with the default calendar. Try opening your dataset with decode_times=False. Full traceback:
Traceback (most recent call last):
File "/Users/gmaze/anaconda/envs/obidam/lib/python2.7/site-packages/xarray/conventions.py", line 389, in __init__
result = decode_cf_datetime(example_value, units, calendar)
File "/Users/gmaze/anaconda/envs/obidam/lib/python2.7/site-packages/xarray/conventions.py", line 157, in decode_cf_datetime
dates = _decode_datetime_with_netcdf4(flat_num_dates, units, calendar)
File "/Users/gmaze/anaconda/envs/obidam/lib/python2.7/site-packages/xarray/conventions.py", line 99, in _decode_datetime_with_netcdf4
dates = np.asarray(nc4.num2date(num_dates, units, calendar))
File "netCDF4/_netCDF4.pyx", line 5244, in netCDF4._netCDF4.num2date (netCDF4/_netCDF4.c:64839)
ValueError: cannot convert float NaN to integer
I would expect the decoding to work like in the first case and to simply preserve NaNs where they are.
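To make the expected behaviour concrete, here is a minimal sketch that uses NumPy only. It is not how xarray decodes times: the helper name decode_days_since and its reference parameter are hypothetical, and it assumes the units are always "days since" a fixed reference date in the standard calendar. NaN entries are masked before the integer conversion and come back as NaT.

```python
import numpy as np


def decode_days_since(num_dates, reference="1950-01-01T00:00:00"):
    """Hypothetical helper: decode 'days since <reference>' floats, mapping NaN to NaT."""
    num_dates = np.asarray(num_dates, dtype="float64")
    missing = np.isnan(num_dates)
    # Temporarily replace NaN with 0 so the cast to integer nanoseconds cannot fail.
    filled = np.where(missing, 0.0, num_dates)
    nanoseconds = np.round(filled * 86400e9).astype("int64")
    dates = np.datetime64(reference, "ns") + nanoseconds.astype("timedelta64[ns]")
    # Put the missing values back as NaT ("not a time").
    dates[missing] = np.datetime64("NaT")
    return dates


jd = [24658.46875, 24658.46366898, 24658.47256944, np.nan]
decoded = decode_days_since(jd)
print(decoded)
```

The same masking step could be applied by hand to the raw values obtained with decode_times=False, as a stopgap until NaN is handled during decoding.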
Any ideas or suggestions?
Thanks