Description
# Imports
import numpy as np
import xarray as xr
import pandas as pd
from glob import glob
# year to process; `path` is the base data directory (defined elsewhere)
yr = 1988
# files to be concatenated
files = sorted(glob(path + str(yr) + '/V250*'))
# corrected dates
dates = pd.date_range(start=str(yr), end=str(yr+1), freq='6H', closed='left')
ds_test = xr.open_mfdataset(files[:10], combine='nested', concat_dim='time', decode_cf=False)
# correcting time
ds_test.time.values = dates[:10]
# fixing encoding
ds_test.time.attrs['units'] = "Seconds since 1970-01-01 00:00:00"
# preview of the time variable
print(ds_test.time)
> <xarray.DataArray 'time' (time: 10)>
array(['1988-01-01T00:00:00.000000000', '1988-01-01T06:00:00.000000000',
'1988-01-01T12:00:00.000000000', '1988-01-01T18:00:00.000000000',
'1988-01-02T00:00:00.000000000', '1988-01-02T06:00:00.000000000',
'1988-01-02T12:00:00.000000000', '1988-01-02T18:00:00.000000000',
'1988-01-03T00:00:00.000000000', '1988-01-03T06:00:00.000000000'],
dtype='datetime64[ns]')
Coordinates:
* time (time) datetime64[ns] 1988-01-01 ... 1988-01-03T06:00:00
Attributes:
calendar: proleptic_gregorian
standard_name: time
units: Seconds since 1970-01-01 00:00:00
ds_test.to_netcdf(path+'test.nc')
>ValueError: failed to prevent overwriting existing key units in attrs on variable 'time'.
This is probably an encoding field used by xarray to describe how a variable is serialized.
To proceed, remove this key from the variable's attributes manually.
Expected Output
Correctly encode time such that the file is saved with the time values correctly converted according to the reference units. I have the flexibility of dropping the CF conventions as long as the time values are correct, but it would also be nice to have a solution that keeps the CF conventions intact.
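To make the expected values concrete, here is a small standard-library sketch of the conversion I have in mind: 6-hourly timestamps for 1988 expressed as seconds since 1970-01-01 00:00:00. The helper names `six_hourly` and `seconds_since_epoch` are made up for illustration and are not part of xarray.

```python
from datetime import datetime, timedelta

# Reference time taken from the units string "seconds since 1970-01-01 00:00:00"
EPOCH = datetime(1970, 1, 1)


def six_hourly(year, count):
    """Return the first `count` 6-hourly timestamps of `year` (hypothetical helper)."""
    start = datetime(year, 1, 1)
    return [start + timedelta(hours=6 * i) for i in range(count)]


def seconds_since_epoch(stamps):
    """Convert naive datetimes to integer seconds elapsed since EPOCH."""
    return [int((t - EPOCH).total_seconds()) for t in stamps]


expected = seconds_since_epoch(six_hourly(1988, 10))
print(expected[:3])  # [567993600, 568015200, 568036800]
```

These are the numbers I would expect to find in the `time` variable of the written file if the encoding works as intended.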
Problem Description
I'm trying to concatenate netCDF files that declare CF conventions in their global attributes. These files have an incorrect time dimension, which I try to fix with the code above. It seems that some existing encoding is preventing the files from being written back, but when I print the encoding, it doesn't show any such clashing units. I'm not sure whether this is a bug or incorrect usage. Any help on how to encode time so that the time values are saved correctly converted according to the reference units is much appreciated.
# More diagnostics on the encoding
print(ds_test.encoding)
>{'unlimited_dims': {'time'},
'source': '/file/to/path/V250_19880101_00'}
# checking any existing time encoding
print(ds_test.time.encoding)
>{}
# another try on setting time encoding
ds_test.time.encoding['units'] = "Seconds since 1970-01-01 00:00:00"
# writing the file gives the same ValueError as above
ds_test.to_netcdf(path+'test.nc')
# ncdump output of one of the files
>netcdf V250_19880101_06 {
dimensions:
lon = 720 ;
lat = 361 ;
lev = 1 ;
time = UNLIMITED ; // (1 currently)
variables:
float lon(lon) ;
lon:long_name = "longitude" ;
lon:units = "degrees_east" ;
lon:standard_name = "longitude" ;
lon:axis = "X" ;
float lat(lat) ;
lat:long_name = "latitude" ;
lat:units = "degrees_north" ;
lat:standard_name = "latitude" ;
lat:axis = "Y" ;
float lev(lev) ;
lev:long_name = "hybrid level at layer midpoints" ;
lev:units = "level" ;
lev:standard_name = "hybrid_sigma_pressure" ;
lev:positive = "down" ;
lev:formula = "hyam hybm (mlev=hyam+hybm*aps)" ;
lev:formula_terms = "ap: hyam b: hybm ps: aps" ;
float time(time) ;
time:units = "hours since 1988-01-01 06:00:00" ;
time:calendar = "proleptic_gregorian" ;
time:standard_name = "time" ;
float V(time, lev, lat, lon) ;
V:long_name = "unknown (please add with NCO)" ;
V:units = "unknown (please add with NCO)" ;
V:_FillValue = -999.99f ;
// global attributes:
:Conventions = "CF" ;
:constants_file_name = "P19880101_06" ;
:institution = "IACETH" ;
:lonmin = -180.f ;
:lonmax = 179.5f ;
:latmin = -90.f ;
:latmax = 90.f ;
:levmin = 250.f ;
:levmax = 250.f ;
:history = "Fri Sep 6 15:59:17 2019: ncatted -a units,time,o,c,hours since 1988-01-01 06:00:00 -a standard_name,time,o,c,time V250_19880101_06" ;
:NCO = "4.7.2" ;
data:
time = 6 ;
}
Output of xr.show_versions()
xarray: 0.13.0
pandas: 0.25.3
numpy: 1.18.1
scipy: 1.3.2
netCDF4: 1.4.2
pydap: None
h5netcdf: None
h5py: None
Nio: None
zarr: None
cftime: 1.0.4.2
nc_time_axis: None
PseudoNetCDF: None
rasterio: None
cfgrib: None
iris: None
bottleneck: None
dask: 2.9.2
distributed: 2.9.3
matplotlib: 3.1.0
cartopy: 0.17.0
seaborn: 0.9.0
numbagg: None
setuptools: 44.0.0.post20200106
pip: 19.3.1
conda: None
pytest: None
IPython: 7.11.1
sphinx: None