[Bug]: Getting data CF-compliant for regridding (with xesmf) #718

Open
@pochedls

What happened?

When regridding some ocean data, many (but not all) datasets raise:

ValueError: dataset must include lon/lat or be CF-compliant

I'm having trouble getting the datasets to be CF-compliant on xcdat 0.6.1 / 0.7.1 / 0.7.3. I think xcdat used to handle ocean grids better (though I'm not sure when this might have changed).

What did you expect to happen? Are there any possible answers you came across?

Ideally xcdat could figure out the input grid automatically. If not, this issue is still useful for sorting out what we need to specify in the metadata to make the dataset CF-compliant for xesmf.
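
For reference, here is a minimal sketch of the kind of metadata check that "CF-compliant" usually implies for longitude and latitude. It assumes the rule from the CF conventions (a variable is identified by its `standard_name` or by one of the accepted `units` strings) and is not the actual logic inside cf_xarray or xesmf. The function name `find_cf_lon_lat` and the two attribute tables are hypothetical, made up for illustration.

```python
# Simplified sketch of CF-style longitude/latitude identification.
# Assumption: follows the CF conventions' rule (standard_name or units),
# NOT the exact implementation used by cf_xarray or xesmf.

# Unit strings the CF conventions accept for each coordinate type.
LAT_UNITS = {"degrees_north", "degree_north", "degree_N",
             "degrees_N", "degreeN", "degreesN"}
LON_UNITS = {"degrees_east", "degree_east", "degree_E",
             "degrees_E", "degreeE", "degreesE"}


def find_cf_lon_lat(var_attrs):
    """Return (lon_names, lat_names) as sorted lists.

    var_attrs maps each variable name to a dict of its attributes.
    """
    lon, lat = [], []
    for name, attrs in var_attrs.items():
        std = attrs.get("standard_name")
        units = attrs.get("units")
        if std == "longitude" or units in LON_UNITS:
            lon.append(name)
        elif std == "latitude" or units in LAT_UNITS:
            lat.append(name)
    return sorted(lon), sorted(lat)


# Hypothetical attribute tables (variable names and attrs are illustrative).
missing = {"nlat": {}, "nlon": {}, "fgco2": {"units": "kg m-2 s-1"}}
tagged = {
    "lat": {"standard_name": "latitude", "units": "degrees_north"},
    "lon": {"standard_name": "longitude", "units": "degrees_east"},
    "fgco2": {"units": "kg m-2 s-1"},
}

print(find_cf_lon_lat(missing))  # ([], [])
print(find_cf_lon_lat(tagged))   # (['lon'], ['lat'])
```

With xarray, the attribute table could be built as `{name: dict(var.attrs) for name, var in ds.variables.items()}`. If either list comes back empty, that may indicate that `standard_name` and `units` need to be set on the longitude/latitude variables before regridding, though other causes are possible.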

Minimal Complete Verifiable Example (MCVE)

# imports
import xcdat as xc
import numpy as np

# target grid
nlat = xc.create_axis('lat', np.arange(-88.5, 90, 2.5))
nlon = xc.create_axis('lon', np.arange(1.25, 360, 2.5))
ngrid = xc.create_grid(x=nlon, y=nlat)

# open / regrid
p = '/p/css03/esgf_publish/CMIP6/ScenarioMIP/NCAR/CESM2-WACCM/ssp585/r1i1p1f1/Omon/fgco2/gn/v20200702/'
ds = xc.open_mfdataset(p)
ds = ds.regridder.horizontal('fgco2', ngrid, tool='xesmf', method='conservative_normed', periodic=True)

Relevant log output

---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
File ~/bin/anaconda3/envs/xcdat/lib/python3.12/site-packages/xesmf/frontend.py:75, in _get_lon_lat(ds)
     74 try:
---> 75     lon = ds.cf['longitude']
     76     lat = ds.cf['latitude']

File ~/bin/anaconda3/envs/xcdat/lib/python3.12/site-packages/cf_xarray/accessor.py:2343, in CFDatasetAccessor.__getitem__(self, key)
   2312 """
   2313 Index into a Dataset making use of CF attributes.
   2314
   (...)
   2341 Add additional keys by specifying "custom criteria". See :ref:`custom_criteria` for more.
   2342 """
-> 2343 return _getitem(self, key)

File ~/bin/anaconda3/envs/xcdat/lib/python3.12/site-packages/cf_xarray/accessor.py:936, in _getitem(accessor, key, skip)
    935 except KeyError:
--> 936     raise KeyError(
    937         f"{kind}.cf does not understand the key {k!r}. "
    938         f"Use 'repr({kind}.cf)' (or '{kind}.cf' in a Jupyter environment) to see a list of key names that can be interpreted."
    939     ) from None

KeyError: "Dataset.cf does not understand the key 'longitude'. Use 'repr(Dataset.cf)' (or 'Dataset.cf' in a Jupyter environment) to see a list of key names that can be interpreted."

During handling of the above exception, another exception occurred:

ValueError                                Traceback (most recent call last)
Cell In[1], line 13
     11 p = '/p/css03/esgf_publish/CMIP6/ScenarioMIP/NCAR/CESM2-WACCM/ssp585/r1i1p1f1/Omon/fgco2/gn/v20200702/'
     12 ds = xc.open_mfdataset(p)
---> 13 ds = ds.regridder.horizontal('fgco2', ngrid, tool='xesmf', method='conservative_normed', periodic=True)

File ~/bin/anaconda3/envs/xcdat/lib/python3.12/site-packages/xcdat/regridder/accessor.py:205, in RegridderAccessor.horizontal(self, data_var, output_grid, tool, **options)
    203 input_grid = _get_input_grid(self._ds, data_var, ["X", "Y"])
    204 regridder = regrid_tool(input_grid, output_grid, **options)
--> 205 output_ds = regridder.horizontal(data_var, self._ds)
    207 return output_ds

File ~/bin/anaconda3/envs/xcdat/lib/python3.12/site-packages/xcdat/regridder/xesmf.py:158, in XESMFRegridder.horizontal(self, data_var, ds)
    153 if input_da is None:
    154     raise KeyError(
    155         f"The data variable '{data_var}' does not exist in the dataset."
    156     )
--> 158 regridder = xe.Regridder(
    159     self._input_grid,
    160     self._output_grid,
    161     method=self._method,
    162     **self._extra_options,
    163 )
    165 output_da = regridder(input_da, keep_attrs=True)
    167 output_ds = xr.Dataset({data_var: output_da}, attrs=ds.attrs)

File ~/bin/anaconda3/envs/xcdat/lib/python3.12/site-packages/xesmf/frontend.py:919, in Regridder.__init__(self, ds_in, ds_out, method, locstream_in, locstream_out, periodic, parallel, **kwargs)
    917     grid_in, shape_in, input_dims = ds_to_ESMFlocstream(ds_in)
    918 else:
--> 919     grid_in, shape_in, input_dims = ds_to_ESMFgrid(
    920         ds_in, need_bounds=need_bounds, periodic=periodic
    921     )
    922 if locstream_out:
    923     grid_out, shape_out, output_dims = ds_to_ESMFlocstream(ds_out)

File ~/bin/anaconda3/envs/xcdat/lib/python3.12/site-packages/xesmf/frontend.py:145, in ds_to_ESMFgrid(ds, need_bounds, periodic, append)
    115 """
    116 Convert xarray DataSet or dictionary to ESMF.Grid object.
    117
   (...)
    141
    142 """
    143 # use np.asarray(dr) instead of dr.values, so it also works for dictionary
--> 145 lon, lat = _get_lon_lat(ds)
    146 if hasattr(lon, 'dims'):
    147     if lon.ndim == 1:

File ~/bin/anaconda3/envs/xcdat/lib/python3.12/site-packages/xesmf/frontend.py:80, in _get_lon_lat(ds)
     76     lat = ds.cf['latitude']
     77 except (KeyError, AttributeError, ValueError):
     78     # KeyError if cfxr doesn't detect the coords
     79     # AttributeError if ds is a dict
---> 80     raise ValueError('dataset must include lon/lat or be CF-compliant')
     82 return lon, lat

ValueError: dataset must include lon/lat or be CF-compliant

Anything else we need to know?

Note the same code works with p = '/p/css03/esgf_publish/CMIP6/ScenarioMIP/BCC/BCC-CSM2-MR/ssp585/r1i1p1f1/Omon/fgco2/gn/v20190319/' (though I now think this may be because the grid is rectilinear).

Environment

xcdat 0.7.3
xesmf 0.8.8
