Add support for netcdf4 enum #8144

Closed
@bzah

Description

Is your feature request related to a problem?

When a netCDF file contains netCDF4 enums, xarray ignores the underlying enum type.
The association between the values of the variable and their actual meaning is then lost.

MRE:

import netCDF4 as nc
import xarray as xr

# -- Create dataset with an enum using the netcdf4 lib
ds = nc.Dataset("mre.nc", "w", format="NETCDF4")   
cloud_type_enum = ds.createEnumType(int,"cloud_type",{"clear":0, "cloudy":1})
print(ds.enumtypes)
# {'cloud_type': <class 'netCDF4._netCDF4.EnumType'>: name = 'cloud_type', numpy dtype = int64, fields/values ={'clear': 0, 'cloudy': 1}} 
ds.createVariable("cloud", cloud_type_enum)
ds["cloud"][0] = 1
ds.close()

# -- Open dataset with xarray
xr_ds = xr.open_dataset("./mre.nc")
print(xr_ds.cloud)
# <xarray.DataArray 'cloud' ()> \n [1 values with dtype=int64]   
# --> We get no metadata about the cloud_type enum that we created above 
xr_ds.to_netcdf("mre_xr.nc")

# -- Open xarray outputted dataset with netCDF4 lib
print(nc.Dataset("mre_xr.nc", "r", format="NETCDF4").enumtypes)
# {}
# --> Empty dictionary: the enum we created is lost

If you know CF, enums could replace flag_meanings and flag_values (see CF).
Enums are not yet part of CF, though.

Describe the solution you'd like

As far as I understand, describing the enum only requires a dictionary that maps numbers (enum key) to strings (enum value), plus a way to reference this dictionary from the variables that are "typed" to this enum.
Bear in mind that the dtype of the variable would still be numeric; the enum type would be secondary metadata.
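
To make the idea concrete, here is a minimal sketch of how such a dictionary could be used to decode the raw integers. It only uses the standard library; the `CloudType` name and the `raw` values are made up for illustration, and none of this is an existing xarray API. The dictionary is written name -> value, the same direction as in the MRE above.

```python
from enum import IntEnum

# The enum definition, as passed to createEnumType in the MRE (name -> value).
cloud_type = {"clear": 0, "cloudy": 1}

# Hypothetical Python-side representation of the enum type.
CloudType = IntEnum("CloudType", cloud_type)

# The variable itself still holds plain integers ...
raw = [1, 0, 1]

# ... and the enum is only consulted to recover the meaning of each value.
decoded = [CloudType(v).name for v in raw]
print(decoded)
```

The point is that the data stays numeric and the enum travels alongside it as metadata, which is what gets dropped today.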

Describe alternatives you've considered

Most people who produce data could get away with using flag_meanings and flag_values to describe their data in a way that is both CF-compliant and properly handled by xarray.
For me, the only workaround at the moment is to use the netCDF4 library directly.
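
For reference, the flag-based alternative could be sketched roughly as below. `enum_to_cf_flags` is a hypothetical helper, not part of xarray or netCDF4; it assumes enum names contain no whitespace, since CF stores flag_meanings as a single blank-separated string.

```python
def enum_to_cf_flags(enum_dict):
    """Convert a name -> value enum dict into CF-style flag attributes."""
    # Sort by value so that flag_values and flag_meanings stay aligned.
    items = sorted(enum_dict.items(), key=lambda kv: kv[1])
    return {
        "flag_values": [value for _, value in items],
        "flag_meanings": " ".join(name for name, _ in items),
    }

attrs = enum_to_cf_flags({"clear": 0, "cloudy": 1})
print(attrs)
```

The resulting dictionary could then be merged into the variable's attributes before writing, but the file would still carry no real netCDF4 enum type.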

Additional context

nc.__version__
# 1.6.2

xr.__version__
# 2023.2.0
