open_mfdataset fails on variable attributes with 'list' type #3034

Closed
@jfpiolle

Using open_mfdataset on a series of netCDF files whose variable attributes are of list type fails with the following exception when those attributes have different values from one file to another:

ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()
    ncf = xarray.open_mfdataset(files)
  File "/home/ananda/jfpiolle/miniconda2/envs/cerbere/lib/python2.7/site-packages/xarray/backends/api.py", line 658, in open_mfdataset
    ids=ids)
  File "/home/ananda/jfpiolle/miniconda2/envs/cerbere/lib/python2.7/site-packages/xarray/core/combine.py", line 553, in _auto_combine
    data_vars=data_vars, coords=coords)
  File "/home/ananda/jfpiolle/miniconda2/envs/cerbere/lib/python2.7/site-packages/xarray/core/combine.py", line 474, in _combine_nd
    compat=compat)
  File "/home/ananda/jfpiolle/miniconda2/envs/cerbere/lib/python2.7/site-packages/xarray/core/combine.py", line 492, in _auto_combine_all_along_first_dim
    data_vars, coords)
  File "/home/ananda/jfpiolle/miniconda2/envs/cerbere/lib/python2.7/site-packages/xarray/core/combine.py", line 510, in _auto_combine_1d
    for id, ds_group in grouped_by_vars]
  File "/home/ananda/jfpiolle/miniconda2/envs/cerbere/lib/python2.7/site-packages/xarray/core/combine.py", line 368, in _auto_concat
    return concat(datasets, dim=dim, data_vars=data_vars, coords=coords)
  File "/home/ananda/jfpiolle/miniconda2/envs/cerbere/lib/python2.7/site-packages/xarray/core/combine.py", line 122, in concat
    return f(objs, dim, data_vars, coords, compat, positions)
  File "/home/ananda/jfpiolle/miniconda2/envs/cerbere/lib/python2.7/site-packages/xarray/core/combine.py", line 307, in _dataset_concat
    combined = concat_vars(vars, dim, positions)
  File "/home/ananda/jfpiolle/miniconda2/envs/cerbere/lib/python2.7/site-packages/xarray/core/variable.py", line 1982, in concat
    return Variable.concat(variables, dim, positions, shortcut)
  File "/home/ananda/jfpiolle/miniconda2/envs/cerbere/lib/python2.7/site-packages/xarray/core/variable.py", line 1433, in concat
    utils.remove_incompatible_items(attrs, var.attrs)
  File "/home/ananda/jfpiolle/miniconda2/envs/cerbere/lib/python2.7/site-packages/xarray/core/utils.py", line 184, in remove_incompatible_items
    not compat(first_dict[k], second_dict[k]))):
  File "/home/ananda/jfpiolle/miniconda2/envs/cerbere/lib/python2.7/site-packages/xarray/core/utils.py", line 133, in equivalent
    (pd.isnull(first) and pd.isnull(second)))
ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()

An example of such a variable is provided below:

	double sea_ice_fraction(time) ;
		sea_ice_fraction:least_significant_digit = 2LL ;
		sea_ice_fraction:_FillValue = 1.e+20 ;
		sea_ice_fraction:long_name = "sea ice fraction" ;
		sea_ice_fraction:standard_name = "sea_ice_fraction" ;
		sea_ice_fraction:authority = "CF 1.7" ;
		sea_ice_fraction:units = "1" ;
		sea_ice_fraction:coverage_content_type = "auxiliaryInformation" ;
		sea_ice_fraction:coordinates = "time lon lat" ;
		sea_ice_fraction:source = "CCI Sea Ice" ;
		sea_ice_fraction:institution = "ESA" ;
		string sea_ice_fraction:source_files = "ice_conc_nh_ease2-250_cdr-v2p0_199912011200.nc", "ice_conc_sh_ease2-250_cdr-v2p0_199912011200.nc" ;

The exception occurs when the source_files attribute has different values across the files of the time series I am trying to concatenate. To avoid it, I had to use the preprocess argument to remove this attribute first.
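For anyone hitting the same problem, the workaround described above can be sketched roughly as follows. The function name drop_source_files and its attr_name parameter are my own placeholders, not part of xarray; the function only assumes that the object it receives exposes .variables as a mapping whose values carry a mutable .attrs dict, as xarray datasets do.

```python
def drop_source_files(ds, attr_name="source_files"):
    """Remove one per-variable attribute so it cannot clash on concatenation.

    `ds` is expected to behave like an xarray Dataset: `ds.variables` is a
    mapping of names to variables, each with a mutable `attrs` dict.
    The name `drop_source_files` is a hypothetical helper, not an xarray API.
    """
    for var in ds.variables.values():
        # pop() with a default is a no-op for variables lacking the attribute
        var.attrs.pop(attr_name, None)
    return ds


# Intended usage (not executed here; `files` is the list of netCDF paths):
#   import xarray
#   ncf = xarray.open_mfdataset(files, preprocess=drop_source_files)
```

The obvious drawback is that the source_files information is lost in the combined dataset, so this is only a stopgap until the comparison itself handles list attributes.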

This is caused by the equivalent function in xarray/core/utils.py, which does not account for attributes that are lists rather than scalars or ndarrays:

def equivalent(first, second):
    """Compare two objects for equivalence (identity or equality), using
    array_equiv if either object is an ndarray
    """
    # TODO: refactor to avoid circular import
    from . import duck_array_ops
    if isinstance(first, np.ndarray) or isinstance(second, np.ndarray):
        return duck_array_ops.array_equiv(first, second)
    else:
        return ((first is second) or
                (first == second) or
                (pd.isnull(first) and pd.isnull(second)))
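
To make the failure mode concrete, here is a small standalone sketch. It assumes the attribute values reach equivalent as plain Python lists. equivalent_original copies only the else branch quoted above, and equivalent_list_aware is a hypothetical variant of mine, not a proposed patch; in particular it substitutes np.array_equal for duck_array_ops.array_equiv to stay self-contained. With two different lists, first == second is False, so evaluation reaches pd.isnull(first), which returns a boolean array, and the `and` then raises. With equal lists the expression short-circuits, which is why the error only shows up when the values differ between files.

```python
import numpy as np
import pandas as pd


def equivalent_original(first, second):
    """Copy of the non-ndarray branch of the function quoted above."""
    return ((first is second) or
            (first == second) or
            (pd.isnull(first) and pd.isnull(second)))


def equivalent_list_aware(first, second):
    """Hypothetical variant that compares lists element by element."""
    if isinstance(first, np.ndarray) or isinstance(second, np.ndarray):
        # Simplification: the real code calls duck_array_ops.array_equiv
        return bool(np.array_equal(first, second))
    if isinstance(first, list) or isinstance(second, list):
        return (isinstance(first, list) and isinstance(second, list)
                and len(first) == len(second)
                and all(equivalent_list_aware(f, s)
                        for f, s in zip(first, second)))
    return bool((first is second) or
                (first == second) or
                (pd.isnull(first) and pd.isnull(second)))


a = ["ice_conc_nh_199912.nc", "ice_conc_sh_199912.nc"]
b = ["ice_conc_nh_200001.nc", "ice_conc_sh_200001.nc"]

try:
    equivalent_original(a, b)
except ValueError as err:
    print("original raises:", err)

print("list-aware result:", equivalent_list_aware(a, b))
```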
