merge drops attributes #3865

Closed
@johnomotani

xarray.merge() drops the attrs of the Datasets being merged. They should be kept, at least when they are compatible.

MCVE Code Sample

# Two empty Datasets carrying identical attrs
import xarray as xr

ds1 = xr.Dataset()
ds1.attrs['a'] = 42
ds2 = xr.Dataset()
ds2.attrs['a'] = 42

merged = xr.merge([ds1, ds2])

print(merged)

The result is:

<xarray.Dataset>
Dimensions:  ()
Data variables:
    *empty*

Expected Output

<xarray.Dataset>
Dimensions:  ()
Data variables:
    *empty*
Attributes:
    a:        42

Problem Description

The current behaviour means I have to check and copy attrs to the result of merge by hand, even if the attrs of the inputs were identical or not conflicting.

I'm happy to attempt a PR to fix this.
Proposal (following the pattern of the compat arguments):

  • add a combine_attrs argument to xarray.merge
  • combine_attrs = 'drop': do not copy attrs (current behaviour)
  • combine_attrs = 'identical': if the attrs of all inputs are identical (using dict_equiv), copy them to the result; otherwise raise an exception
  • combine_attrs = 'no_conflicts': merge the attrs of all inputs, as long as any key shared by more than one input has the same value in each; otherwise raise an exception [I propose this as the default behaviour]
  • combine_attrs = 'override': copy the attrs from the first input to the result

This proposal should also allow combine_by_coords, etc. to preserve attributes. These should probably also take a combine_attrs argument, which would be passed through to merge.
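
To make the four modes concrete, here is a rough sketch of the intended semantics on plain dicts. The name combine_attrs_sketch is hypothetical and is not part of xarray. It also compares values with plain `==` instead of dict_equiv, so it would not cope with array-valued attrs; treat it as an illustration of the rules, not as the implementation.

```python
from typing import Any, Dict, Iterable, List


def combine_attrs_sketch(
    attrs_list: Iterable[Dict[str, Any]], combine_attrs: str = "no_conflicts"
) -> Dict[str, Any]:
    """Hypothetical helper: combine attrs dicts following the proposed modes."""
    dicts: List[Dict[str, Any]] = [dict(a) for a in attrs_list]

    if combine_attrs == "drop" or not dicts:
        # Current behaviour: the result carries no attrs.
        return {}

    if combine_attrs == "override":
        # Take the attrs of the first input and ignore the rest.
        return dict(dicts[0])

    if combine_attrs == "identical":
        # Every input must carry exactly the same attrs.
        first = dicts[0]
        for other in dicts[1:]:
            if other != first:  # assumption: plain ==, not dict_equiv
                raise ValueError("attrs are not identical across inputs")
        return dict(first)

    if combine_attrs == "no_conflicts":
        # Union of all attrs; a key shared by several inputs must agree.
        result: Dict[str, Any] = {}
        for attrs in dicts:
            for key, value in attrs.items():
                if key in result and result[key] != value:
                    raise ValueError(f"conflicting values for attribute {key!r}")
                result[key] = value
        return result

    raise ValueError(f"unknown combine_attrs value: {combine_attrs!r}")


print(combine_attrs_sketch([{"a": 42}, {"a": 42, "b": 1}]))  # {'a': 42, 'b': 1}
```

With 'no_conflicts' as the default, the example from the top of this issue (two inputs with a = 42) would keep a = 42 in the result, while inputs that disagree on a would raise instead of silently losing the attribute.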

Versions

Current master of pydata/xarray on 17/3/2020

Output of `xr.show_versions()`

INSTALLED VERSIONS
------------------
commit: None
python: 3.6.9 (default, Nov 7 2019, 10:44:02)
[GCC 8.3.0]
python-bits: 64
OS: Linux
OS-release: 5.3.0-40-generic
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_GB.UTF-8
LOCALE: en_GB.UTF-8
libhdf5: 1.10.2
libnetcdf: 4.6.3

xarray: 0.15.0
pandas: 1.0.2
numpy: 1.18.1
scipy: 1.3.0
netCDF4: 1.5.1.2
pydap: None
h5netcdf: None
h5py: 2.9.0
Nio: None
zarr: None
cftime: 1.0.3.4
nc_time_axis: None
PseudoNetCDF: None
rasterio: None
cfgrib: None
iris: None
bottleneck: None
dask: 2.12.0
distributed: None
matplotlib: 3.1.1
cartopy: None
seaborn: None
numbagg: None
setuptools: 45.2.0
pip: 9.0.1
conda: None
pytest: 4.4.1
IPython: 7.8.0
sphinx: 1.8.3
