ds.mean() deletes non-numeric variables #2311

Closed
@lukasbrunner

Code Sample, a copy-pastable example if possible

import xarray as xr
ds = xr.Dataset(
    coords={'dim': range(5)},
    data_vars={'data': 'string'})
print(ds.mean('dim'))  # print 1

Problem description

I know one should probably not use strings as variables, but since the possibility exists I use it every now and again because I find it convenient. Recently, however, I ran into a somewhat strange case when taking the mean.

Calculating the mean over a dimension deletes all non-numeric variables, even those that do not depend on that dimension. I think variables that do not depend on the dimension over which the mean is taken should be left untouched.

# print 1
<xarray.Dataset>
Dimensions:  ()
Data variables:
    *empty*

Expected Output

I would expect the variable 'data' to still be in the dataset.
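For illustration, the expected behaviour can be approximated by reducing variables one at a time. This is only a rough sketch of a possible workaround, not how xarray implements `mean`: the helper name `mean_keep_untouched` and the extra variable `values` are hypothetical and introduced here for the example, and the sketch assumes that non-numeric variables which do depend on the dimension should still be dropped.

```python
import numpy as np
import xarray as xr


def mean_keep_untouched(ds, dim):
    """Hypothetical helper: average over `dim`, keeping unrelated variables."""
    reduced = {}
    for name, var in ds.data_vars.items():
        if dim not in var.dims:
            # The variable does not depend on `dim`, so pass it through as-is.
            reduced[name] = var
        elif np.issubdtype(var.dtype, np.number):
            # Numeric variable along `dim`: take the mean.
            reduced[name] = var.mean(dim)
        # Assumption: non-numeric variables along `dim` are dropped,
        # since a mean is not defined for them.
    return xr.Dataset(reduced, attrs=ds.attrs)


ds = xr.Dataset(
    coords={'dim': range(5)},
    data_vars={
        'data': 'string',                     # scalar string, independent of 'dim'
        'values': ('dim', np.arange(5.0)),    # hypothetical numeric variable
    })
result = mean_keep_untouched(ds, 'dim')
print(result)
```

With this sketch, 'data' stays in the result because it never depended on 'dim', while 'values' is averaged.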

Output of xr.show_versions()

INSTALLED VERSIONS
------------------

commit: None
python: 3.6.5.final.0
python-bits: 64
OS: Linux
OS-release: 4.4.138-59-default
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8

xarray: 0.10.7
pandas: 0.22.0
numpy: 1.14.3
scipy: 1.1.0
netCDF4: 1.3.1
h5netcdf: 0.5.0
h5py: 2.7.1
Nio: None
zarr: None
bottleneck: 1.2.1
cyordereddict: 1.0.0
dask: 0.17.2
distributed: 1.21.5
matplotlib: 2.2.2
cartopy: 0.16.0
seaborn: 0.8.1
setuptools: 39.0.1
pip: 9.0.3
conda: None
pytest: 3.5.0
IPython: 6.3.1
sphinx: 1.7.2
