Assigning a pd.MultiIndex to the coords behaves differently for a "Dimension coordinate" and a "Non-dimension coordinate" #5214

Closed
@weipeng1999


What happened:

When I assign an instance of pd.MultiIndex to the coords:
for a "Dimension coordinate", the multi-index is preserved, so I can use the multi-index levels directly as keyword arguments (e.g. in `sel`),
while for a "Non-dimension coordinate", the index is converted to an np.ndarray with dtype "object", which makes the level-based selection above fail.
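
To make the dtype "object" conversion concrete, here is a minimal pandas/NumPy sketch of what flattening a MultiIndex into a plain array loses. It only illustrates the symptom; it is not a claim about how xarray performs the conversion internally, and the variable names `flat` and `rebuilt` are purely illustrative.

```python
import pandas as pd

# The same MultiIndex that is used in the example below.
mi = pd.MultiIndex.from_product(([1, 2, 3], ["a", "b"]), names=["i", "n"])

# As a plain NumPy array, the index becomes a 1-D object array of tuples.
# The level names 'i' and 'n' are not part of that array.
flat = mi.values
print(flat.dtype, flat.shape)

# The levels can be rebuilt from the tuples, but the names must be
# supplied again by hand because the array does not carry them.
rebuilt = pd.MultiIndex.from_tuples(list(flat), names=["i", "n"])
print(rebuilt.equals(mi))
```

Once only the tuple array is left, nothing in it identifies a level called 'n', which matches the error shown in the example.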

What you expected to happen:

I expect a "Non-dimension coordinate" to preserve the multi-index as well.

Minimal Complete Verifiable Example:

>>> import xarray as xr
>>> import numpy as np
>>> import pandas as pd

# create a DataArray named 'arr' with a single dimension 'z'
>>> arr = xr.DataArray(np.r_[:6],{},'z')
# add the coordinate 'z1' along dimension 'z'
# at this point 'z1' is a "Non-dimension coordinate"
>>> arr.coords['z1'] = 'z',pd.MultiIndex.from_product(
    ([1,2,3],['a','b']),names=['i','n'] )
# make 'z1' the index of 'z' (set_index returns a new object, so reassign)
>>> arr = arr.set_index(z='z1')
>>> arr.sel(n='a') # fails to select by level 'n' of the multi-index on 'z'
ValueError: dimensions or multi-index levels ['n'] do not exist
# let's see what the coordinate 'z' looks like
>>> arr.z
<xarray.DataArray 'z' (z: 6)>
array([(1, 'a'), (1, 'b'), (2, 'a'), (2, 'b'), (3, 'a'), (3, 'b')],
      dtype=object)
Coordinates:
  * z        (z) object (1, 'a') (1, 'b') (2, 'a') (2, 'b') (3, 'a') (3, 'b')
# why is 'z' not a MultiIndex?

# now make 'z' a "Dimension coordinate"
# by assigning the MultiIndex to 'z' directly
>>> arr.coords['z'] = pd.MultiIndex.from_product(
    ([1,2,3],['a','b']),names=['i','n'] )
>>> arr.sel(n='a') # OK
<xarray.DataArray (i: 3)>
array([0, 2, 4])
Coordinates:
  * i        (i) int64 1 2 3
# let's see what the coordinate 'z' looks like
>>> arr.z
<xarray.DataArray 'z' (z: 6)>
array([(1, 'a'), (1, 'b'), (2, 'a'), (2, 'b'), (3, 'a'), (3, 'b')],
      dtype=object)
Coordinates:
  * z        (z) MultiIndex
  - i        (z) int64 1 1 2 2 3 3
  - n        (z) object 'a' 'b' 'a' 'b' 'a' 'b'
# 'z' has been set to a MultiIndex successfully

Anything else we need to know?:

Environment:

Output of xr.show_versions()

INSTALLED VERSIONS

commit: None
python: 3.8.2 (default, Mar 25 2020, 17:03:02)
[GCC 7.3.0]
python-bits: 64
OS: Linux
OS-release: 5.10.27-gentoo-dist
machine: x86_64
processor:
byteorder: little
LC_ALL: None
LANG: C.UTF-8
LOCALE: en_US.UTF-8
libhdf5: 1.10.4
libnetcdf: None

xarray: 0.17.0
pandas: 1.2.4
numpy: 1.20.2
scipy: 1.6.2
netCDF4: None
pydap: None
h5netcdf: None
h5py: 2.10.0
Nio: None
zarr: None
cftime: 1.4.1
nc_time_axis: None
PseudoNetCDF: None
rasterio: None
cfgrib: 0.9.8.5
iris: None
bottleneck: None
dask: 2021.04.0
distributed: 2021.04.0
matplotlib: 3.4.1
cartopy: 0.18.0
seaborn: 0.11.1
numbagg: None
pint: None
setuptools: 49.6.0.post20210108
pip: 21.0.1
conda: 4.10.1
pytest: None
IPython: 7.22.0
sphinx: None
