Fail to sel() when index comes from categorical pandas Series #3669

Closed
@mancellin

Dear xarray team,

Thank you very much for your work on this useful package.
Here is a bug I just found in my code.

MCVE Code Sample

Creating a Dataset from pandas when the coordinate is a categorical series:

import pandas as pd

ind = pd.Series(['foo', 'bar'], dtype='category')
df = pd.DataFrame({'ind': ind, 'values': [1, 2]})
df = df.set_index('ind')

ds = df.to_xarray()

print(ds.sel(ind='foo'))

Expected Output

When ind is not categorical, the code returns the expected output:

<xarray.Dataset>
Dimensions:  ()
Coordinates:
    ind      <U3 'foo'
Data variables:
    values   int64 1
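
For comparison, here is a minimal sketch of the non-categorical case that produces the output above. It is the same MCVE with the `dtype='category'` argument dropped; the variable name `selected` is only introduced here for illustration.

```python
import pandas as pd

# Same data as the MCVE, but the index is a plain (non-categorical) string Series.
ind = pd.Series(['foo', 'bar'])
df = pd.DataFrame({'ind': ind, 'values': [1, 2]})
df = df.set_index('ind')

# DataFrame.to_xarray() needs xarray to be installed.
ds = df.to_xarray()

# Label-based selection on the 'ind' coordinate.
selected = ds.sel(ind='foo')
print(selected)
```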

Problem Description

When ind is categorical, it fails and gives the following traceback:

Traceback (most recent call last):
  File "/home/matthieu/foo.py", line 9, in <module>
    print(ds.sel(ind='foo'))
  File "/opt/anaconda3/envs/issue/lib/python3.8/site-packages/xarray/core/dataset.py", line 2013, in sel
    pos_indexers, new_indexes = remap_label_indexers(
  File "/opt/anaconda3/envs/issue/lib/python3.8/site-packages/xarray/core/coordinates.py", line 391, in remap_label_indexers
    pos_indexers, new_indexes = indexing.remap_label_indexers(
  File "/opt/anaconda3/envs/issue/lib/python3.8/site-packages/xarray/core/indexing.py", line 260, in remap_label_indexers
    idxr, new_idx = convert_label_indexer(index, label, dim, method, tolerance)
  File "/opt/anaconda3/envs/issue/lib/python3.8/site-packages/xarray/core/indexing.py", line 179, in convert_label_indexer
    indexer = index.get_loc(
TypeError: get_loc() got an unexpected keyword argument 'tolerance'
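
The traceback suggests that the categorical index's `get_loc()` does not accept the `tolerance` keyword that xarray passes. One possible workaround, sketched here under the assumption that losing the categorical dtype is acceptable, is to cast the index to a plain object-dtype index before converting. This is only an illustration, not a fix in xarray itself.

```python
import pandas as pd

ind = pd.Series(['foo', 'bar'], dtype='category')
df = pd.DataFrame({'ind': ind, 'values': [1, 2]})
df = df.set_index('ind')

# Assumption: the categorical dtype is not needed downstream.
# Cast the CategoricalIndex to a plain object-dtype Index (the name 'ind' is kept).
df.index = df.index.astype(object)

ds = df.to_xarray()

# Selection now goes through a regular Index instead of a CategoricalIndex.
selected = ds.sel(ind='foo')
print(selected)
```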

Output of xr.show_versions()

commit: None
python: 3.8.0 (default, Nov 6 2019, 21:49:08) [GCC 7.3.0]
python-bits: 64
OS: Linux
OS-release: 4.19.91-1-MANJARO
machine: x86_64
processor:
byteorder: little
LC_ALL: None
LANG: fr_FR.UTF-8
LOCALE: fr_FR.UTF-8
libhdf5: None
libnetcdf: None

xarray: 0.14.1
pandas: 0.25.3
numpy: 1.17.4
scipy: None
netCDF4: None
pydap: None
h5netcdf: None
h5py: None
Nio: None
zarr: None
cftime: None
nc_time_axis: None
PseudoNetCDF: None
rasterio: None
cfgrib: None
iris: None
bottleneck: None
dask: None
distributed: None
matplotlib: None
cartopy: None
seaborn: None
numbagg: None
setuptools: 44.0.0.post20200106
pip: 19.3.1
conda: None
pytest: None
IPython: None
sphinx: None
