Description
What happened?
First, apologies if this is not actually a bug - I'm not sure what the intended behaviour should be, but I find this counter-intuitive.
When selecting from a DataArray that is indexed by a RangeIndex, the resulting index is an Int64Index:
my_da.get_index('time')
>>> RangeIndex(start=0, stop=100, step=1, name='time')
a = my_da.sel({'time': pd.RangeIndex(0,2)})
a.get_index('time')
>>> Int64Index([0, 1], dtype='int64', name='time')
Setting the index back to the desired RangeIndex using assign_coords() then works, but I find it a bit problematic that sel() returns an Int64Index even when called with a RangeIndex, especially since Int64Index has recently been deprecated in pandas 1.4.
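For reference, here is a minimal sketch of the workaround I mean (assuming assign_coords() keeps the RangeIndex as-is, which is what I observe):
import numpy as np
import pandas as pd
import xarray as xr

my_da = xr.DataArray(np.random.rand(100),
                     dims=('time',),
                     coords={'time': pd.RangeIndex(0, 100)})

# sel() with a RangeIndex gives back an Int64Index-backed coordinate
a = my_da.sel({'time': pd.RangeIndex(0, 2)})

# Workaround: reassign the coordinate with the desired RangeIndex
a = a.assign_coords(time=pd.RangeIndex(0, 2))
print(a.get_index('time'))  # prints a RangeIndex again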
What did you expect to happen?
I would have expected the resulting DataArray to be indexed with the same RangeIndex used in sel().
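Concretely, with the example above I would expect something like:
a.get_index('time')
>>> RangeIndex(start=0, stop=2, step=1, name='time')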
Minimal Complete Verifiable Example
import xarray as xr
import numpy as np
import pandas as pd
my_da = xr.DataArray(np.random.rand(100,),
                     dims=('time'),
                     coords={'time': pd.RangeIndex(0, 100)})
print(my_da.get_index('time'))
a = my_da.sel({'time': pd.RangeIndex(0, 2)})
print(a.get_index('time'))
Relevant log output
RangeIndex(start=0, stop=100, step=1, name='time')
Int64Index([0, 1], dtype='int64', name='time')
Anything else we need to know?
No response
Environment
INSTALLED VERSIONS
commit: None
python: 3.8.5 (default, Sep 4 2020, 02:22:02)
[Clang 10.0.0 ]
python-bits: 64
OS: Darwin
OS-release: 20.6.0
machine: x86_64
processor: i386
byteorder: little
LC_ALL: None
LANG: None
LOCALE: (None, 'UTF-8')
libhdf5: None
libnetcdf: None
xarray: 0.20.2
pandas: 1.4.0
numpy: 1.22.1
scipy: 1.7.3
netCDF4: None
pydap: None
h5netcdf: None
h5py: None
Nio: None
zarr: None
cftime: None
nc_time_axis: None
PseudoNetCDF: None
rasterio: None
cfgrib: None
iris: None
bottleneck: None
dask: None
distributed: None
matplotlib: 3.5.1
cartopy: None
seaborn: None
numbagg: None
fsspec: 2021.11.1
cupy: None
pint: None
sparse: None
setuptools: 59.5.0
pip: 21.3.1
conda: None
pytest: 6.2.5
IPython: 8.0.1
sphinx: 4.3.2