Closed
Description
It looks like the _index_data attribute wasn't added to CategoricalIndex or IntervalIndex. This leads to some regressions with GroupBy.apply.
Not sure what the fix should be, as there isn't a singular array that directly backs the underlying values for CategoricalIndex or IntervalIndex, so the approach used to add _index_data for other types doesn't look like it will work here.
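To illustrate the point about there being no single backing array, here is a minimal sketch that uses only public attributes: a CategoricalIndex exposes integer codes plus a separate categories index, and an IntervalIndex exposes separate left and right endpoint indexes. The variable names are illustrative, and this shows the public components only, not how pandas stores them internally.

```python
import pandas as pd

# A CategoricalIndex is described by two pieces: integer codes and the categories.
cat_index = pd.CategoricalIndex(list("abc"))
print(cat_index.codes)       # position of each value within the categories
print(cat_index.categories)  # the distinct category labels

# An IntervalIndex is described by two pieces as well: left and right endpoints.
interval_index = pd.interval_range(start=0, end=3)
print(interval_index.left)   # left endpoint of each interval
print(interval_index.right)  # right endpoint of each interval
```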
Example of broken behavior on master:
In [1]: import pandas as pd; pd.__version__
Out[1]: '1.0.0rc0+146.g6f395ad42'
In [2]: index = pd.CategoricalIndex(list("abc"))
...: df = pd.DataFrame({"group": [0, 0, 1], "value": [1, 2, 3]}, index=index)
In [3]: df.groupby("group").apply(lambda x: x)
---------------------------------------------------------------------------
AttributeError: 'CategoricalIndex' object has no attribute '_index_data'
Same behavior worked on 0.25.3:
In [1]: import pandas as pd; pd.__version__
Out[1]: '0.25.3'
In [2]: index = pd.CategoricalIndex(list("abc"))
...: df = pd.DataFrame({"group": [0, 0, 1], "value": [1, 2, 3]}, index=index)
In [3]: df.groupby("group").apply(lambda x: x)
Out[3]:
   group  value
a      0      1
b      0      2
c      1      3
Swapping CategoricalIndex with IntervalIndex results in the same behavior as above.
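A regression check for both index types might look roughly like the sketch below. The helper name roundtrip_through_apply is hypothetical. Two deviations from the session above are assumptions made so the check behaves the same across pandas versions: group_keys=False is passed, and only the "value" column is selected before calling apply.

```python
import pandas as pd


def roundtrip_through_apply(index):
    """Hypothetical helper: run an identity function through GroupBy.apply."""
    df = pd.DataFrame({"group": [0, 0, 1], "value": [1, 2, 3]}, index=index)
    # Assumption: selecting [["value"]] and passing group_keys=False keeps the
    # result shape stable across pandas versions; the original report used
    # df.groupby("group").apply(lambda x: x) directly.
    grouped = df.groupby("group", group_keys=False)[["value"]]
    return grouped.apply(lambda x: x)


for index in (pd.CategoricalIndex(list("abc")), pd.interval_range(start=0, end=3)):
    result = roundtrip_through_apply(index)
    # The identity function should return the values and index labels unchanged.
    assert list(result["value"]) == [1, 2, 3]
    assert list(result.index) == list(index)
```

On a build affected by this issue, the apply call would be expected to raise the AttributeError shown above rather than reach the assertions.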