REGR: CategoricalIndex and IntervalIndex are missing _index_data attribute #31248

Closed
@jschendel

It looks like the _index_data attribute wasn't added to CategoricalIndex or IntervalIndex. This leads to some regressions with GroupBy.apply.

Not sure what the fix should be: there isn't a single array that directly backs the underlying values of a CategoricalIndex or IntervalIndex, so the approach used to add _index_data for other index types doesn't look like it will work here.
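To illustrate the point about there being no single backing array, here is a small sketch using only public accessors. It is an illustration of the two-component nature of these indexes, not a description of how pandas stores them internally, and the `from_breaks` construction is an assumption chosen for brevity.

```python
import pandas as pd

# A CategoricalIndex is described by two pieces: integer codes and categories.
cat_index = pd.CategoricalIndex(list("abc"))
print(cat_index.codes)       # integer position of each value within the categories
print(cat_index.categories)  # the distinct category labels

# An IntervalIndex is likewise described by two pieces: left and right bounds.
# Assumption: built with from_breaks purely for illustration.
interval_index = pd.IntervalIndex.from_breaks([0, 1, 2, 3])
print(interval_index.left)   # left endpoint of each interval
print(interval_index.right)  # right endpoint of each interval
```

In both cases the values are reconstructed from a pair of arrays, which is why a single `_index_data` array has no obvious counterpart.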

Example of broken behavior on master:

In [1]: import pandas as pd; pd.__version__
Out[1]: '1.0.0rc0+146.g6f395ad42'

In [2]: index = pd.CategoricalIndex(list("abc"))  
   ...: df = pd.DataFrame({"group": [0, 0, 1], "value": [1, 2, 3]}, index=index)

In [3]: df.groupby("group").apply(lambda x: x)
---------------------------------------------------------------------------
AttributeError: 'CategoricalIndex' object has no attribute '_index_data'

Same behavior worked on 0.25.3:

In [1]: import pandas as pd; pd.__version__
Out[1]: '0.25.3'

In [2]: index = pd.CategoricalIndex(list("abc"))  
   ...: df = pd.DataFrame({"group": [0, 0, 1], "value": [1, 2, 3]}, index=index)

In [3]: df.groupby("group").apply(lambda x: x)
Out[3]: 
   group  value
a      0      1
b      0      2
c      1      3

Replacing the CategoricalIndex with an IntervalIndex produces the same AttributeError on master, and the same working output on 0.25.3.
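For anyone who wants to check both index types in one go, a minimal script along these lines may help. The helper names `build_frame` and `identity_apply_ok` are hypothetical, and the IntervalIndex is built with `from_breaks` as an assumption, since the report does not show how it was constructed.

```python
import pandas as pd


def build_frame(index):
    """Build the same three-row frame as in the report, on the given index."""
    return pd.DataFrame({"group": [0, 0, 1], "value": [1, 2, 3]}, index=index)


def identity_apply_ok(df):
    """Return True if the identity GroupBy.apply runs, False on AttributeError."""
    try:
        df.groupby("group").apply(lambda x: x)
    except AttributeError:
        # On the affected build this is where the missing '_index_data' surfaces.
        return False
    return True


indexes = [
    pd.CategoricalIndex(list("abc")),
    # Assumption: any three-element IntervalIndex should do for the check.
    pd.IntervalIndex.from_breaks([0, 1, 2, 3]),
]

results = {type(idx).__name__: identity_apply_ok(build_frame(idx)) for idx in indexes}
print(pd.__version__, results)
```

A `False` entry for either index type would indicate the regression is present in the installed version.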

Labels: Categorical (Categorical Data Type), Groupby, Interval (Interval data type), Regression (Functionality that used to work in a prior pandas version)
