ENH: groupby.apply for Categorical groupers should preserve categories (like .agg) #10138

Closed
@jreback

From Stack Overflow:

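import numpy as np
import pandas as pd

# 'missing' declares category 'b' but never uses it; 'dense' uses every category
missing = pd.Categorical(list('aaa'), categories=['a', 'b'])
dense = pd.Categorical(list('abc'))
values = np.arange(len(dense))
df = pd.DataFrame({'missing': missing, 'dense': dense, 'values': values})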

grouped = df.groupby(['missing', 'dense'])

# does reindex output for missing categories
grouped.mean()
grouped.agg(np.mean)

# does not reindex the output for the missing categories
grouped.apply(lambda chunk: np.mean(chunk))

So _wrap_applied_output needs a call to _reindex_output as a post-processing step.
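
Until apply does this itself, the same effect can be approximated from user code. The following is only a sketch using public pandas API, not the internal _reindex_output fix proposed above: it assumes the applied function returns one scalar per group, and the names applied, full_index, and reindexed are illustrative. observed=True is passed explicitly so the starting point contains only the category combinations present in the data, whatever the default is in the installed pandas version.

```python
import numpy as np
import pandas as pd

missing = pd.Categorical(list("aaa"), categories=["a", "b"])
dense = pd.Categorical(list("abc"))
df = pd.DataFrame({"missing": missing, "dense": dense, "values": np.arange(3)})

# observed=True keeps only the category combinations that occur in the data.
grouped = df.groupby(["missing", "dense"], observed=True)
applied = grouped["values"].apply(lambda chunk: chunk.mean())

# Build the full Cartesian product of the declared categories ...
full_index = pd.MultiIndex.from_product(
    [df["missing"].cat.categories, df["dense"].cat.categories],
    names=["missing", "dense"],
)

# ... and reindex so unobserved combinations appear as NaN rows.
reindexed = applied.reindex(full_index)
print(reindexed)
```

The reindexed result has one row per combination of declared categories, with NaN for combinations that never occur, which is the shape the issue asks apply to produce.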
