BUG: Inconsistency with grouping columns in agg #50383

Closed
@rhshadrach

Related to #46944

Users can currently request that grouping columns be included in the computation for various ops by selecting them via __getitem__. However, agg still excludes these columns.

import pandas as pd

df = pd.DataFrame({'a': [1, 1, 2], 'b': 3, 'c': 4, 'd': 5})
gb = df.groupby(['a', 'b'])[['a', 'c']]

result = gb.sum()
print(result)
#      a  c
# a b      
# 1 3  2  8
# 2 3  2  4

result2 = gb.agg(lambda x: x.sum())
print(result2)
#      c
# a b   
# 1 3  8
# 2 3  4

I would expect __getitem__ to only subset columns for groupby rather than being able to add additional (grouping) columns.
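
Until this is resolved, one possible workaround when the grouping column's values really are needed in the aggregation is to aggregate a copy of the column under a different name. This is only a sketch: the column name a_copy is hypothetical and not part of pandas, and the approach assumes that a non-grouping column is treated the same way by sum and agg.

```python
import pandas as pd

df = pd.DataFrame({'a': [1, 1, 2], 'b': 3, 'c': 4, 'd': 5})

# Copy the grouping column under a hypothetical name so that it is an
# ordinary (non-grouping) column from the groupby's point of view.
df2 = df.assign(a_copy=df['a'])
gb = df2.groupby(['a', 'b'])[['a_copy', 'c']]

via_sum = gb.sum()
via_agg = gb.agg(lambda x: x.sum())

print(via_sum)
print(via_agg)
```

Because a_copy is not a grouping key, both calls should return the same two columns, which avoids relying on how __getitem__ handles grouping columns.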
