DOC: df.groupby('A') is just syntactic sugar for df.groupby(df['A']) #51063

Closed
@Sirmadeira

Pandas version checks

  • I have checked that the issue still exists on the latest versions of the docs on main here

Location of the documentation

https://pandas.pydata.org/docs/user_guide/groupby.html

Specifically, the line:

df.groupby('A') is just syntactic sugar for df.groupby(df['A']).

A list of any of the above things.

Documentation problem

Here is an example showing that, I think, it is not just syntactic sugar:

import pandas as pd

test_df = pd.DataFrame({
    'Category': {
        0: 'product-availability address-confirmation input',
        1: 'registration register-data-confirmation options',
        2: 'onboarding return-start input',
        3: 'registration register-data-confirmation input',
        4: 'decision-tree first-interaction-validation options',
    },
    'Original_UserId': {
        0: '5511949551865@wa.gw.msging.net',
        1: '5511949551865@wa.gw.msging.net',
        2: '5511949551865@wa.gw.msging.net',
        3: '5511949551865@wa.gw.msging.net',
        4: '5511949551865@wa.gw.msging.net',
    },
})

If I run
test_df['Category'].eq('onboarding return-start input').groupby(test_df['Original_UserId']).cummax()

This gives a result

If I run

test_df['Category'].eq('onboarding return-start input').groupby('Original_UserId').cummax()
I get a KeyError.

I am guessing the KeyError comes from the check performed on the object being grouped, namely whether that object contains the given column or not.
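A smaller reproduction may make the difference easier to see. The sketch below uses a made-up frame (the column names `Category` and `UserId` and the values are illustrative, not taken from the data above) and assumes a recent pandas version. It compares the two spellings on a DataFrame, then on the boolean Series returned by `.eq()`.

```python
import pandas as pd

# Hypothetical toy data, only for illustration.
df = pd.DataFrame({
    "Category": ["a", "b", "a", "b"],
    "UserId": ["u1", "u1", "u2", "u2"],
})

# On the DataFrame itself, the label and the Series produce the same groups.
by_label = df.groupby("UserId")["Category"].count()
by_series = df.groupby(df["UserId"])["Category"].count()
same_on_frame = by_label.equals(by_series)

# .eq() returns a boolean Series; it no longer carries the "UserId" column.
mask = df["Category"].eq("b")

# Passing the Series works because the rows are aligned on the index.
flags = mask.groupby(df["UserId"]).cummax().tolist()

# Passing the label fails: the Series has no column or index level
# named "UserId" to look up.
try:
    mask.groupby("UserId")
    label_lookup_failed = False
except KeyError:
    label_lookup_failed = True

print(same_on_frame, flags, label_lookup_failed)
```

So the two forms appear interchangeable only when the label can be resolved on the object being grouped. A Series passed as the key is matched by index and does not need to belong to that object.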

Suggested fix for documentation

I am not sure; maybe just add that the difference is that the string form checks whether the grouped object contains a column with that name, while passing a Series does not.
