ENH: Allow to set axis name in pd.concat #56553

Open
1 of 3 tasks
mullimanko opened this issue Dec 18, 2023 · 1 comment
Labels
Enhancement Needs Discussion Requires discussion from core team before further action Reshaping Concat, Merge/Join, Stack/Unstack, Explode

Comments

mullimanko commented Dec 18, 2023

Feature Type

  • Adding new functionality to pandas

  • Changing existing functionality in pandas

  • Removing existing functionality in pandas

Problem Description

pd.concat() has a names parameter, which works fine when it is used together with the keys parameter. For example:

import pandas as pd

df = pd.DataFrame([[1, 2, 3],
                   [10, 10, 10],
                   ], columns=["A", "B", "C"]
                  ).rename_axis("class")

pd.concat([df, df.agg(["sum"])],
            keys=["a_key", "b_key"],
            names=["foo", "class"]
            )

Output:

              A   B   C
foo   class            
a_key 0       1   2   3
      1      10  10  10
b_key sum    11  12  13

The name of index level 1 is now "class" (same as it was before).
However, using the names parameter alone has no effect: no error is raised and the name of the index is not changed. This is more or less consistent with the docs, which describe the names parameter as:

Names for the levels in the resulting hierarchical index.
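
A minimal way to check the behaviour described above, reusing the same df. This is only a sketch of the report, not a statement about every pandas version: at the time of writing, names without keys is reported to be silently ignored.

```python
import pandas as pd

df = pd.DataFrame(
    [[1, 2, 3], [10, 10, 10]], columns=["A", "B", "C"]
).rename_axis("class")

# names is passed without keys, so no hierarchical index is built.
result = pd.concat([df, df.agg(["sum"])], names=["class"])

# No error is raised; the rows are concatenated as usual.
print(result)

# Inspect the index name to see whether names had any effect.
print(result.index.name)
```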

Feature Description

I would find it more satisfying if names could also be used with a single index, just to set the name of the resulting index (i.e. rename the index axis).

pd.concat([df, df.agg(["sum"])], names=["class"])

Expected output:

class  A   B   C
0       1   2   3
1      10  10  10
sum    11  12  13

My suggestion is that using the names parameter without the keys parameter would set the name of the resulting axis (if a single index is returned). On a side note, I don't know which form is usually used as the argument in such a case, names=["class"] or names="class"; you probably know better than I do.

Alternative Solutions

The following example works right now, but only because the name of the index in the first object of pd.concat([df, df.agg(["sum"])]) is already "class":
pd.concat([df, df.agg(["sum"]).rename_axis("class")])

Otherwise one would need to use:
pd.concat([df, df.agg(["sum"])]).rename_axis("class")

The proposed solution seems to be more readable:
pd.concat([df, df.agg(["sum"])], names=["class"])
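
Until something like this is available, the rename_axis workaround can be wrapped in a small helper. This is only a sketch: concat_with_axis_name is a hypothetical name and not part of pandas, and it only targets the single-index case discussed here.

```python
import pandas as pd


def concat_with_axis_name(objs, name, **kwargs):
    """Hypothetical helper (not a pandas API): concat, then name the index.

    Emulates the proposed names-without-keys behaviour by applying the
    existing rename_axis workaround to the concatenated result.
    """
    result = pd.concat(objs, **kwargs)
    # rename_axis returns a new object; the inputs are left untouched.
    return result.rename_axis(name)


df = pd.DataFrame(
    [[1, 2, 3], [10, 10, 10]], columns=["A", "B", "C"]
).rename_axis("class")

named = concat_with_axis_name([df, df.agg(["sum"])], "class")
print(named)
```

The helper takes a single string rather than a list, which sidesteps the names=["class"] versus names="class" question for the single-index case.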

Additional Context

If this were implemented, the docs for the names parameter would need to be adapted, e.g. to something like:
Names for the levels in the resulting single index or MultiIndex.

@mullimanko mullimanko added Enhancement Needs Triage Issue that has not been reviewed by a pandas team member labels Dec 18, 2023
rhshadrach (Member) commented

Thanks for the request, this seems reasonable to me. At least I don't see a reason why it'd be beneficial to ignore the names argument in the case of a non-MultiIndex.

@rhshadrach rhshadrach added Reshaping Concat, Merge/Join, Stack/Unstack, Explode Needs Discussion Requires discussion from core team before further action and removed Needs Triage Issue that has not been reviewed by a pandas team member labels Dec 26, 2023