BUG: pivot_table downcasting dtypes even if not necessary #47971

Closed
@phofl

Pandas version checks

  • I have checked that this issue has not already been reported.

  • I have confirmed this bug exists on the latest version of pandas.

  • I have confirmed this bug exists on the main branch of pandas.

Reproducible Example

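import pandas as pd

df = pd.DataFrame({"x": "a", "y": "b", "age": [20, 40]})

# dropna=True: the mean is downcast, so the result has int64 dtype
result = df.pivot_table(
    index='x', columns='y', values='age', aggfunc='mean', dropna=True
)

# dropna=False: no downcast is attempted, so the result has float64 dtype
result = df.pivot_table(
    index='x', columns='y', values='age', aggfunc='mean', dropna=False
)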

Issue Description

With dropna=True this returns int64 dtype, while we get float64 with dropna=False. This happens because we try to downcast whenever dropna is set: the all-NaN rows that get dropped had cast our dtypes to float, and the downcast is meant to undo that.

But the downcast path is also hit when there are no all-NaN rows, in which case the aggregation function returned the correct dtype all along.
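The two situations described above can be told apart with plain groupby calls. The following is a minimal sketch, not the pivot_table internals: it assumes a categorical grouper with an unobserved category ("b") as the source of the NaN row, which is only one hypothetical way such rows can arise.

```python
import pandas as pd

# Integer column grouped by a categorical that has an unobserved category "b".
df = pd.DataFrame(
    {
        "x": pd.Categorical(["a", "a"], categories=["a", "b"]),
        "age": [20, 40],
    }
)

# Situation 1: "max" would preserve int64, but the empty group "b" needs a
# NaN, so the whole result is upcast to float64.
with_nan = df.groupby("x", observed=False)["age"].max()
print(with_nan.dtype)

# Dropping the NaN row leaves a float64 result whose values are all whole
# numbers. This is the case a downcast back to the input dtype is meant for.
dropped = with_nan.dropna()
print(dropped.dtype)

# Situation 2: "mean" returns float64 by definition, with no NaN involved.
# Downcasting here changes a dtype that was already correct.
mean_agg = df.groupby("x", observed=True)["age"].mean()
print(mean_agg.dtype)
```

In the first situation the float dtype is a side effect of the NaN; in the second it is the aggregation's own result type, which is the case this issue is about.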

Expected Behavior

I think both cases should be consistent if no NaNs are dropped, i.e. we should not try to downcast.

If we want to make this change, we should probably deprecate the current behavior or change it in 2.0, but not in a minor release.
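A regression check for the expected behavior could look roughly like the sketch below. The helper name `pivot_mean_dtypes` is hypothetical, and whether the two results agree depends on the installed pandas version, so the script only reports the comparison.

```python
import pandas as pd


def pivot_mean_dtypes(dropna: bool) -> list[str]:
    """Return the column dtypes of the pivot from the example, as strings."""
    df = pd.DataFrame({"x": "a", "y": "b", "age": [20, 40]})
    result = df.pivot_table(
        index="x", columns="y", values="age", aggfunc="mean", dropna=dropna
    )
    return [str(dtype) for dtype in result.dtypes]


dtypes_dropna = pivot_mean_dtypes(dropna=True)
dtypes_keep = pivot_mean_dtypes(dropna=False)

# No NaN rows exist in this frame, so the expectation in this issue is that
# both calls report the same dtype.
print("dropna=True :", dtypes_dropna)
print("dropna=False:", dtypes_keep)
print("consistent  :", dtypes_dropna == dtypes_keep)
```

In a real test suite the final comparison would become an assertion once the intended behavior is settled.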

Installed Versions

main

Metadata

Labels

Bug; Needs Tests (Unit test(s) needed to prevent regressions); Reshaping (Concat, Merge/Join, Stack/Unstack, Explode); good first issue
