BUG: API Inconsistency between numeric_only and select_dtypes(["number"]) #58210

Open
@WillAyd

### Pandas version checks

  • I have checked that this issue has not already been reported.

  • I have confirmed this bug exists on the latest version of pandas.

  • I have confirmed this bug exists on the main branch of pandas.

### Reproducible Example

In [25]: df = pd.DataFrame({"bool": [True, False, True], "int": [1, 2, 3]})

In [26]: df.sum(numeric_only=True)
Out[26]: 
bool    2
int     6
dtype: int64

In [27]: df.select_dtypes(include=["number"]).sum()
Out[27]: 
int    6
dtype: int64


### Issue Description

`df.sum(numeric_only=True)` includes boolean columns in the reduction, whereas `df.select_dtypes(include=["number"])` drops them, so the two ways of restricting a frame to "numeric" data disagree.
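To make the mismatch easier to check on other versions, here is a minimal sketch that compares the column sets chosen by the two code paths. It reuses the frame from the reproducible example; the variable names are illustrative only, and the result may differ on versions where this behavior has changed.

```python
import pandas as pd

df = pd.DataFrame({"bool": [True, False, True], "int": [1, 2, 3]})

# Columns that survive the numeric_only=True reduction
cols_numeric_only = list(df.sum(numeric_only=True).index)

# Columns selected via the "number" dtype selector
cols_select = list(df.select_dtypes(include=["number"]).columns)

# Columns treated as numeric by one path but not the other
mismatch = sorted(set(cols_numeric_only) ^ set(cols_select))
print(mismatch)
```

On the version reported here, the only column in `mismatch` is the boolean one. Passing `include=["number", "bool"]` to `select_dtypes` is one way to reproduce the `numeric_only=True` selection explicitly.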

### Expected Behavior

Judging by the NumPy scalar type hierarchy, the latter is more correct: `np.bool_` is not a subclass of `np.number`.

https://numpy.org/doc/stable/reference/arrays.scalars.html
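The hierarchy claim can be checked directly with `np.issubdtype`. This is a small sketch using NumPy only; it says nothing about how pandas implements `numeric_only` internally.

```python
import numpy as np

# np.bool_ sits directly under np.generic, outside the np.number branch
bool_is_number = np.issubdtype(np.bool_, np.number)
int_is_number = np.issubdtype(np.int64, np.number)
float_is_number = np.issubdtype(np.float64, np.number)

print(bool_is_number)   # False
print(int_is_number)    # True
print(float_is_number)  # True
```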

### Installed Versions

'3.0.0.dev0+681.g434fda08cf'
