DEPR: dropping nuisance columns in DataFrame reductions #41480

Merged
merged 13 commits into from
May 21, 2021
whatsnew
jbrockmendel committed May 17, 2021
commit fd03e6b0d16b9da84640af8268c9c63d7d6b7845
39 changes: 39 additions & 0 deletions doc/source/whatsnew/v1.3.0.rst
@@ -649,6 +649,45 @@ Deprecations
- Deprecated behavior of :meth:`DatetimeIndex.union` with mixed timezones; in a future version both will be cast to UTC instead of object dtype (:issue:`39328`)
- Deprecated using ``usecols`` with out of bounds indices for ``read_csv`` with ``engine="c"`` (:issue:`25623`)

.. _whatsnew_130.deprecations.nuisance_columns:

Deprecated Dropping Nuisance Columns in DataFrame Reductions and DataFrameGroupBy Operations
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
When calling a reduction (.min, .max, .sum, ...) on a :class:`DataFrame` with
Review comment (Contributor), suggested change:

When calling a reduction (.min, .max, .sum, ...) on a :class:`DataFrame` with

to:

The default of calling a reduction (.min, .max, .sum, ...) on a :class:`DataFrame` with
``numeric_only=None`` will silently ignore and drop from the result nuisance columns, e.g. a string column in a .mean() reduction.

``numeric_only=None`` (the default), columns on which the reduction raises ``TypeError``
are silently ignored and dropped from the result. This behavior is deprecated.
Review comment (Contributor):

Start a new paragraph with 'This behavior is deprecated'

In a future version, the ``TypeError`` will be raised, and users will need to
select only valid columns before calling the function.

For example:

.. ipython:: python

    df = pd.DataFrame({"A": [1, 2, 3, 4], "B": pd.date_range("2016-01-01", periods=4)})

*Old behavior*:

.. code-block:: ipython

    In [3]: df.prod()
    Out[3]:
    A    24
    dtype: int64

*Future behavior*:

.. code-block:: ipython

    In [4]: df.prod()
    ...
    TypeError: 'DatetimeArray' does not implement reduction 'prod'

    In [5]: df[["A"]].prod()
    Out[5]:
    A    24
    dtype: int64
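
The following is a minimal sketch of how calling code might select valid columns explicitly, so that it does not rely on the deprecated silent dropping. It reuses the ``df`` from the example above. Selecting by dtype with ``select_dtypes`` and passing ``numeric_only=True`` are not part of this whatsnew entry; they are shown here only as possible alternatives to selecting columns by name, and the variable names are illustrative.

```python
import pandas as pd

# Same frame as in the whatsnew example: one integer column, one datetime column.
df = pd.DataFrame({"A": [1, 2, 3, 4], "B": pd.date_range("2016-01-01", periods=4)})

# Option 1 (from the whatsnew entry): select the valid columns by name.
by_name = df[["A"]].prod()

# Option 2 (assumption, not in the entry): select numeric columns by dtype.
by_dtype = df.select_dtypes(include="number").prod()

# Option 3 (assumption, not in the entry): ask the reduction itself to
# consider numeric columns only, instead of relying on numeric_only=None.
by_flag = df.prod(numeric_only=True)

print(by_name)
print(by_dtype)
print(by_flag)
```

All three calls reduce only column ``A``, so none of them depends on the datetime column ``B`` being dropped implicitly.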

.. ---------------------------------------------------------------------------