DOC: fix PR02 errors in docstring for pandas.core.groupby.SeriesGroupBy.apply #57288

Merged
merged 5 commits into from
Feb 8, 2024
Changes from 3 commits
1 change: 0 additions & 1 deletion ci/code_checks.sh
@@ -136,7 +136,6 @@ if [[ -z "$CHECK" || "$CHECK" == "docstrings" ]]; then
pandas.Timedelta.resolution\
pandas.Interval\
pandas.Grouper\
pandas.core.groupby.SeriesGroupBy.apply\
pandas.core.groupby.DataFrameGroupBy.nth\
pandas.core.groupby.DataFrameGroupBy.rolling\
pandas.core.groupby.SeriesGroupBy.nth\
112 changes: 106 additions & 6 deletions pandas/core/groupby/generic.py
@@ -72,7 +72,6 @@
GroupByPlot,
_agg_template_frame,
_agg_template_series,
_apply_docs,
_transform_template,
)
from pandas.core.indexes.api import (
@@ -214,12 +213,113 @@ def _get_data_to_aggregate(
"""
)

@Appender(
_apply_docs["template"].format(
input="series", examples=_apply_docs["series_examples"]
)
)
def apply(self, func, *args, **kwargs) -> Series:
"""
Apply function ``func`` group-wise and combine the results together.

The function passed to ``apply`` must take a series as its first
argument and return a DataFrame, Series or scalar. ``apply`` will
then take care of combining the results back together into a single
dataframe or series. ``apply`` is therefore a highly flexible
grouping method.

While ``apply`` is a very flexible method, its downside is that
using it can be quite a bit slower than using more specific methods
like ``agg`` or ``transform``. Pandas offers a wide range of methods that will
be much faster than using ``apply`` for their specific purposes, so try to
use them before reaching for ``apply``.

Parameters
----------
func : callable
A callable that takes a series as its first argument, and
returns a dataframe, a series or a scalar. In addition the
callable may take positional and keyword arguments.

*args : tuple
Optional positional arguments to pass to ``func``.

**kwargs : dict
Optional keyword arguments to pass to ``func``.

Returns
-------
Series or DataFrame

See Also
--------
pipe : Apply function to the full GroupBy object instead of to each
group.
aggregate : Apply aggregate function to the GroupBy object.
transform : Apply function column-by-column to the GroupBy object.
Series.apply : Apply a function to a Series.
DataFrame.apply : Apply a function to each row or column of a DataFrame.

Notes
-----

.. versionchanged:: 1.3.0

The resulting dtype will reflect the return value of the passed ``func``,
see the examples below.

Functions that mutate the passed object can produce unexpected
behavior or errors and are not supported. See :ref:`gotchas.udf-mutation`
for more details.

Examples
--------
>>> s = pd.Series([0, 1, 2], index='a a b'.split())
>>> g1 = s.groupby(s.index, group_keys=False)
>>> g2 = s.groupby(s.index, group_keys=True)

From ``s`` above we can see that ``g1`` and ``g2`` have two groups,
``a`` and ``b``, and differ only in their ``group_keys`` argument.
Calling ``apply`` in various ways, we can get different grouping results:

Example 1: The function passed to `apply` takes a Series as
its argument and returns a Series. `apply` combines the result for
each group together into a new Series.

.. versionchanged:: 1.3.0

The resulting dtype will reflect the return value of the passed ``func``.

>>> g1.apply(lambda x: x * 2 if x.name == 'a' else x / 2)
a    0.0
a    2.0
b    1.0
dtype: float64

In the above, the groups are not part of the index. We can have them included
by using ``g2`` where ``group_keys=True``:

>>> g2.apply(lambda x: x * 2 if x.name == 'a' else x / 2)
a  a    0.0
   a    2.0
b  b    1.0
dtype: float64

Example 2: The function passed to `apply` takes a Series as
its argument and returns a scalar. `apply` combines the result for
each group together into a Series, including setting the index as
appropriate:

>>> g1.apply(lambda x: x.max() - x.min())
a    1
b    0
dtype: int64

The ``group_keys`` argument has no effect here because the result is not
like-indexed (i.e. :ref:`a transform <groupby.transform>`) when compared
to the input.

>>> g2.apply(lambda x: x.max() - x.min())
a    1
b    0
dtype: int64
"""
return super().apply(func, *args, **kwargs)
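
Reviewer note: the docstring above recommends trying ``agg`` or ``transform`` before reaching for ``apply``. The sketch below illustrates that advice on the same ``s`` used in the docstring examples. The variable names (``via_apply``, ``via_builtin``, ``centered_apply``, ``centered_transform``) are illustrative and not part of this PR, and no timing claim is made here.

```python
import pandas as pd

s = pd.Series([0, 1, 2], index="a a b".split())
g = s.groupby(s.index)

# Scalar-per-group result: the same range computed with apply and
# with the built-in max/min reductions.
via_apply = g.apply(lambda x: x.max() - x.min())
via_builtin = g.max() - g.min()

# Like-indexed result: centering each group with apply versus
# broadcasting the group mean back with transform.
centered_apply = s.groupby(s.index, group_keys=False).apply(
    lambda x: x - x.mean()
)
centered_transform = s - g.transform("mean")

print(via_apply.equals(via_builtin))
print(centered_apply.tolist(), centered_transform.tolist())
```

Both pairs should hold the same values; the built-in forms avoid calling a Python function once per group.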

@doc(_agg_template_series, examples=_agg_examples_doc, klass="Series")
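
Reviewer note: the new Parameters section documents ``*args`` and ``**kwargs`` as being passed through to ``func``. A minimal sketch of that forwarding follows, assuming a hypothetical helper ``spread`` with made-up parameters ``scale`` and ``offset`` that do not appear anywhere in this PR.

```python
import pandas as pd


def spread(group, scale, offset=0):
    # Hypothetical helper: scaled range of one group plus an offset.
    return (group.max() - group.min()) * scale + offset


s = pd.Series([0, 1, 2], index="a a b".split())
g = s.groupby(s.index)

# 10 is forwarded positionally as ``scale``; ``offset`` as a keyword.
result = g.apply(spread, 10, offset=1)
print(result)
```

Each group is handed to ``spread`` as its first argument, so the extra arguments begin at the second parameter.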