With an external grouper, there is no way to access the grouped value in a DataFrame(...).groupby(...).apply(...) workflow #9545
Description
groupby-apply workflows are important pandas idioms. Here's a brief example grouping on a named DataFrame column:
>>> import pandas as pd
>>> df = pd.DataFrame({'key': [1, 1, 1, 2, 2, 2, 3, 3, 3], 'value': range(9)})
>>> result = df.groupby('key').apply(lambda x: x['key'])
>>> result
key
1    0    1
     1    1
     2    1
2    3    2
     4    2
     5    2
3    6    3
     7    3
     8    3
Name: key, dtype: int64
An important feature of this example is the ability to reference the grouped value (e.g., x['key']) inside the applied function.
pandas also supports grouping on arbitrary mapping functions, iterables, and lots of other objects. In these cases, the grouped value is not represented as a named column in the DataFrame. Thus, when using apply(...), there is no apparent way to access the group key value. The only alternative is to use a (slow) for-loop solution as in:
foo = lambda _k, _g: ...  # placeholder for the per-group function
grouped = df.groupby(grouper)
# Materialize keys and results as lists so both can be passed to from_records.
keys = [key for key, _ in grouped]
records = [foo(key, group) for key, group in grouped]
pd.DataFrame.from_records(records, index=keys)
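For concreteness, here is a minimal runnable sketch of the loop-based workaround above. The grouper (a function applied to each index label) and the per-group function `foo` are hypothetical stand-ins chosen for illustration; they are not part of the original report.

```python
import pandas as pd

df = pd.DataFrame({'value': range(9)})

def grouper(index_label):
    # Hypothetical external grouper: maps each index label to a group key.
    # The resulting key is not stored in any column of df.
    return 'even' if index_label % 2 == 0 else 'odd'

def foo(key, group):
    # Hypothetical per-group function that needs both the key and the group.
    return {'label': key.upper(), 'total': int(group['value'].sum())}

grouped = df.groupby(grouper)
# Iterating the GroupBy object yields (key, group) pairs.
keys = [key for key, _ in grouped]
records = [foo(key, group) for key, group in grouped]
result = pd.DataFrame.from_records(records, index=keys)
```

The key is only available here because the loop unpacks it explicitly; nothing in `df` itself carries the 'even'/'odd' label.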
IMHO, the ability to access the grouped value in an idiomatic way from within the applied function is ergonomically important; the groupby-apply idiom is at best partially realized without it.