Closed
Description
This is a continuation of #13056 and #14927, which were closed by #31613. I think that PR ensured that we consistently take one of two code paths. This issue is to verify that we actually want the behavior on master.
Focusing on a specific pair of examples that differ only in whether the object returned by the udf has the same index as its input:
# master
In [10]: def f(x):
    ...:     return x.copy()  # same index
In [11]: def g(x):
    ...:     return x.copy().rename(lambda x: x + 1)  # different index
In [12]: df = pd.DataFrame({"A": ['a', 'b'], "B": [1, 2]})
In [13]: df.groupby("A").apply(f)
Out[13]:
   A  B
0  a  1
1  b  2
In [14]: df.groupby("A").apply(g)
Out[14]:
     A  B
A
a 1  a  1
b 2  b  2
# 1.0.4
In [8]: df.groupby("A").apply(f)
Out[8]:
     A  B
A
a 0  a  1
b 1  b  2
In [9]: df.groupby("A").apply(g)
Out[9]:
     A  B
A
a 1  a  1
b 2  b  2
So the 1.0.4 behavior is to always prepend the group keys to the result as an index level.
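For reference, the "always prepend" result can be reproduced without going through `apply` at all. The sketch below is only an illustration, not how pandas implements it: `always_prepend` is a hypothetical helper name, and it assumes that applying the udf to each group by hand and concatenating with `pd.concat` is a fair stand-in for the 1.0.4 output.

```python
import pandas as pd


def f(x):
    return x.copy()  # same index


def always_prepend(df, key, udf):
    """Hypothetical helper: apply udf per group and always add the group key as the outer index level."""
    # Iterating over a groupby yields (group key, sub-frame) pairs.
    pieces = {name: udf(group) for name, group in df.groupby(key)}
    # Passing a dict to concat uses its keys as a new outermost index level.
    return pd.concat(pieces, names=[key])


df = pd.DataFrame({"A": ["a", "b"], "B": [1, 2]})
result = always_prepend(df, "A", f)
print(result)
```

With this approach the result has a two-level index regardless of what index the udf returns, which is the value-independent behavior described above.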
In pandas 1.1.0, whether the group keys are prepended depends on whether the UDF returns a DataFrame whose index is identical to the input's. Do we want that kind of value-dependent behavior?
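To make the "value-dependent" part concrete, here is a rough sketch of the kind of check that separates `f` from `g`. `returns_same_index` is a hypothetical helper written for this comment; it compares indexes group by group and is not the check pandas performs internally.

```python
import pandas as pd


def f(x):
    return x.copy()  # same index


def g(x):
    return x.copy().rename(lambda i: i + 1)  # different index


def returns_same_index(df, key, udf):
    """Hypothetical helper: True if udf leaves every group's index unchanged."""
    return all(
        udf(group).index.equals(group.index)
        for _, group in df.groupby(key)
    )


df = pd.DataFrame({"A": ["a", "b"], "B": [1, 2]})
same_f = returns_same_index(df, "A", f)
same_g = returns_same_index(df, "A", g)
print(same_f, same_g)
```

The two udfs do the same thing apart from the relabeling, yet only the second one ends up with the group keys in its index on master.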
@jorisvandenbossche's notebook from #13056 (comment) might be helpful, though it might be out of date.