Description
This behavior seems strange to me. Starting with a MultiIndex DataFrame with unbalanced levels (not every combination of A and B is present):
In [3]: pd.__version__
Out[3]: '0.15.2'
In [4]: df = pd.read_csv(StringIO('''
A,B,x
three,a,1
three,b,2
three,c,3
two,a,4
two,b,5
one,a,6'''), index_col=list('AB'))
In [5]: df
Out[5]:
         x
A     B
three a  1
      b  2
      c  3
two   a  4
      b  5
one   a  6
Groupby aggregations on this DataFrame seem to expand the index to the Cartesian product of the levels, leaving NaN rows in the result for every combination that was never observed:
In [6]: df.groupby(level=[0,1]).mean()
Out[6]:
          x
A     B
one   a   6
      b NaN
      c NaN
three a   1
      b   2
      c   3
two   a   4
      b   5
      c NaN
But the same aggregation on the Series does not do this (no NaNs in the result):
In [7]: df.x.groupby(level=[0,1]).mean()
Out[7]:
A      B
one    a    6
three  a    1
       b    2
       c    3
two    a    4
       b    5
Name: x, dtype: int64
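The comparison between the two results can also be checked programmatically rather than by eye. This is a minimal sketch, assuming Python 3 (for `from io import StringIO`); the variable names `frame_mean`, `series_mean` and `extra` are mine. What it prints depends on the pandas version, so no expected output is shown.

```python
from io import StringIO  # assumption: Python 3

import pandas as pd

csv = """A,B,x
three,a,1
three,b,2
three,c,3
two,a,4
two,b,5
one,a,6"""
df = pd.read_csv(StringIO(csv), index_col=list("AB"))

# Same two aggregations as in the session above.
frame_mean = df.groupby(level=[0, 1]).mean()
series_mean = df.x.groupby(level=[0, 1]).mean()

# Index entries that appear in the DataFrame result but not in the
# Series result. On an affected version these should be the (A, B)
# combinations that never occur in the data.
extra = frame_mean.index.difference(series_mean.index)
print(len(frame_mean), len(series_mean), list(extra))
```

If `extra` is non-empty, the DataFrame path is adding index entries that the Series path does not.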
Is this a bug or intended behavior? I haven't been able to find any mention in the docs of DataFrame and Series groupby behaving differently here.
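In the meantime, one possible workaround is to drop the rows that are entirely NaN. This is only a sketch and not a confirmed fix: to make it independent of the pandas version, it recreates the padded result by reindexing to the full product of the levels (my own simulation of the output shown above), and it assumes that no genuinely observed group has an all-NaN aggregate, since such a row would be dropped as well.

```python
import pandas as pd

# The six (A, B) combinations that actually occur in the data.
observed = pd.MultiIndex.from_tuples(
    [("three", "a"), ("three", "b"), ("three", "c"),
     ("two", "a"), ("two", "b"), ("one", "a")],
    names=["A", "B"],
)
df = pd.DataFrame({"x": [1, 2, 3, 4, 5, 6]}, index=observed)

# Simulate the reported result: reindex the aggregate to the Cartesian
# product of the levels, so unobserved combinations become NaN rows.
full = pd.MultiIndex.from_product(observed.levels, names=observed.names)
padded = df.groupby(level=[0, 1]).mean().reindex(full)

# Workaround sketch: keep only rows with at least one non-NaN value.
trimmed = padded.dropna(how="all")
print(trimmed)
```

With more than one column, `how="all"` keeps a row as long as any column has a value, which seems closer to "this group was observed" than the default `how="any"`.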