groupby with multiindex behaves differently for series and single-column dataframe

This behavior seems strange to me. Starting with a multiindex dataframe with unbalanced levels:

``` python
In [3]: pd.__version__
Out[3]: '0.15.2'

In [4]: df = pd.read_csv(StringIO('''
A,B,x
three,a,1
three,b,2
three,c,3
two,a,4
two,b,5
one,a,6'''), index_col=list('AB'))

In [5]: df
Out[5]: 
         x
A     B   
three a  1
      b  2
      c  3
two   a  4
      b  5
one   a  6
```

Groupby aggregations on this dataframe seem to expand the result index to the full cross product of the level values, leaving NaN for every combination that is absent from the original data:

``` python
In [6]: df.groupby(level=[0,1]).mean()
Out[6]: 
          x
A     B    
one   a   6
      b NaN
      c NaN
three a   1
      b   2
      c   3
two   a   4
      b   5
      c NaN
```
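For reference, the shape of that result can be reproduced by hand. The following is only a sketch of what a "cross product of the levels" index looks like. It assumes a Python 3 environment with `io.StringIO`, and it is not a claim about how groupby builds its result internally. The names `csv_text`, `full_index`, and `expanded` are illustrative.

``` python
from io import StringIO

import pandas as pd

csv_text = """A,B,x
three,a,1
three,b,2
three,c,3
two,a,4
two,b,5
one,a,6"""
df = pd.read_csv(StringIO(csv_text), index_col=["A", "B"])

# Unique values of each index level, sorted as groupby would sort them.
a_values = sorted(df.index.get_level_values("A").unique())
b_values = sorted(df.index.get_level_values("B").unique())

# Every (A, B) combination, whether or not it occurs in the data.
full_index = pd.MultiIndex.from_product([a_values, b_values], names=["A", "B"])

# Reindexing onto the full product fills unobserved combinations with NaN.
expanded = df.reindex(full_index)
print(expanded)
```

Three level values for `A` times three for `B` gives nine rows, of which only six exist in the input, so three rows come back as NaN.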

But the same aggregation on the series does not do this (no NAs in the result):

``` python
In [7]: df.x.groupby(level=[0,1]).mean()
Out[7]: 
A      B
one    a    6
three  a    1
       b    2
       c    3
two    a    4
       b    5
Name: x, dtype: int64
```
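Until the difference is explained, two possible workarounds for getting a dataframe result without the extra rows are sketched below. Both are assumptions on my part rather than a documented recommendation: either aggregate the series and convert back with `to_frame()`, or drop the all-NaN rows from the dataframe result. The variable names are illustrative, and `io.StringIO` assumes Python 3.

``` python
from io import StringIO

import pandas as pd

csv_text = """A,B,x
three,a,1
three,b,2
three,c,3
two,a,4
two,b,5
one,a,6"""
df = pd.read_csv(StringIO(csv_text), index_col=["A", "B"])

# Option 1: aggregate the Series, then turn the result back into a DataFrame.
via_series = df["x"].groupby(level=[0, 1]).mean().to_frame()

# Option 2: aggregate the DataFrame and discard rows where every column is NaN.
via_dropna = df.groupby(level=[0, 1]).mean().dropna(how="all")

print(via_series)
print(via_dropna)
```

Note that option 2 would also discard groups whose values are genuinely all missing in the input, so option 1 is probably the safer of the two when the data itself can contain NaN.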

Is this a bug or intended behavior? I haven't been able to find any mention of the series and dataframe results differing in the docs.


groupby with multiindex behaves differently for series and single-column dataframe #9703
