DataFrame.groupby() interprets tuple as list of keys #17979

Closed
@toobaz

Code Sample, a copy-pastable example if possible

In [2]: df = pd.DataFrame([[1, 2, 3, 4], [3, 4, 5, 6], [1, 4, 2, 3]],
   ...:                           columns=pd.MultiIndex.from_arrays([['a', 'b', 'b', 'c'],
   ...:                                                              [1, 1, 2, 2]]))

In [3]: df.groupby(('b', 1))
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-3-ee2e2124876f> in <module>()
----> 1 df.groupby(('b', 1))

/home/nobackup/repo/pandas/pandas/core/generic.py in groupby(self, by, axis, level, as_index, sort, group_keys, squeeze, **kwargs)
   5205         return groupby(self, by=by, axis=axis, level=level, as_index=as_index,
   5206                        sort=sort, group_keys=group_keys, squeeze=squeeze,
-> 5207                        **kwargs)
   5208 
   5209     def asfreq(self, freq, method=None, how=None, normalize=False,

/home/nobackup/repo/pandas/pandas/core/groupby.py in groupby(obj, by, **kwds)
   1757         raise TypeError('invalid type: %s' % type(obj))
   1758 
-> 1759     return klass(obj, by, **kwds)
   1760 
   1761 

/home/nobackup/repo/pandas/pandas/core/groupby.py in __init__(self, obj, keys, axis, level, grouper, exclusions, selection, as_index, sort, group_keys, squeeze, **kwargs)
    390                                                     level=level,
    391                                                     sort=sort,
--> 392                                                     mutated=self.mutated)
    393 
    394         self.obj = obj

/home/nobackup/repo/pandas/pandas/core/groupby.py in _get_grouper(obj, key, axis, level, sort, mutated, validate)
   2861                         sort=sort,
   2862                         in_axis=in_axis) \
-> 2863             if not isinstance(gpr, Grouping) else gpr
   2864 
   2865         groupings.append(ping)

/home/nobackup/repo/pandas/pandas/core/groupby.py in __init__(self, index, grouper, obj, name, level, sort, in_axis)
   2611                 if getattr(self.grouper, 'ndim', 1) != 1:
   2612                     t = self.name or str(type(self.grouper))
-> 2613                     raise ValueError("Grouper for '%s' not 1-dimensional" % t)
   2614                 self.grouper = self.index.map(self.grouper)
   2615                 if not (hasattr(self.grouper, "__len__") and

ValueError: Grouper for 'b' not 1-dimensional

Problem description

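('b', 1) is a valid key (it is a label in the column MultiIndex) and should be interpreted as a single key. Instead, it is interpreted as the list of keys ['b', 1], so the first grouper becomes df['b'], which selects two columns and triggers the "not 1-dimensional" error.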

This is related to #17977, but the fix should be pretty easy.
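
To make the ambiguity concrete, here is a minimal pure-Python sketch of one possible disambiguation rule: treat a tuple as a single key when it is itself a label, and as a sequence of keys otherwise. The helper normalize_keys and its labels argument are hypothetical names used only for illustration; this is not pandas code and not necessarily how the fix was implemented.

```python
def normalize_keys(by, labels):
    """Return the list of grouping keys implied by `by`.

    Hypothetical helper, not part of pandas. `labels` stands in for the
    column labels of the frame being grouped.
    """
    if isinstance(by, tuple) and by in labels:
        # The tuple is itself a label (e.g. a MultiIndex column): one key.
        return [by]
    if isinstance(by, (list, tuple)):
        # Any other sequence is read as several separate keys.
        return list(by)
    # A scalar is a single key.
    return [by]


# Column labels matching the MultiIndex in the code sample above.
labels = [('a', 1), ('b', 1), ('b', 2), ('c', 2)]

single = normalize_keys(('b', 1), labels)
several = normalize_keys([('b', 1), ('c', 2)], labels)
fallback = normalize_keys(('x', 'y'), labels)
```

Under this rule ('b', 1) yields one key, while a tuple that is not a label, such as ('x', 'y'), still falls back to being read as two keys.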

Expected Output

In [4]: df.groupby([('b', 1)])
Out[4]: <pandas.core.groupby.DataFrameGroupBy object at 0x7fa35bf27780>
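
The list-wrapped call above also works as a workaround, since a tuple inside a list is unambiguously a single key. A short sketch that rebuilds the frame from the code sample and checks the resulting groups (the variable names columns, grouped and sizes are illustrative, and the snippet assumes pandas is installed):

```python
import pandas as pd

# Same frame as in the code sample: two-level column labels.
columns = pd.MultiIndex.from_arrays([['a', 'b', 'b', 'c'],
                                     [1, 1, 2, 2]])
df = pd.DataFrame([[1, 2, 3, 4], [3, 4, 5, 6], [1, 4, 2, 3]],
                  columns=columns)

# Wrapping the tuple in a list makes it a single grouping key.
grouped = df.groupby([('b', 1)])

# Column ('b', 1) holds 2, 4, 4, so there are two groups.
sizes = grouped.size()
```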

Output of pd.show_versions()

INSTALLED VERSIONS

commit: b539298
python: 3.5.3.final.0
python-bits: 64
OS: Linux
OS-release: 4.9.0-3-amd64
machine: x86_64
processor:
byteorder: little
LC_ALL: None
LANG: it_IT.UTF-8
LOCALE: it_IT.UTF-8

pandas: 0.21.0rc1+30.gb539298ca
pytest: 3.0.6
pip: 9.0.1
setuptools: None
Cython: 0.25.2
numpy: 1.12.1
scipy: 0.19.0
pyarrow: None
xarray: None
IPython: 5.1.0.dev
sphinx: 1.5.6
patsy: 0.4.1
dateutil: 2.6.0
pytz: 2017.2
blosc: None
bottleneck: 1.2.1
tables: 3.3.0
numexpr: 2.6.1
feather: 0.3.1
matplotlib: 2.0.0
openpyxl: None
xlrd: 1.0.0
xlwt: 1.1.2
xlsxwriter: 0.9.6
lxml: None
bs4: 4.5.3
html5lib: 0.999999999
sqlalchemy: 1.0.15
pymysql: None
psycopg2: None
jinja2: 2.9.6
s3fs: None
fastparquet: None
pandas_gbq: None
pandas_datareader: 0.2.1
