pivot_table produces inconsistent columns if applied to empty table #21932

Closed
@ischurov

Code Sample, a copy-pastable example if possible

import pandas as pd

# Part 1: pivot an empty frame
df1 = pd.DataFrame([], columns=['a', 'b', 'value'])
pivot1 = df1.pivot_table(index='a', columns='b', values='value',
                         aggfunc='count')
print(pivot1.columns)

# Output:
# MultiIndex(levels=[['value'], []],
#            labels=[[], []],
#            names=[None, 'b'])

# Part 2: pivot a frame with a single row
df2 = pd.DataFrame([[1, 2, 3]], columns=['a', 'b', 'value'])
pivot2 = df2.pivot_table(index='a', columns='b', values='value',
                         aggfunc='count')
print(pivot2.columns)

# Output:
# Int64Index([2], dtype='int64', name='b')

Problem description

In the first example, I don't expect to see a MultiIndex in .columns, since exactly one value is provided for columns, and the same call in the second example returns a flat index. Because there is no data, the dtype of this index cannot be inferred, so the result could reasonably be something like Index([], name='b').
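A possible caller-side workaround, as a minimal sketch rather than a fix in pandas itself: drop the outer level whenever the pivot comes back with two-level columns. The helper name `flatten_single_value_columns` is hypothetical and not part of pandas. It assumes a single column name was passed as `columns` and a single scalar as `values`, so the outer level carries no information.

```python
import pandas as pd


def flatten_single_value_columns(pivot: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical helper: drop the outer 'values' level of two-level columns.

    Assumes the pivot was built with one `columns` key and a scalar `values`,
    so level 0 only repeats the values name.
    """
    if isinstance(pivot.columns, pd.MultiIndex) and pivot.columns.nlevels == 2:
        pivot = pivot.copy()
        pivot.columns = pivot.columns.droplevel(0)
    return pivot


empty = pd.DataFrame([], columns=["a", "b", "value"])
one_row = pd.DataFrame([[1, 2, 3]], columns=["a", "b", "value"])

pivot_empty = flatten_single_value_columns(
    empty.pivot_table(index="a", columns="b", values="value", aggfunc="count")
)
pivot_one = flatten_single_value_columns(
    one_row.pivot_table(index="a", columns="b", values="value", aggfunc="count")
)

# Both results now expose a flat column index named 'b'.
print(pivot_empty.columns)
print(pivot_one.columns)
```

The helper is a no-op when the columns are already flat, so it should be safe to apply regardless of whether the installed pandas version shows the inconsistency.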

Output of pd.show_versions()

INSTALLED VERSIONS
------------------
commit: None
python: 3.7.0.final.0
python-bits: 64
OS: Darwin
OS-release: 17.6.0
machine: x86_64
processor: i386
byteorder: little
LC_ALL: en_US.UTF-8
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8

pandas: 0.23.3
pytest: None
pip: 10.0.1
setuptools: 39.2.0
Cython: None
numpy: 1.14.5
scipy: 1.1.0
pyarrow: None
xarray: None
IPython: 6.4.0
sphinx: None
patsy: 0.5.0
dateutil: 2.7.3
pytz: 2018.5
blosc: None
bottleneck: None
tables: None
numexpr: None
feather: None
matplotlib: 2.2.2
openpyxl: None
xlrd: None
xlwt: None
xlsxwriter: None
lxml: None
bs4: None
html5lib: 1.0.1
sqlalchemy: None
pymysql: None
psycopg2: None
jinja2: 2.10
s3fs: None
fastparquet: None
pandas_gbq: None
pandas_datareader: None
