Description
- I have checked that this issue has not already been reported.
- I have confirmed this bug exists on the latest version of pandas.
- (optional) I have confirmed this bug exists on the master branch of pandas.
Code Sample
import pandas as pd
df = pd.DataFrame(
    dict(c1=[10.], c2=['a'], c3=pd.to_datetime('2020-01-01')))
# Triggering conditions: multiindex with date, empty dataframe
# Multiindex without date works
df.set_index(['c1', 'c2']).head(0).reset_index()
# Regular index with date also works
df.set_index(['c3']).head(0).reset_index()
# Multiindex with date crashes...
df.set_index(['c2', 'c3']).head(0).reset_index()
# >> ValueError: cannot convert float NaN to integer
# This used to work on pandas 1.0.3, but breaks on pandas 1.1.0
# Though the error doesn't trigger if the dataframe is empty before
# calling set_index()
df.head(0).set_index(['c2', 'c3']).reset_index()
# I originally observed the bug in a groupby call
df.head(0).groupby(['c2', 'c3'])[['c1']].sum().reset_index()
# >> ValueError: cannot convert float NaN to integer
# This used to work on pandas 1.0.3, but breaks on pandas 1.1.0
Problem description
On pandas 1.1.0, I'm getting a ValueError exception when calling dataframe.reset_index() under the following conditions:
- Input dataframe is empty
- Multiindex from multiple columns, at least one of which is a datetime
The exception message is ValueError: cannot convert float NaN to integer.
Error trace:
Traceback (most recent call last):
    df_out.reset_index()
  File "/Users/pec21/PycharmProjects/anp_voice_report/virtual/lib/python3.6/site-packages/pandas/core/frame.py", line 4848, in reset_index
    level_values = _maybe_casted_values(lev, lab)
  File "/Users/pec21/PycharmProjects/anp_voice_report/virtual/lib/python3.6/site-packages/pandas/core/frame.py", line 4782, in _maybe_casted_values
    fill_value, len(mask), dtype
  File "/Users/pec21/PycharmProjects/anp_voice_report/virtual/lib/python3.6/site-packages/pandas/core/dtypes/cast.py", line 1554, in construct_1d_arraylike_from_scalar
    subarr.fill(value)
ValueError: cannot convert float NaN to integer
This error didn't happen on pandas 1.0.3 and earlier. I haven't tested any intermediate releases, nor the master branch.
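Since I have only compared 1.0.3 and 1.1.0, a small probe like the one below might help anyone checking other releases. It is only a sketch: `probe_reset_index` and the case labels are names made up for this report, not part of pandas. It runs the three index combinations from the code sample on an emptied frame and records whether `reset_index()` raises.

```python
import pandas as pd


def probe_reset_index():
    """Run each index combination from the report on an emptied frame.

    Returns a dict mapping a case label to "ok" or to the error text.
    The helper name and the labels are hypothetical, chosen for this sketch.
    """
    df = pd.DataFrame(
        dict(c1=[10.0], c2=["a"], c3=pd.to_datetime("2020-01-01"))
    )
    cases = {
        "multiindex without date": ["c1", "c2"],
        "single date index": ["c3"],
        "multiindex with date": ["c2", "c3"],
    }
    results = {}
    for label, keys in cases.items():
        try:
            # Same sequence as the code sample: index first, then empty, then reset.
            df.set_index(keys).head(0).reset_index()
            results[label] = "ok"
        except ValueError as exc:
            results[label] = f"ValueError: {exc}"
    return results


if __name__ == "__main__":
    print("pandas", pd.__version__)
    for label, outcome in probe_reset_index().items():
        print(f"{label}: {outcome}")
```

On an affected install, only the "multiindex with date" case should report the ValueError; the other two cases should report "ok".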
Expected Output
No exception is raised, and reset_index() returns an empty dataframe.
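The expectation could be written down roughly as follows. This is a sketch rather than an actual pandas test: `build_empty_result` and `check_expected` are hypothetical helper names, and the column order (index levels first, then the remaining column) is my assumption about what `reset_index()` normally produces.

```python
import pandas as pd


def build_empty_result():
    """Reproduce the failing call from the code sample (hypothetical helper)."""
    df = pd.DataFrame(
        dict(c1=[10.0], c2=["a"], c3=pd.to_datetime("2020-01-01"))
    )
    # On pandas 1.1.0 this line is where the ValueError is raised.
    return df.set_index(["c2", "c3"]).head(0).reset_index()


def check_expected(result):
    """Assert the behaviour described under Expected Output."""
    # The frame has no rows...
    assert len(result) == 0
    # ...and the former index levels come back as ordinary columns
    # (assumed order: index levels first, then the remaining column).
    assert list(result.columns) == ["c2", "c3", "c1"]


if __name__ == "__main__":
    check_expected(build_empty_result())
    print("reset_index() returned an empty dataframe as expected")
```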
Output of pd.show_versions()
INSTALLED VERSIONS
commit : d9fff27
python : 3.6.6.final.0
python-bits : 64
OS : Darwin
OS-release : 18.6.0
Version : Darwin Kernel Version 18.6.0: Thu Apr 25 23:16:27 PDT 2019; root:xnu-4903.261.4~2/RELEASE_X86_64
machine : x86_64
processor : i386
byteorder : little
LC_ALL : None
LANG : None
LOCALE : en_GB.UTF-8
pandas : 1.1.0
numpy : 1.17.4
pytz : 2019.3
dateutil : 2.8.1
pip : 20.0.2
setuptools : 49.1.0
Cython : None
pytest : 5.3.4
hypothesis : None
sphinx : 2.3.1
blosc : None
feather : None
xlsxwriter : None
lxml.etree : 4.4.2
html5lib : None
pymysql : None
psycopg2 : 2.8.4 (dt dec pq3 ext lo64)
jinja2 : 2.10.3
IPython : None
pandas_datareader: None
bs4 : 4.8.1
bottleneck : None
fsspec : None
fastparquet : None
gcsfs : None
matplotlib : None
numexpr : None
odfpy : None
openpyxl : 3.0.2
pandas_gbq : None
pyarrow : 0.15.1
pytables : None
pyxlsb : None
s3fs : None
scipy : 1.3.3
sqlalchemy : 1.3.11
tables : None
tabulate : None
xarray : None
xlrd : 1.2.0
xlwt : None
numba : None