Description
Code Sample, a copy-pastable example if possible
import pandas as pd

df1 = pd.DataFrame([123, 456], columns=['data'], index=[True, False])
#        data
# True    123
# False   456

df2 = pd.DataFrame([55, 983, 69, 112, 0], columns=['data'], index=[1, 2, 3, 4, 99])
#     data
# 1     55
# 2    983
# 3     69
# 4    112
# 99     0

my_dict = {'One': df1, 'Two': df2}
df_combined = pd.concat(my_dict)
#            data
# One True    123
#     False   456
# Two True     55  <-------- this index should be "1" instead of True
#     2       983
#     3        69
#     4       112
#     99        0
Problem description
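When concatenating a bool-indexed DataFrame with an int-indexed DataFrame, concat() treats labels that compare equal (True and 1; False and 0) as the same label in the inner level of the hierarchical index. It keeps whichever value it saw first, so the original label of the other frame (here 1) is replaced (here by True).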
A less surprising behavior would be to preserve the original indices.
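A possible explanation, which this report does not confirm as pandas' actual internal mechanism: in Python, bool is a subclass of int, so True and 1 (and False and 0) compare equal and hash identically. Any hash-based deduplication of labels therefore keeps only the first one seen. The plain-Python sketch below illustrates that effect with a dict; the variable names are only illustrative.

```python
# bool is a subclass of int, so these labels compare equal and hash equal.
assert True == 1 and False == 0
assert hash(True) == hash(1) and hash(False) == hash(0)

# Index labels of df1 followed by those of df2 from the example above.
labels = [True, False, 1, 2, 3, 4, 99]

# dict.fromkeys deduplicates by hash and equality, keeping the first key seen.
unique_labels = list(dict.fromkeys(labels))
print(unique_labels)  # [True, False, 2, 3, 4, 99]

# With the order reversed, the integers win instead.
print(list(dict.fromkeys([1, 0, True, False])))  # [1, 0]
```

This matches the "whichever value it saw first" behavior described above, but it is only an analogy for what concat() might be doing.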
Is there a workaround?
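One possible workaround, sketched under the assumption that string labels are acceptable in the combined frame: convert each index to strings before concatenating, so that 'True' and '1' can no longer be confused. The helper with_string_index is a hypothetical name introduced here, not part of pandas.

```python
import pandas as pd

df1 = pd.DataFrame([123, 456], columns=['data'], index=[True, False])
df2 = pd.DataFrame([55, 983, 69, 112, 0], columns=['data'], index=[1, 2, 3, 4, 99])


def with_string_index(df):
    """Hypothetical helper: return a copy of df with its index labels cast to str."""
    out = df.copy()
    out.index = out.index.astype(str)
    return out


frames = {'One': df1, 'Two': df2}
df_workaround = pd.concat({key: with_string_index(df) for key, df in frames.items()})

# Inner-level labels are now the strings 'True', 'False', '1', '2', '3', '4', '99'.
inner_labels = [str(label) for label in df_workaround.index.get_level_values(1)]
print(inner_labels)
```

The trade-off is that the inner level no longer holds the original bool and int types, so lookups must use string labels such as df_workaround.loc[('Two', '1')].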
Expected Output
df_combined:
One True    123
    False   456
Two 1        55  <--------
    2       983
    3        69
    4       112
    99        0
Output of pd.show_versions()
INSTALLED VERSIONS
commit: None
python: 3.6.3.final.0
python-bits: 64
OS: Windows
OS-release: 7
machine: AMD64
processor: Intel64 Family 6 Model 63 Stepping 2, GenuineIntel
byteorder: little
LC_ALL: None
LANG: None
LOCALE: None.None
pandas: 0.22.0
pytest: 3.3.2
pip: 9.0.1
setuptools: 38.4.0
Cython: 0.27.3
numpy: 1.14.3
scipy: 1.0.1
pyarrow: 0.7.1
xarray: None
IPython: 6.3.1
sphinx: 1.6.6
patsy: 0.5.0
dateutil: 2.6.1
pytz: 2018.4
blosc: None
bottleneck: 1.2.1
tables: 3.4.2
numexpr: 2.6.5
feather: None
matplotlib: 2.2.2
openpyxl: 2.4.9
xlrd: 1.1.0
xlwt: 1.3.0
xlsxwriter: 1.0.4
lxml: 4.1.1
bs4: 4.6.0
html5lib: 1.0.1
sqlalchemy: 1.2.0
pymysql: None
psycopg2: None
jinja2: 2.10
s3fs: None
fastparquet: None
pandas_gbq: None
pandas_datareader: None