Problem description
I construct a Series in several ways that should give the same output from to_dict(), but instead I get different output types. In my case, this breaks downstream JSON serializers.
The code sample below includes cases with correct output (bool) and incorrect output (numpy.bool_) -- see inline comments.
Related issues, though none seem exactly the same: #13258, #13830, #16048, #17491, #19381, #20791, #23753, #23921, #24908, #25969
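A possible stopgap on the serializer side, sketched under the assumption that the offending values are NumPy-style scalars exposing an `.item()` method. The names `numpy_scalar_default` and `dumps_row` are hypothetical helpers for illustration, not part of pandas or the standard library; only `json.dumps` and its `default` hook are real API here.

```python
import json


def numpy_scalar_default(obj):
    """Fallback for json.dumps: unwrap scalars that expose .item().

    json.dumps only calls this hook for objects it cannot serialize itself,
    such as numpy.bool_. Assumption: .item() returns a native Python value.
    """
    item = getattr(obj, "item", None)
    if callable(item):
        return item()
    raise TypeError(
        f"Object of type {type(obj).__name__} is not JSON serializable"
    )


def dumps_row(row_dict):
    """Serialize a dict such as the one returned by Series.to_dict()."""
    return json.dumps(row_dict, default=numpy_scalar_default)


# Intended use with the frame from the code sample (not executed here):
#   dumps_row(df.iloc[0][['a']].to_dict())
```

This only papers over the symptom when serializing; the inconsistent return types from to_dict() remain.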
Code sample
In [1]: import pandas as pd
In [2]: df = pd.DataFrame({ 'a': [True, False], 'b': [0, 1]} )
In [3]: df
Out[3]:
a b
0 True 0
1 False 1
In [27]: type(df['a'].iloc[0])
Out[27]: numpy.bool_
In [48]: type(df[['a']].iloc[0, 0])
Out[48]: numpy.bool_
In [33]: type(df.iloc[0,0])
Out[33]: numpy.bool_
In [24]: type(df.iloc[0]['a'])
Out[24]: numpy.bool_
# ----
In [4]: df[['a']].iloc[0].to_dict()
Out[4]: {'a': True}
# correct
In [5]: type(df[['a']].iloc[0].to_dict()['a'])
Out[5]: bool
In [6]: df.iloc[0][['a']].to_dict()
Out[6]: {'a': True}
# this one is incorrect, should return bool
In [7]: type(df.iloc[0][['a']].to_dict()['a'])
Out[7]: numpy.bool_
# ----
In [8]: df[['a', 'b']].to_dict(orient='records')[0]
Out[8]: {'a': True, 'b': 0}
# correct
In [9]: type(df[['a', 'b']].to_dict(orient='records')[0]['a'])
Out[9]: bool
In [10]: df[['a', 'b']].iloc[0].to_dict()
Out[10]: {'a': True, 'b': 0}
# this one is incorrect, should return bool
In [11]: type(df[['a', 'b']].iloc[0].to_dict()['a'])
Out[11]: numpy.bool_
This may explain what's going on:
In [54]: df.iloc[0][['a']]
Out[54]:
a True
Name: 0, dtype: object
In [56]: df[['a']].iloc[0]
Out[56]:
a True
Name: 0, dtype: bool
That relates to #25969, where @mroeschke commented about a similar dtype discrepancy:
This probably occurs because s2 is object dtype and it's trying to preserve the dtype of each input argument while the arguments in s1 can both be coerced to int64.
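The dtype difference between the two selection orders can be checked in isolation. This is a minimal sketch using the same frame as the code sample; the variable names `row_first` and `col_first` are only for illustration, and whether the object dtype is in fact the cause of the to_dict() discrepancy is a guess.

```python
import pandas as pd

df = pd.DataFrame({"a": [True, False], "b": [0, 1]})

# Row first: the row spans a bool column and an int64 column, so the
# resulting Series is upcast to object dtype before 'a' is selected.
row_first = df.iloc[0][["a"]]

# Column first: the single-column frame is all bool, so the row taken
# from it keeps the bool dtype.
col_first = df[["a"]].iloc[0]

print(row_first.dtype)  # object
print(col_first.dtype)  # bool
```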
Output of pd.show_versions()
INSTALLED VERSIONS
------------------
commit : None
python : 3.7.4.final.0
python-bits : 64
OS : Darwin
OS-release : 18.6.0
machine : x86_64
processor : i386
byteorder : little
LC_ALL : None
LANG : en_US.UTF-8
LOCALE : en_US.UTF-8
pandas : 0.25.0
numpy : 1.16.4
pytz : 2019.1
dateutil : 2.8.0
pip : 19.0.3
setuptools : 40.8.0
Cython : None
pytest : None
hypothesis : None
sphinx : None
blosc : None
feather : None
xlsxwriter : None
lxml.etree : None
html5lib : None
pymysql : None
psycopg2 : None
jinja2 : 2.10.1
IPython : 7.6.1
pandas_datareader: None
bs4 : None
bottleneck : None
fastparquet : None
gcsfs : None
lxml.etree : None
matplotlib : None
numexpr : 2.6.9
odfpy : None
openpyxl : None
pandas_gbq : None
pyarrow : None
pytables : None
s3fs : None
scipy : None
sqlalchemy : None
tables : 3.5.2
xarray : None
xlrd : 1.2.0
xlwt : None
xlsxwriter : None