Description
DataFrame.to_json reliably segfaults Python when the DataFrame has an index of type CategoricalIndex.
import pandas as pd
# passing a Categorical as the index produces a CategoricalIndex
idx = pd.Categorical([1, 2, 3], categories=[1, 2, 3])
df = pd.DataFrame({'count': pd.Series([3, 2, 2], index=idx)})
# this crashes Python (2.6.x or 2.7.x on 64-bit Linux or 64-bit Windows, with pandas 0.16.1)
print df.to_json(orient='split')
If I call it with orient='index', I get a ValueError instead:
# this raises a ValueError
print df.to_json(orient='index')
ValueError: Label array sizes do not match corresponding data shape
For what it's worth, my workaround, which is acceptable in my application, is to convert the index to strings:
df.index = df.index.astype(str)
print df.to_json(orient='split')
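In case it helps anyone else hitting this, the workaround can be wrapped in a small helper. This is only a sketch: `to_json_safe` is a name I made up (it is not part of pandas), and it assumes that string labels in the JSON output are acceptable, as they are in my case. It copies the frame so the caller's index is left untouched.

```python
import json

import pandas as pd


def to_json_safe(df, orient="split"):
    """Hypothetical helper: serialize df, stringifying a CategoricalIndex first."""
    out = df.copy()  # avoid mutating the caller's DataFrame
    if isinstance(out.index, pd.CategoricalIndex):
        # Same workaround as above: replace the categorical index with strings
        out.index = out.index.astype(str)
    return out.to_json(orient=orient)


idx = pd.Categorical([1, 2, 3], categories=[1, 2, 3])
df = pd.DataFrame({"count": pd.Series([3, 2, 2], index=idx)})

# Round-trip through the json module to inspect the result
parsed = json.loads(to_json_safe(df, orient="split"))
```

Note that the index labels come back as strings rather than integers, so anything reading the JSON has to convert them back if it needs the original values.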
Windows config:
INSTALLED VERSIONS
commit: None
python: 2.7.7.final.0
python-bits: 64
OS: Windows
OS-release: 7
machine: AMD64
processor: Intel64 Family 6 Model 42 Stepping 7, GenuineIntel
byteorder: little
LC_ALL: None
LANG: None
pandas: 0.16.1
nose: None
Cython: 0.20.1
numpy: 1.8.1
scipy: 0.15.1
statsmodels: None
IPython: 2.1.0
sphinx: None
patsy: None
dateutil: 2.2
pytz: 2014.4
bottleneck: None
tables: 3.1.1
numexpr: 2.4
matplotlib: 1.3.1
openpyxl: 2.0.3
xlrd: None
xlwt: None
xlsxwriter: None
lxml: None
bs4: None
html5lib: None
httplib2: None
apiclient: None
sqlalchemy: None
pymysql: None
psycopg2: None
Linux config:
INSTALLED VERSIONS
commit: None
python: 2.6.8.final.0
python-bits: 64
OS: Linux
OS-release: 2.6.18-274.el5
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
pandas: 0.16.1
nose: None
Cython: None
numpy: 1.9.2
scipy: None
statsmodels: None
IPython: None
sphinx: None
patsy: None
dateutil: 2.4.2
pytz: 2015.4
bottleneck: None
tables: 3.2.0
numexpr: 2.4.3
matplotlib: None
openpyxl: None
xlrd: None
xlwt: None
xlsxwriter: None
lxml: None
bs4: None
html5lib: None
httplib2: None
apiclient: None
sqlalchemy: None
pymysql: None
psycopg2: None