df.to_json segfaults with categorical index #10317

Closed
@sborgeson

DataFrame.to_json reliably segfaults Python when the DataFrame's index is a CategoricalIndex.

import pandas as pd
idx = pd.Categorical([1, 2, 3], categories=[1, 2, 3])
df = pd.DataFrame({'count': pd.Series([3, 2, 2], index=idx)})
# this crashes Python (2.6.x or 2.7.x, 64-bit Linux or 64-bit Windows, pandas 0.16.1)
print(df.to_json(orient='split'))
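
The reproduction passes a pd.Categorical as the index rather than building a pd.CategoricalIndex directly. As a quick sanity check, and assuming pandas wraps a Categorical used as an index into a CategoricalIndex (this is the behaviour of current releases; it was not re-verified here against 0.16.1), the index type can be inspected without calling to_json at all:

```python
import pandas as pd

idx = pd.Categorical([1, 2, 3], categories=[1, 2, 3])
df = pd.DataFrame({'count': pd.Series([3, 2, 2], index=idx)})

# Only inspect the index; to_json is deliberately not called here,
# so this snippet should not hit the crashing code path.
index_type = type(df.index).__name__
is_categorical = isinstance(df.index, pd.CategoricalIndex)
print(index_type)
```

If is_categorical is True, the frame matches the failing case described above.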

If I call it with orient='index', I get a ValueError instead:

# this raises a ValueError
print(df.to_json(orient='index'))
ValueError: Label array sizes do not match corresponding data shape

For what it's worth, my workaround, which is acceptable in my application, is to convert the index to strings:

df.index = df.index.astype(str)
print(df.to_json(orient='split'))
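
The same workaround can be wrapped in a small helper so that the original frame is left untouched. This is only a sketch: to_json_safe is a hypothetical name, not a pandas function, and it assumes that string labels are acceptable in the output, as they are in my application.

```python
import json
import pandas as pd


def to_json_safe(df, **kwargs):
    """Serialize df to JSON, replacing a CategoricalIndex with string labels first.

    Hypothetical helper, not part of pandas; kwargs are passed to DataFrame.to_json.
    """
    out = df
    if isinstance(df.index, pd.CategoricalIndex):
        # Work on a copy so the caller's index keeps its categorical dtype.
        out = df.copy()
        out.index = df.index.astype(str)
    return out.to_json(**kwargs)


idx = pd.Categorical([1, 2, 3], categories=[1, 2, 3])
df = pd.DataFrame({'count': pd.Series([3, 2, 2], index=idx)})

payload = to_json_safe(df, orient='split')
parsed = json.loads(payload)
print(parsed['index'])
```

Note that the index labels come back as strings, so a round trip through read_json will not restore the original integer categories.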

Windows config:

INSTALLED VERSIONS

commit: None
python: 2.7.7.final.0
python-bits: 64
OS: Windows
OS-release: 7
machine: AMD64
processor: Intel64 Family 6 Model 42 Stepping 7, GenuineIntel
byteorder: little
LC_ALL: None
LANG: None

pandas: 0.16.1
nose: None
Cython: 0.20.1
numpy: 1.8.1
scipy: 0.15.1
statsmodels: None
IPython: 2.1.0
sphinx: None
patsy: None
dateutil: 2.2
pytz: 2014.4
bottleneck: None
tables: 3.1.1
numexpr: 2.4
matplotlib: 1.3.1
openpyxl: 2.0.3
xlrd: None
xlwt: None
xlsxwriter: None
lxml: None
bs4: None
html5lib: None
httplib2: None
apiclient: None
sqlalchemy: None
pymysql: None
psycopg2: None

Linux config:

INSTALLED VERSIONS

commit: None
python: 2.6.8.final.0
python-bits: 64
OS: Linux
OS-release: 2.6.18-274.el5
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8

pandas: 0.16.1
nose: None
Cython: None
numpy: 1.9.2
scipy: None
statsmodels: None
IPython: None
sphinx: None
patsy: None
dateutil: 2.4.2
pytz: 2015.4
bottleneck: None
tables: 3.2.0
numexpr: 2.4.3
matplotlib: None
openpyxl: None
xlrd: None
xlwt: None
xlsxwriter: None
lxml: None
bs4: None
html5lib: None
httplib2: None
apiclient: None
sqlalchemy: None
pymysql: None
psycopg2: None

    Labels

    Bug, Categorical (Categorical Data Type), Enhancement, Error Reporting (Incorrect or improved errors from pandas), IO JSON (read_json, to_json, json_normalize)
