BUG: Instantiating pandas._libs.parsers.TextReader segfaults #53131

Open
@JelleZijlstra

Description

Pandas version checks

  • I have checked that this issue has not already been reported.

  • I have confirmed this bug exists on the latest version of pandas.

  • I have confirmed this bug exists on the main branch of pandas.

Reproducible Example

import pandas
pandas._libs.parsers.TextReader()

Issue Description

Instantiating the internal pandas._libs.parsers.TextReader class with no arguments causes a segfault. I found this while trying to instantiate every class present in my program (which may not have been a great idea). It reproduces on both 1.1.5 and the latest release (2.0.1).

Here is a debugger trace on my Mac:

(lldb) target create "/Users/jelle/py/venvs/py311/bin/python"
Current executable set to '/Users/jelle/py/venvs/py311/bin/python' (arm64).
(lldb) r
Process 48453 launched: '/Users/jelle/py/venvs/py311/bin/python' (arm64)
Python 3.11.1 (main, Dec 21 2022, 16:19:04) [Clang 14.0.0 (clang-1400.0.29.202)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import pandas
>>> pandas._libs.parsers.TextReader()
Process 48453 stopped
* thread #1, queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS (code=1, address=0x130)
    frame #0: 0x0000000149d1235c parsers.cpython-311-darwin.so`parser_cleanup + 16
parsers.cpython-311-darwin.so`parser_cleanup:
->  0x149d1235c <+16>: ldr    x0, [x0, #0x130]
    0x149d12360 <+20>: cbz    x0, 0x149d1236c           ; <+32>
    0x149d12364 <+24>: bl     0x149d15d5c               ; symbol stub for: free
    0x149d12368 <+28>: str    xzr, [x19, #0x130]
Target 0: (python) stopped.
(lldb) bt
* thread #1, queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS (code=1, address=0x130)
  * frame #0: 0x0000000149d1235c parsers.cpython-311-darwin.so`parser_cleanup + 16
    frame #1: 0x0000000149cf0134 parsers.cpython-311-darwin.so`__pyx_tp_dealloc_6pandas_5_libs_7parsers_TextReader + 84
    frame #2: 0x0000000149cf1a0c parsers.cpython-311-darwin.so`__pyx_tp_new_6pandas_5_libs_7parsers_TextReader + 4164
    frame #3: 0x00000001000b3550 python`type_call + 84
    frame #4: 0x000000010004c834 python`_PyObject_MakeTpCall + 344
    frame #5: 0x000000010013cf1c python`_PyEval_EvalFrameDefault + 34740
    frame #6: 0x0000000100134598 python`PyEval_EvalCode + 272
    frame #7: 0x000000010018c5a8 python`PyRun_InteractiveOneObjectEx + 744
    frame #8: 0x000000010018bad0 python`_PyRun_InteractiveLoopObject + 152
    frame #9: 0x000000010018b97c python`_PyRun_AnyFileObject + 96
    frame #10: 0x000000010018c234 python`PyRun_AnyFileExFlags + 68
    frame #11: 0x00000001001ac1dc python`Py_RunMain + 2196
    frame #12: 0x00000001001ac424 python`pymain_main + 324
    frame #13: 0x00000001001ac4c4 python`Py_BytesMain + 40
    frame #14: 0x000000010057508c dyld`start + 520
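
One way to read the trace, offered as an interpretation rather than a confirmed diagnosis: the faulting instruction loads from `[x0, #0x130]` and the reported bad address is `0x130`. Assuming `x0` still holds the first argument of `parser_cleanup` at that point, this would be consistent with the function being handed a null pointer from the dealloc path. The arithmetic is simply:

```python
# Values copied from the lldb output above.
fault_address = 0x130  # EXC_BAD_ACCESS (code=1, address=0x130)
load_offset = 0x130    # ldr x0, [x0, #0x130]

# Base register value implied by the faulting load.
implied_base = fault_address - load_offset
print(hex(implied_base))  # 0x0
```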

Expected Behavior

I'd be fine with this throwing an error, but it shouldn't segfault.
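
A minimal sketch of how the difference between "raises an error" and "segfaults" could be checked without crashing the calling process. The helper names `run_isolated` and `died_from_signal` are hypothetical, not part of pandas, and the signal check relies on the POSIX convention that `subprocess` reports a negative return code when the child is killed by a signal.

```python
import subprocess
import sys


def run_isolated(snippet: str) -> subprocess.CompletedProcess:
    """Run a Python snippet in a fresh interpreter so a crash cannot take down the caller."""
    return subprocess.run(
        [sys.executable, "-c", snippet],
        capture_output=True,
        text=True,
    )


def died_from_signal(result: subprocess.CompletedProcess) -> bool:
    """On POSIX, a negative return code means the child was killed by a signal."""
    return result.returncode < 0


if __name__ == "__main__":
    result = run_isolated("import pandas; pandas._libs.parsers.TextReader()")
    # An uncaught Python exception exits with a positive code;
    # a segfault shows up as death by signal instead.
    print("return code:", result.returncode)
    print("killed by signal:", died_from_signal(result))
```

With the expected behavior, the child would exit with an ordinary error and a traceback on stderr, and `died_from_signal` would return `False`.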

Installed Versions

>>> pandas.show_versions()
/Users/jelle/py/venvs/py311/lib/python3.11/site-packages/_distutils_hack/__init__.py:33: UserWarning: Setuptools is replacing distutils.
  warnings.warn("Setuptools is replacing distutils.")

INSTALLED VERSIONS
------------------
commit           : 37ea63d540fd27274cad6585082c91b1283f963d
python           : 3.11.1.final.0
python-bits      : 64
OS               : Darwin
OS-release       : 21.6.0
Version          : Darwin Kernel Version 21.6.0: Sun Nov  6 23:31:13 PST 2022; root:xnu-8020.240.14~1/RELEASE_ARM64_T6000
machine          : arm64
processor        : arm
byteorder        : little
LC_ALL           : None
LANG             : en_US.UTF-8
LOCALE           : en_US.UTF-8

pandas           : 2.0.1
numpy            : 1.24.2
pytz             : 2023.3
dateutil         : 2.8.2
setuptools       : 67.7.2
pip              : 23.0.1
Cython           : 0.29.32
pytest           : 7.2.1
hypothesis       : 6.66.0
sphinx           : 6.1.3
blosc            : None
feather          : None
xlsxwriter       : None
lxml.etree       : 4.9.2
html5lib         : 1.1
pymysql          : 1.0.2
psycopg2         : None
jinja2           : 3.1.2
IPython          : 8.7.0
pandas_datareader: None
bs4              : 4.11.2
bottleneck       : None
brotli           : None
fastparquet      : None
fsspec           : None
gcsfs            : None
matplotlib       : None
numba            : None
numexpr          : None
odfpy            : None
openpyxl         : 3.1.2
pandas_gbq       : None
pyarrow          : None
pyreadstat       : None
pyxlsb           : None
s3fs             : None
scipy            : 1.10.1
snappy           : None
sqlalchemy       : 1.4.47
tables           : None
tabulate         : 0.9.0
xarray           : None
xlrd             : None
zstandard        : 0.20.0
tzdata           : 2023.3
qtpy             : None
pyqt5            : None

Labels: Bug; IO CSV (read_csv, to_csv); Internals (Related to non-user accessible pandas implementation)
