[Python] Conversion to/from numpy 2.0+ new StringDType #42018

@jorisvandenbossche

The upcoming NumPy 2.0 release introduces a new variable-length string dtype, StringDType (https://numpy.org/devdocs/release/2.0.0-notes.html#highlights).

Currently, trying to convert an array with this dtype to pyarrow raises an error:

>>> import numpy as np
>>> import pyarrow as pa
>>> arr = np.array(["some", "strings"], dtype=np.dtypes.StringDType())
>>> arr
array(['some', 'strings'], dtype=StringDType())
>>> pa.array(arr)
...
ArrowNotImplementedError: Unsupported numpy type 2056

Ideally we should support this dtype in both directions: the numpy -> arrow conversion and the arrow -> numpy conversion (although the latter should probably be opt-in, still returning object dtype by default).

Numpy provides a C API to access the individual string elements: https://numpy.org/devdocs/reference/c-api/strings.html
