[FEA] Properly interface with dask object -> arrow[string] conversions #14915

Open
@galipremsagar

Description

Is your feature request related to a problem? Please describe.
dask changes behavior when pandas 2.x and pyarrow are installed: some APIs convert object-dtype columns to arrow[string] to take advantage of the efficiency that Arrow strings offer over object types in pandas. This causes a lot of pytest failures in dask_cudf, which are currently avoided by setting the dask config option dataframe.convert-string to False. However, the setting only takes effect properly for dask.DataFrame objects, not for dask_cudf.DataFrame objects.
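For reference, the workaround mentioned above can be sketched roughly as follows. This is a minimal illustration, assuming only that dask is installed; the helper names `run_without_string_conversion` and `current_setting` are hypothetical and not part of dask or dask_cudf. It shows how the config option is set and read, and does not address the dask_cudf gap itself.

```python
import dask

# Config key named in this issue; False disables the object -> arrow[string] conversion.
CONVERT_STRING_KEY = "dataframe.convert-string"


def run_without_string_conversion(func, *args, **kwargs):
    """Hypothetical helper: call func with the conversion disabled."""
    # dask.config.set works as a context manager, so the override
    # only applies while func is running.
    with dask.config.set({CONVERT_STRING_KEY: False}):
        return func(*args, **kwargs)


def current_setting():
    """Hypothetical helper: read the option as dask currently sees it."""
    # default=None is returned only if the key is not configured anywhere.
    return dask.config.get(CONVERT_STRING_KEY, default=None)


# Read the option from inside the override to confirm it is in effect.
inside = run_without_string_conversion(current_setting)
print(inside)
```

Scoping the override with a context manager keeps it from leaking into unrelated code; a test suite might instead apply the same setting once at session start.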

Describe the solution you'd like
We need to update cudf and dask upstream so that dask_cudf.DataFrame gets this support as well when device objects are converted to host objects.

Metadata

Labels

Python: Affects Python cuDF API.
cudf.pandas: Issues specific to cudf.pandas
dask: Dask issue
feature request: New feature or request
