to_pandas on DataArray with extension array data type #9519

Closed
@ilan-gold

### What happened?

Currently, `to_pandas` goes through `.values`, which calls `np.asarray` on each column and so discards the extension data type.
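
The dtype loss described above can be reproduced without xarray. The following is a minimal sketch using only pandas and NumPy; the variable names are illustrative and it does not reflect xarray's internal code path, only the effect of converting an extension array with `np.asarray`.

```python
import numpy as np
import pandas as pd

cat = pd.Categorical(["a", "b"])

# Converting to a NumPy array first drops the categorical dtype,
# because NumPy has no equivalent of pandas extension dtypes.
via_numpy = pd.Series(np.asarray(cat))

# Handing the extension array to pandas directly preserves it.
direct = pd.Series(cat)

print(via_numpy.dtype)
print(direct.dtype)  # category
```

The second construction is the behavior this issue asks `to_pandas` to match.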

### What did you expect to happen?

I would expect the returned pandas object to keep the extension dtype (here `category`), i.e. to be backed by a pandas extension array.

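### Minimal Complete Verifiable Example

import xarray as xr
import pandas as pd
# This assertion passes with the versions listed under Environment:
# the categorical dtype is lost and the result has object dtype.
assert xr.DataArray(pd.Categorical(["a", "b"])).to_pandas().dtype == "object"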
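
Until `to_pandas` preserves extension dtypes, one possible workaround is to cast the result back after conversion. This is only a sketch: it assumes the original `pd.Categorical` is still available so that its categories and ordering can be reused, and the names `cat`, `series`, and `restored` are illustrative.

```python
import pandas as pd
import xarray as xr

cat = pd.Categorical(["a", "b"])
series = xr.DataArray(cat).to_pandas()

# Reuse the original categories and ordering instead of letting
# pandas re-infer them from the values that happen to be present.
dtype = pd.CategoricalDtype(categories=cat.categories, ordered=cat.ordered)
restored = series.astype(dtype)

print(restored.dtype)  # category
```

A plain `series.astype("category")` would also yield a categorical Series, but it infers the categories from the data, so unused categories and any custom ordering would be lost.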


### MVCE confirmation

- [X] Minimal example — the example is as focused as reasonably possible to demonstrate the underlying issue in xarray.
- [X] Complete example — the example is self-contained, including all data and the text of any traceback.
- [X] Verifiable example — the example copy & pastes into an IPython prompt or [Binder notebook](https://mybinder.org/v2/gh/pydata/xarray/main?urlpath=lab/tree/doc/examples/blank_template.ipynb), returning the result.
- [X] New issue — a search of GitHub Issues suggests this is not a duplicate.
- [X] Recent environment — the issue occurs with the latest version of xarray and its dependencies.

### Relevant log output

_No response_

### Anything else we need to know?

My bad!

### Environment

<details>


INSTALLED VERSIONS
------------------
commit: None
python: 3.12.3 | packaged by Anaconda, Inc. | (main, May  6 2024, 14:46:42) [Clang 14.0.6 ]
python-bits: 64
OS: Darwin
OS-release: 22.6.0
machine: arm64
processor: arm
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: ('en_US', 'UTF-8')
libhdf5: 1.12.2
libnetcdf: None

xarray: 2024.9.0
pandas: 2.2.2
numpy: 2.0.2
scipy: 1.14.1
netCDF4: None
pydap: None
h5netcdf: None
h5py: 3.11.0
zarr: 2.18.3
cftime: None
nc_time_axis: None
iris: None
bottleneck: None
dask: 2024.7.1
distributed: 2024.7.1
matplotlib: 3.9.2
cartopy: None
seaborn: 0.13.2
numbagg: None
fsspec: 2024.9.0
cupy: None
pint: None
sparse: None
flox: None
numpy_groupies: 0.11.2
setuptools: 75.1.0
pip: 24.0
conda: 24.4.0
pytest: 8.3.3
mypy: None
IPython: 8.26.0
sphinx: 7.3.7

</details>
