### What happened?
Currently `to_pandas` calls `np.asarray` (via `values`) for each column, which ignores the extension data type.
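
To illustrate why going through `np.asarray` loses the dtype, here is a minimal sketch using only pandas and NumPy, with no xarray involved. The variable names are purely illustrative, and the behaviour is what I would expect with the pandas and NumPy versions listed under Environment.

```python
import numpy as np
import pandas as pd

cat = pd.Categorical(["a", "b"])

# Converting through NumPy materializes the values as a plain ndarray,
# so the categorical (extension) dtype is dropped at this point.
as_numpy = np.asarray(cat)

# A Series built from the ndarray can no longer know it was categorical...
lossy = pd.Series(as_numpy)

# ...whereas passing the extension array directly keeps the dtype.
preserved = pd.Series(cat)

print("via np.asarray:", lossy.dtype)
print("direct:", preserved.dtype)
```

The second construction is roughly the behaviour I would hope `to_pandas` could offer for extension-array-backed data.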
### What did you expect to happen?
I would expect the return type to be a pandas extension array.
### Minimal Complete Verifiable Example
import xarray as xr
import pandas as pd
assert xr.DataArray(pd.Categorical(["a", "b"])).to_pandas().dtype == "object"
### MVCE confirmation
- [X] Minimal example — the example is as focused as reasonably possible to demonstrate the underlying issue in xarray.
- [X] Complete example — the example is self-contained, including all data and the text of any traceback.
- [X] Verifiable example — the example copy & pastes into an IPython prompt or [Binder notebook](https://mybinder.org/v2/gh/pydata/xarray/main?urlpath=lab/tree/doc/examples/blank_template.ipynb), returning the result.
- [X] New issue — a search of GitHub Issues suggests this is not a duplicate.
- [X] Recent environment — the issue occurs with the latest version of xarray and its dependencies.
### Relevant log output
_No response_
### Anything else we need to know?
My bad!
### Environment
<details>
INSTALLED VERSIONS
------------------
commit: None
python: 3.12.3 | packaged by Anaconda, Inc. | (main, May 6 2024, 14:46:42) [Clang 14.0.6 ]
python-bits: 64
OS: Darwin
OS-release: 22.6.0
machine: arm64
processor: arm
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: ('en_US', 'UTF-8')
libhdf5: 1.12.2
libnetcdf: None
xarray: 2024.9.0
pandas: 2.2.2
numpy: 2.0.2
scipy: 1.14.1
netCDF4: None
pydap: None
h5netcdf: None
h5py: 3.11.0
zarr: 2.18.3
cftime: None
nc_time_axis: None
iris: None
bottleneck: None
dask: 2024.7.1
distributed: 2024.7.1
matplotlib: 3.9.2
cartopy: None
seaborn: 0.13.2
numbagg: None
fsspec: 2024.9.0
cupy: None
pint: None
sparse: None
flox: None
numpy_groupies: 0.11.2
setuptools: 75.1.0
pip: 24.0
conda: 24.4.0
pytest: 8.3.3
mypy: None
IPython: 8.26.0
sphinx: 7.3.7
</details>