Description
What happened?
I ran into this after updating from xarray 2025.03.0 to 2025.04.0. It seems that xarray has difficulty representing datasets created from a geopandas GeoDataFrame. Calling the .to_xarray() method on a GeoDataFrame creates a variable with the custom geopandas.array.GeometryDtype. The dataset repr accesses the .nbytes property, which raises a TypeError:
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File ".\.pixi\envs\default\Lib\site-packages\xarray\core\dataset.py", line 2291, in __repr__
    return formatting.dataset_repr(self)
           ~~~~~~~~~~~~~~~~~~~~~~~^^^^^^
  File ".\.pixi\envs\default\Lib\reprlib.py", line 21, in wrapper
    result = user_function(self)
  File ".\.pixi\envs\default\Lib\site-packages\xarray\core\formatting.py", line 731, in dataset_repr
    nbytes_str = render_human_readable_nbytes(ds.nbytes)
                                              ^^^^^^^^^
  File ".\.pixi\envs\default\Lib\site-packages\xarray\core\dataset.py", line 1222, in nbytes
    return sum(v.nbytes for v in self.variables.values())
  File ".\.pixi\envs\default\Lib\site-packages\xarray\core\dataset.py", line 1222, in <genexpr>
    return sum(v.nbytes for v in self.variables.values())
               ^^^^^^^^
  File ".\.pixi\envs\default\Lib\site-packages\xarray\namedarray\core.py", line 490, in nbytes
    raise TypeError(
        "cannot compute the number of bytes (no array API nor nbytes / itemsize)"
    )
TypeError: cannot compute the number of bytes (no array API nor nbytes / itemsize)
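The wording of the error message hints at a chain of fallbacks: use a size reported by the array itself, otherwise multiply the element count by the dtype's itemsize, otherwise give up. The following is a minimal sketch of that pattern in plain Python. It is not xarray's actual implementation; `estimate_nbytes` and the stand-in objects are hypothetical names used only for illustration.

```python
from types import SimpleNamespace


def estimate_nbytes(array):
    """Return a byte count for ``array`` or raise TypeError if none is available.

    Hypothetical helper: it follows the fallback order implied by the error
    message, not xarray's real code.
    """
    # 1. Prefer a size reported by the array itself.
    nbytes = getattr(array, "nbytes", None)
    if nbytes is not None:
        return nbytes
    # 2. Otherwise derive it from the element count and per-element size.
    itemsize = getattr(getattr(array, "dtype", None), "itemsize", None)
    size = getattr(array, "size", None)
    if itemsize is not None and size is not None:
        return size * itemsize
    # 3. Nothing to go on.
    raise TypeError("cannot compute the number of bytes (no nbytes / itemsize)")


# Stand-ins: one object reports nbytes, one only has size and dtype.itemsize,
# and one exposes neither, which is the situation the error message describes.
with_nbytes = SimpleNamespace(nbytes=24)
with_itemsize = SimpleNamespace(size=3, dtype=SimpleNamespace(itemsize=8))
opaque = SimpleNamespace(size=3, dtype=SimpleNamespace(name="geometry"))

print(estimate_nbytes(with_nbytes))
print(estimate_nbytes(with_itemsize))
try:
    estimate_nbytes(opaque)
except TypeError as err:
    print(f"TypeError: {err}")
```

Under this reading, an extension dtype without an itemsize, wrapped in an object that does not forward nbytes, would fall through to the final branch.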
What did you expect to happen?
With xarray version 2025.03.0, I get:
<xarray.Dataset> Size: 72B
Dimensions:   (index: 3)
Coordinates:
  * index     (index) int64 24B 0 1 2
Data variables:
    data      (index) float64 24B 1.0 2.0 3.0
    geometry  (index) geometry 24B <class 'xarray.core.extension_array.Pandas...
In this version the .nbytes property works with geopandas' GeometryDtype.
Minimal Complete Verifiable Example
import geopandas as gpd
from shapely.geometry import Polygon
# Generate some example data, copied from geopandas README.
p1 = Polygon([(0, 0), (1, 0), (1, 1)])
p2 = Polygon([(0, 0), (1, 0), (1, 1), (0, 1)])
p3 = Polygon([(2, 0), (3, 0), (3, 1), (2, 1)])
g = gpd.GeoSeries([p1, p2, p3])
gdf = gpd.GeoDataFrame({'data': [1.0, 2.0, 3.0], 'geometry': g})
ds = gdf.to_xarray()
print(repr(ds))
print(ds["geometry"].variable.nbytes)
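To narrow down which variables trigger the error without going through the repr, one option is to probe each variable's nbytes individually. This is only a sketch: `find_unsized` is a hypothetical helper, and it accepts any mapping of names to objects, so with the dataset above it could be called as `find_unsized(ds.variables)`. Stand-in classes are used here so the snippet runs without geopandas or xarray installed.

```python
def find_unsized(variables):
    """Return the names of entries whose ``nbytes`` lookup raises TypeError."""
    failing = []
    for name, var in variables.items():
        try:
            var.nbytes  # property access; may raise for unsupported dtypes
        except TypeError:
            failing.append(name)
    return failing


# Stand-in objects that imitate a variable with and without a usable nbytes.
class Sized:
    nbytes = 24


class Unsized:
    @property
    def nbytes(self):
        raise TypeError("cannot compute the number of bytes")


demo = {"index": Sized(), "data": Sized(), "geometry": Unsized()}
print(find_unsized(demo))
```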
MVCE confirmation
- Minimal example — the example is as focused as reasonably possible to demonstrate the underlying issue in xarray.
- Complete example — the example is self-contained, including all data and the text of any traceback.
- Verifiable example — the example copy & pastes into an IPython prompt or Binder notebook, returning the result.
- New issue — a search of GitHub Issues suggests this is not a duplicate.
- Recent environment — the issue occurs with the latest version of xarray and its dependencies.
Relevant log output
Anything else we need to know?
I'm not entirely sure whether creating Datasets from geopandas GeoDataFrames is within the scope of xarray. Judging from this #10301 (comment), I think it is. Both xarray and geopandas are commonly used in the geospatial community (another user makes the same point in #10301 (comment)).
Environment
In this env, it doesn't work:
INSTALLED VERSIONS
commit: None
python: 3.13.3 | packaged by conda-forge | (main, Apr 14 2025, 20:31:24) [MSC v.1943 64 bit (AMD64)]
python-bits: 64
OS: Windows
OS-release: 11
machine: AMD64
processor: Intel64 Family 6 Model 151 Stepping 2, GenuineIntel
byteorder: little
LC_ALL: None
LANG: None
LOCALE: ('Dutch_Netherlands', '1252')
libhdf5: None
libnetcdf: None
xarray: 2025.4.0
pandas: 2.2.3
numpy: 2.2.6
scipy: 1.15.2
netCDF4: None
pydap: None
h5netcdf: None
h5py: None
zarr: None
cftime: None
nc_time_axis: None
iris: None
bottleneck: None
dask: None
distributed: None
matplotlib: 3.10.3
cartopy: None
seaborn: None
numbagg: None
fsspec: None
cupy: None
pint: None
sparse: None
flox: None
numpy_groupies: None
setuptools: 80.8.0
pip: None
conda: None
pytest: None
mypy: None
IPython: 9.2.0
sphinx: None
In this env, it does work:
INSTALLED VERSIONS
commit: None
python: 3.13.3 | packaged by conda-forge | (main, Apr 14 2025, 20:31:24) [MSC v.1943 64 bit (AMD64)]
python-bits: 64
OS: Windows
OS-release: 11
machine: AMD64
processor: Intel64 Family 6 Model 151 Stepping 2, GenuineIntel
byteorder: little
LC_ALL: None
LANG: None
LOCALE: ('Dutch_Netherlands', '1252')
libhdf5: None
libnetcdf: None
xarray: 2025.3.0
pandas: 2.2.3
numpy: 2.2.6
scipy: 1.15.2
netCDF4: None
pydap: None
h5netcdf: None
h5py: None
zarr: None
cftime: None
nc_time_axis: None
iris: None
bottleneck: None
dask: None
distributed: None
matplotlib: 3.10.3
cartopy: None
seaborn: None
numbagg: None
fsspec: None
cupy: None
pint: None
sparse: None
flox: None
numpy_groupies: None
setuptools: 80.8.0
pip: None
conda: None
pytest: None
mypy: None
IPython: 9.2.0
sphinx: None