Description
What happened?
We use the grib2io backend to read GRIB2 files. Since the v2024.02.0 release, printing the dataset summary to the screen fails. I suspect the cause is #8702.
Printing a dataset fails while trying to compute nbytes.
The grib2io backend opens the file lazily, so the summary is printed for a MemoryCachedArray, which has no nbytes attribute and gives xarray no way to calculate it.
After loading the data into memory, print(ds1) works fine.
import xarray as xr

filters = {
    "productDefinitionTemplateNumber": 0,
    "typeOfFirstFixedSurface": 1,
    "shortName": "TMP",
}
ds1 = xr.open_dataset(
    "gfs_20221107/gfs.t00z.pgrb2.1p00.f012_subset",
    engine="grib2io",
    filters=filters,
)
print(ds1)
TypeError Traceback (most recent call last)
Cell In[6], line 1
----> 1 print(ds1)
File ~/anaconda3/envs/default311/lib/python3.11/site-packages/xarray/core/dataset.py:2569, in Dataset.__repr__(self)
2568 def __repr__(self) -> str:
-> 2569 return formatting.dataset_repr(self)
File ~/anaconda3/envs/default311/lib/python3.11/reprlib.py:21, in recursive_repr.<locals>.decorating_function.<locals>.wrapper(self)
19 repr_running.add(key)
20 try:
---> 21 result = user_function(self)
22 finally:
23 repr_running.discard(key)
File ~/anaconda3/envs/default311/lib/python3.11/site-packages/xarray/core/formatting.py:717, in dataset_repr(ds)
715 @recursive_repr("<recursive Dataset>")
716 def dataset_repr(ds):
--> 717 nbytes_str = render_human_readable_nbytes(ds.nbytes)
718 summary = [f"<xarray.{type(ds).__name__}> Size: {nbytes_str}"]
720 col_width = _calculate_col_width(ds.variables)
File ~/anaconda3/envs/default311/lib/python3.11/site-packages/xarray/core/dataset.py:1544, in Dataset.nbytes(self)
1536 @property
1537 def nbytes(self) -> int:
1538 """
1539 Total bytes consumed by the data arrays of all variables in this dataset.
1540
1541 If the backend array for any variable does not include ``nbytes``, estimates
1542 the total bytes for that array based on the ``size`` and ``dtype``.
1543 """
-> 1544 return sum(v.nbytes for v in self.variables.values())
File ~/anaconda3/envs/default311/lib/python3.11/site-packages/xarray/core/dataset.py:1544, in <genexpr>(.0)
1536 @property
1537 def nbytes(self) -> int:
1538 """
1539 Total bytes consumed by the data arrays of all variables in this dataset.
1540
1541 If the backend array for any variable does not include ``nbytes``, estimates
1542 the total bytes for that array based on the ``size`` and ``dtype``.
1543 """
-> 1544 return sum(v.nbytes for v in self.variables.values())
File ~/anaconda3/envs/default311/lib/python3.11/site-packages/xarray/namedarray/core.py:491, in NamedArray.nbytes(self)
489 itemsize = xp.finfo(self.dtype).bits // 8
490 else:
--> 491 raise TypeError(
492 "cannot compute the number of bytes (no array API nor nbytes / itemsize)"
493 )
495 return self.size * itemsize
TypeError: cannot compute the number of bytes (no array API nor nbytes / itemsize)
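The docstring in the traceback says that nbytes should be estimated from size and dtype when the backend array does not provide it. The following is a minimal sketch of that kind of fallback, not xarray's actual implementation: LazyArray, ITEMSIZE, and estimate_nbytes are hypothetical names introduced only for illustration, and the itemsize table stands in for what numpy.dtype(...).itemsize would report.

```python
import math
from dataclasses import dataclass

# Hypothetical itemsize table (bytes per element); real code would ask the dtype.
ITEMSIZE = {
    "float32": 4,
    "float64": 8,
    "datetime64[ns]": 8,
    "timedelta64[ns]": 8,
}


@dataclass
class LazyArray:
    """Hypothetical stand-in for a lazily loaded array: shape and dtype, no nbytes."""

    shape: tuple
    dtype: str


def estimate_nbytes(arr) -> int:
    """Use arr.nbytes when present, otherwise estimate it as size * itemsize."""
    nbytes = getattr(arr, "nbytes", None)
    if nbytes is not None:
        return nbytes
    size = math.prod(arr.shape)  # math.prod(()) == 1, so scalars count as one element
    return size * ITEMSIZE[arr.dtype]


# Shapes and dtypes taken from the dataset summary in this report.
variables = {
    "refDate": LazyArray((), "datetime64[ns]"),
    "leadTime": LazyArray((), "timedelta64[ns]"),
    "valueOfFirstFixedSurface": LazyArray((), "float64"),
    "latitude": LazyArray((181, 360), "float64"),
    "longitude": LazyArray((181, 360), "float64"),
    "validDate": LazyArray((), "datetime64[ns]"),
    "TMP": LazyArray((181, 360), "float32"),
}

print(estimate_nbytes(variables["TMP"]))  # 260640
print(sum(estimate_nbytes(v) for v in variables.values()))  # 1303232
```

These estimates agree with the sizes in the summary that prints once the data is loaded (261kB for TMP, about 1MB in total), so shape and dtype alone would be enough to build the repr without reading any data.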
You can force loading the data and then printing works:
print(ds1["TMP"].values[0][0])
253.28014
print(ds1)
<xarray.Dataset> Size: 1MB
Dimensions: (y: 181, x: 360)
Coordinates:
refDate datetime64[ns] 8B ...
leadTime timedelta64[ns] 8B ...
valueOfFirstFixedSurface float64 8B ...
latitude (y, x) float64 521kB ...
longitude (y, x) float64 521kB ...
validDate datetime64[ns] 8B ...
Dimensions without coordinates: y, x
Data variables:
TMP (y, x) float32 261kB 253.3 253.3 ... 240.2 240.2
Attributes:
engine: grib2io
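Until this is fixed, the loading workaround can be wrapped in a small helper. This is only a sketch under assumptions: repr_with_load_fallback and FakeLazyDataset are hypothetical names, the fake class merely imitates the failure so that the snippet runs without grib2io, and the helper assumes the object has a load() method that reads the data into memory and returns the object, as Dataset.load() does in xarray. I have not verified that load() avoids the error with the grib2io backend; the report above only confirms that accessing .values does.

```python
def repr_with_load_fallback(ds):
    """Return repr(ds); if that raises TypeError, load the data and retry once."""
    try:
        return repr(ds)
    except TypeError:
        # Assumption: load() pulls the lazy data into memory and returns the object.
        return repr(ds.load())


class FakeLazyDataset:
    """Hypothetical stand-in (not xarray): repr fails until load() is called."""

    def __init__(self):
        self.loaded = False

    def load(self):
        self.loaded = True
        return self

    def __repr__(self):
        if not self.loaded:
            raise TypeError("cannot compute the number of bytes")
        return "<FakeLazyDataset loaded>"


print(repr_with_load_fallback(FakeLazyDataset()))
```

Note that this defeats lazy loading for large files, so it is only a stopgap for inspecting small datasets.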
What did you expect to happen?
I expect print(ds1) to print the summary of the dataset:
<xarray.Dataset> Size: 1MB
Dimensions: (y: 181, x: 360)
Coordinates:
refDate datetime64[ns] 8B ...
leadTime timedelta64[ns] 8B ...
valueOfFirstFixedSurface float64 8B ...
latitude (y, x) float64 521kB ...
longitude (y, x) float64 521kB ...
validDate datetime64[ns] 8B ...
Dimensions without coordinates: y, x
Data variables:
TMP (y, x) float32 261kB 253.3 253.3 ... 240.2 240.2
Attributes:
engine: grib2io
Minimal Complete Verifiable Example
# Download the GRIB2 test file first from:
# https://github.com/NOAA-MDL/grib2io/blob/master/tests/data/gfs_20221107/gfs.t00z.pgrb2.1p00.f012_subset
import xarray as xr

filters = {
    "productDefinitionTemplateNumber": 0,
    "typeOfFirstFixedSurface": 1,
    "shortName": "TMP",
}
ds1 = xr.open_dataset(
    "gfs_20221107/gfs.t00z.pgrb2.1p00.f012_subset",
    engine="grib2io",
    filters=filters,
)
print(ds1)
MVCE confirmation
- Minimal example — the example is as focused as reasonably possible to demonstrate the underlying issue in xarray.
- Complete example — the example is self-contained, including all data and the text of any traceback.
- Verifiable example — the example copy & pastes into an IPython prompt or Binder notebook, returning the result.
- New issue — a search of GitHub Issues suggests this is not a duplicate.
- Recent environment — the issue occurs with the latest version of xarray and its dependencies.
Relevant log output
No response
Anything else we need to know?
No response
Environment
xarray: 2024.6.0
pandas: 2.2.1
numpy: 1.26.4
scipy: 1.12.0
netCDF4: 1.6.5
pydap: None
h5netcdf: None
h5py: None
zarr: 2.17.1
cftime: 1.6.3
nc_time_axis: None
iris: None
bottleneck: None
dask: 2024.3.1
distributed: 2024.3.1
matplotlib: 3.8.4
cartopy: 0.22.0
seaborn: None
numbagg: None
fsspec: 2024.3.1
cupy: None
pint: 0.23
sparse: None
flox: None
numpy_groupies: None
setuptools: 69.2.0
pip: 24.0
conda: 24.3.0
pytest: 8.1.1
mypy: None
IPython: 8.22.2
sphinx: 7.3.7