heavy rounding of float32 when transforming raw netcdf to dataframes #3651

Closed
@vkhodygo

MCVE Code Sample

import xarray as xr
d = xr.open_dataset('g01_xrs_1m_3s_vc_19780501_19780531.nc')
print(d['xl'])
df = d.to_dataframe()
print(df['xl'])

Expected Output

I expected the same numbers.

Problem Description

Datafile
This issue is somewhat similar to #2304.
I don't see any pattern here: some files are converted properly, whereas others are full of bogus data.
I'm grateful that I don't have to deal with csv files, however...
I understand that this is not rocket science and that some precision loss is generally fine.
Nevertheless, when you work with small numbers, this can lead to completely incorrect results. What's the point of predicting a time series when you have identical values everywhere?
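
One way to tell whether the stored values actually changed, or whether only their printed form differs, is to compare the underlying arrays instead of the printed output. The sketch below is not taken from the data file above: the three float32 values and the `time` dimension are made-up stand-ins for the `xl` variable, used only to illustrate the check.

```python
import numpy as np
import xarray as xr

# Hypothetical stand-in for the 'xl' variable: small float32 values.
# The real file is not used here; these numbers are illustrative only.
values = np.array([1.5e-7, 2.25e-7, 3.125e-7], dtype=np.float32)
ds = xr.Dataset({"xl": ("time", values)}, coords={"time": np.arange(3)})

df = ds.to_dataframe()

# Compare the stored numbers, not their printed representations.
same_values = np.array_equal(df["xl"].to_numpy(), ds["xl"].values)

print("dtype in Dataset:  ", ds["xl"].dtype)
print("dtype in DataFrame:", df["xl"].dtype)
print("values identical:  ", same_values)
```

If `same_values` is true for the real file as well, the difference is confined to how the numbers are displayed; if it is false, the conversion itself is altering the data.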

Output of xr.show_versions()

INSTALLED VERSIONS

commit: None
python: 3.8.0 (default, Oct 23 2019, 18:51:26)
[GCC 9.2.0]
python-bits: 64
OS: Linux
OS-release: 5.4.5-arch1-1
machine: x86_64
processor:
byteorder: little
LC_ALL: None
LANG: en_GB.UTF-8
LOCALE: en_GB.UTF-8
libhdf5: 1.10.5
libnetcdf: 4.7.3

xarray: 0.14.1
pandas: 0.25.3
numpy: 1.17.4
scipy: 1.3.3
netCDF4: 1.5.3
pydap: None
h5netcdf: None
h5py: 2.10.0
Nio: None
zarr: None
cftime: 1.0.4.2
nc_time_axis: None
PseudoNetCDF: None
rasterio: None
cfgrib: None
iris: None
bottleneck: 1.4.0.dev0+11.g38a0fd0
dask: None
distributed: None
matplotlib: 3.1.1
cartopy: None
seaborn: 0.9.0
numbagg: None
setuptools: 41.6.0
pip: 19.2.3
conda: None
pytest: 5.3.1
IPython: 7.10.2
sphinx: 2.2.1
