XArray does not support Latin characters in netCDF file names #9282

Open
@devos0024

What happened?

Opening an existing netCDF file whose name contains an accented character, for example "bépo.nc", raises the following error:

FileNotFoundError: [Errno 2] No such file or directory: 'E:\\temp\\bépo.nc'

What did you expect to happen?

Internally, netCDF4 converts the file path into a byte string.
The encoding used for this conversion can be set with the encoding argument of the netCDF4.Dataset constructor.
I expect to be able to pass this encoding through to netCDF4 when the file is opened with xarray.
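
To make the byte conversion concrete, here is a small standard-library sketch of how the same path string maps to different bytes depending on the encoding. It assumes, without having confirmed it in the netCDF4 sources, that the path is encoded as UTF-8 when no encoding is given and that the underlying C library then reads those bytes in the Windows ANSI code page (cp1252, per the LOCALE reported in the environment below). It only illustrates the suspected mismatch and does not call netCDF4.

```python
# Illustration only: one path string, several possible byte sequences.
path = r"E:\temp\bépo.nc"

utf8_bytes = path.encode("utf-8")      # 'é' becomes two bytes: 0xC3 0xA9
latin1_bytes = path.encode("latin-1")  # 'é' becomes one byte: 0xE9
cp1252_bytes = path.encode("cp1252")   # same single byte as Latin-1 for 'é'

# What a reader expecting cp1252 would see if handed the UTF-8 bytes.
misread = utf8_bytes.decode("cp1252")

print(utf8_bytes)
print(latin1_bytes)
print(ascii(misread))  # ascii() keeps the output printable on any console
```

If that assumption holds, the library would look for a file whose name contains two unrelated characters in place of "é", which would explain the FileNotFoundError.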

Perhaps the solution would be to pass the encoding through the backend_kwargs parameter:

dataset = xr.open_dataset(r"E:\temp\bépo.nc", mode="r", engine="netcdf4", backend_kwargs={'encoding': 'latin-1'})

The encoding would also need to be passed through in the to_netcdf() function.

Minimal Complete Verifiable Example

import os
import tempfile as tmp

import netCDF4 as nc
import xarray as xr

if __name__ == "__main__":
    with tmp.TemporaryDirectory() as temp_dir:
        # Creating a netCDF file
        tmp_folder = os.path.join(temp_dir, "bèpo")
        os.mkdir(tmp_folder)
        file_path = os.path.join(tmp_folder, "bépo.nc")
        print(f"Trying to create {file_path}")
        with nc.Dataset(file_path, mode="w", encoding="Latin-1") as ds:
            print(f"{file_path} successfully created")

        # Open with netCDF
        with nc.Dataset(file_path, mode="r", encoding="Latin-1"):
            print(f"{file_path} successfully opened with netCDF")

        # Open with xarray
        with xr.open_dataset(file_path, mode="r", engine="netcdf4") as xr_ds:
            print(f"{file_path} successfully opened")

MVCE confirmation

  • Minimal example — the example is as focused as reasonably possible to demonstrate the underlying issue in xarray.
  • Complete example — the example is self-contained, including all data and the text of any traceback.
  • Verifiable example — the example copy & pastes into an IPython prompt or Binder notebook, returning the result.
  • New issue — a search of GitHub Issues suggests this is not a duplicate.
  • Recent environment — the issue occurs with the latest version of xarray and its dependencies.

Relevant log output

E:\temp\bépo.nc successfully created
E:\temp\bépo.nc successfully opened with netCDF
Traceback (most recent call last):
  File "E:\Tools\Anaconda3\envs\myenv\lib\site-packages\xarray\backends\file_manager.py", line 211, in _acquire_with_cache_info
    file = self._cache[self._key]
  File "E:\Tools\Anaconda3\envs\myenv\lib\site-packages\xarray\backends\lru_cache.py", line 56, in __getitem__
    value = self._cache[key]
KeyError: [<class 'netCDF4._netCDF4.Dataset'>, ('E:\\temp\\bépo.nc',), 'r', (('clobber', True), ('diskless', False), ('format', 'NETCDF4'), ('persist', False)), '0916cd5f-58f9-49df-8e74-6c9b109a77cf']

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "e:\temp\test_open_dataset.py", line 19, in <module>
    with xr.open_dataset(file_path, mode="r", engine="netcdf4") as xr_ds:
  File "E:\Tools\Anaconda3\envs\myenv\lib\site-packages\xarray\backends\api.py", line 571, in open_dataset
    backend_ds = backend.open_dataset(
  File "E:\Tools\Anaconda3\envs\myenv\lib\site-packages\xarray\backends\netCDF4_.py", line 645, in open_dataset
    store = NetCDF4DataStore.open(
  File "E:\Tools\Anaconda3\envs\myenv\lib\site-packages\xarray\backends\netCDF4_.py", line 408, in open
    return cls(manager, group=group, mode=mode, lock=lock, autoclose=autoclose)
  File "E:\Tools\Anaconda3\envs\myenv\lib\site-packages\xarray\backends\netCDF4_.py", line 355, in __init__
    self.format = self.ds.data_model
  File "E:\Tools\Anaconda3\envs\myenv\lib\site-packages\xarray\backends\netCDF4_.py", line 417, in ds
    return self._acquire()
  File "E:\Tools\Anaconda3\envs\myenv\lib\site-packages\xarray\backends\netCDF4_.py", line 411, in _acquire
    with self._manager.acquire_context(needs_lock) as root:
  File "E:\Tools\Anaconda3\envs\myenv\lib\contextlib.py", line 135, in __enter__
    return next(self.gen)
  File "E:\Tools\Anaconda3\envs\myenv\lib\site-packages\xarray\backends\file_manager.py", line 199, in acquire_context
    file, cached = self._acquire_with_cache_info(needs_lock)
  File "E:\Tools\Anaconda3\envs\myenv\lib\site-packages\xarray\backends\file_manager.py", line 217, in _acquire_with_cache_info
    file = self._opener(*self._args, **kwargs)
  File "src\\netCDF4\\_netCDF4.pyx", line 2469, in netCDF4._netCDF4.Dataset.__init__
  File "src\\netCDF4\\_netCDF4.pyx", line 2028, in netCDF4._netCDF4._ensure_nc_success
FileNotFoundError: [Errno 2] No such file or directory: 'E:\\temp\\bépo.nc'

Anything else we need to know?

No response

Environment

INSTALLED VERSIONS

commit: None
python: 3.10.13 | packaged by conda-forge | (main, Oct 26 2023, 18:01:37) [MSC v.1935 64 bit (AMD64)]
python-bits: 64
OS: Windows
OS-release: 10
machine: AMD64
processor: Intel64 Family 6 Model 158 Stepping 9, GenuineIntel
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: ('fr_FR', 'cp1252')
libhdf5: 1.14.0
libnetcdf: 4.9.1

xarray: 2024.6.0
pandas: 2.2.2
numpy: 1.26.4
scipy: 1.13.1
netCDF4: 1.6.3
pydap: None
h5netcdf: None
h5py: 3.9.0
zarr: None
cftime: 1.6.4
nc_time_axis: None
iris: None
bottleneck: 1.4.0
dask: 2024.7.1
distributed: 2024.7.1
matplotlib: 3.9.1
cartopy: None
seaborn: 0.13.2
numbagg: None
fsspec: 2024.6.1
cupy: None
pint: None
sparse: None
flox: None
numpy_groupies: None
setuptools: 71.0.4
pip: 24.0
conda: 24.5.0
pytest: 8.3.2
mypy: None
IPython: None
sphinx: 7.4.7
