
SegFaults in the sample data tests with netCDF4=1.6.1 #1727

@valeriupredoi

Description


@ESMValGroup/technical-lead-development-team I need your (rather quick) input here please: we have seen that the new netCDF4=1.6.1 is causing frequent segfaults in our CI tests, and we have pinned it to !=1.6.1 to sweep the problem under the carpet for us. However, the good folk at Unidata/netCDF4 are scratching their heads and wondering what the heck's going on, so let's try and help them figure that out; even if we can only provide a narrowed-down picture, it's still helpful. For that, I have opened an issue upstream

(have a read through the discussion there; it's a lot of paint thrown at a white wall)

and I have managed to isolate our side of the problem to the sample data tests. Simplified, this is how the toy model looks:

import iris
import numpy as np
import pickle
import platform
import pytest

TEST_REVISION = 1

def get_cache_key(value):
    """Get a cache key that is hopefully unique enough for unpickling.

    If this doesn't avoid problems with unpickling the cached data,
    manually clean the pytest cache with the command `pytest --cache-clear`.
    """
    py_version = platform.python_version()
    return (f'{value}_iris-{iris.__version__}_'
            f'numpy-{np.__version__}_python-{py_version}_'
            f'rev-{TEST_REVISION}')


@pytest.fixture(scope="module")
def timeseries_cubes_month(request):
    """Load representative timeseries data."""
    # Cache the cubes to save about 30-60 seconds on repeat use.
    cache_key = get_cache_key("sample_data/monthly")
    data = request.config.cache.get(cache_key, None)
    if data is None:
        # The real fixture loads the sample data and populates the cache
        # here; the toy model assumes a warm cache.
        pytest.skip("pytest cache is empty, populate it first")
    # The cubes are stored in the cache as a latin1-decoded pickle string.
    cubes = pickle.loads(data.encode('latin1'))
    return cubes


# @pytest.mark.skip
def test_io_1(timeseries_cubes_month):
    cubes = timeseries_cubes_month
    _ = [c.data for c in cubes]  # this produces SegFaults


@pytest.mark.skip
def test_io_2(timeseries_cubes_month):
    cubes = timeseries_cubes_month
    loaded_cubes = []
    for i, c in enumerate(cubes):
        iris.save(c, str(i) + ".nc")
        lc = iris.load_cube(str(i) + ".nc")
        loaded_cubes.append(lc)
    _ = [c.data for c in loaded_cubes]  # this doesn't produce SegFaults

From my tests I found that test_io_1 has a tendency to segfault at that list-comprehension step (tested with -n 0 or -n 2; it doesn't really matter), whereas test_io_2 doesn't. Can we gauge anything from that without digging into the actual IO/threading internals (that's not our plot of land anyway)? Hive mind, folks! 🐝
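
One way to narrow it down further, a sketch assuming the crash comes from realizing lazy, netCDF-backed data on the unpickled cubes: realize the data before caching, so the pickle stores plain numpy arrays instead of lazy arrays that point back at the netCDF files. The helper below is hypothetical (the name and the caching call are mine, mirroring the fixture above); Cube.has_lazy_data() is standard iris API.

import pickle


def cache_realized_cubes(request, cache_key, cubes):
    """Cache cubes with realized data (hypothetical helper, untested)."""
    for cube in cubes:
        if cube.has_lazy_data():
            _ = cube.data  # touch the data so the pickle holds numpy arrays
    # Store the pickle the same way the fixture reads it back (latin1 round-trip).
    request.config.cache.set(cache_key, pickle.dumps(cubes).decode('latin1'))

If test_io_1 stops segfaulting with realized data in the cache, that would point at the lazy-loading path (reopening the files through netCDF4 at compute time) rather than at the data itself, which would also fit with test_io_2 being fine after its save/load round-trip through freshly written files.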

UPDATE as of 20-Oct-2022: #1727 (comment)
