NetCDF: Not a valid ID when trying to retrieve values from Dask array #2305

Closed
@edougherty32

Hi, I am attempting to pull values from an xarray dataset to accumulate rainfall at specific times over a large number of dimensions. The dataset, concat_floods_all, is as follows:

<xarray.Dataset>
Dimensions:  (south_north: 1015, west_east: 1359)
Coordinates:
    XLAT     (south_north, west_east) float32 dask.array<shape=(1015, 1359), chunksize=(1015, 1359)>
    XLONG    (south_north, west_east) float32 dask.array<shape=(1015, 1359), chunksize=(1015, 1359)>
Dimensions without coordinates: south_north, west_east

With 658 variables (all accumulated rainfall at different times over the same domain):

Data variables:
    var0     (south_north, west_east) float32 dask.array<shape=(1015, 1359), chunksize=(1015, 1359)>
    var1     (south_north, west_east) float32 dask.array<shape=(1015, 1359), chunksize=(1015, 1359)>
...
    var658   (south_north, west_east) float32 dask.array<shape=(1015, 1359), chunksize=(1015, 1359)>

The problem occurs when I sum all the variables, using the following:

sum_floods = concat_floods_all.sum(skipna = True, dim='variable').compute()

I get the following error message:

RuntimeError: NetCDF: Not a valid ID
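For reference, here is a minimal sketch of the summation step on its own, assuming the data variables are first stacked along a new dimension with `Dataset.to_array` (which names that dimension `variable` by default). The two tiny arrays are invented stand-ins for the 1015 x 1359 rainfall fields and are held in memory rather than backed by dask or netCDF files, so this only illustrates the reduction, not the error.

```python
import numpy as np
import xarray as xr

dims = ("south_north", "west_east")

# Tiny invented stand-ins for the 1015 x 1359 rainfall fields.
ds = xr.Dataset(
    {
        "var0": (dims, np.array([[1.0, 2.0], [3.0, np.nan]], dtype="float32")),
        "var1": (dims, np.array([[10.0, np.nan], [30.0, 40.0]], dtype="float32")),
    }
)

# to_array stacks the data variables along a new leading dimension,
# named "variable" by default.
stacked = ds.to_array(dim="variable")

# Sum across that dimension, treating NaN as missing data.
sum_floods = stacked.sum(dim="variable", skipna=True)

print(sum_floods.values)
```

With dask-backed variables the same expression stays lazy until `.compute()` (or `.values`) is called, which is the point at which the underlying files are actually read.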

Based on #1001, I believe this error comes from the way I open the files: I search for a large number of files, open each match, and append the selected data to a list in a for loop (I chose this method over open_mfdataset because I combine some files and delete redundant ones).

import glob
import os

import xarray as xr

var_list = []
# Find files that match the start and end year/month of each flood (fflood_start_end)
# and subset them based on flood duration.
pattern = os.path.join('/glade2/collections/rda/data/ds612.0/CTRL/**/wrf2d_d01_CTRL_' + var + '_*')

for c, item in enumerate(fflood_start_end):
    for name in glob.glob(pattern):
        if item in name:
            wrf_match_fflood = xr.open_dataset(name, chunks={'Time': 10})
            # Pull only the times of the flood; a separate name keeps the string `var` intact.
            flood_var = wrf_match_fflood[var].sel(Time=slice(date_fflood_st_dt2[c], date_fflood_end_dt2[c]))
            var_list.append(flood_var)
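The file-matching part of the loop above can be sketched in isolation with only the standard library. Everything specific here is an assumption for illustration: the helper name `match_files`, the variable name `PREC`, the file names, and the idea that each entry of `fflood_start_end` is a token such as `200101-200103` that appears in the file name. The sketch also passes `recursive=True`, which the original loop does not, so that `**` matches nested directories.

```python
import glob
import os
import tempfile


def match_files(root, var_name, period_tokens):
    """Return {token: sorted paths whose file name contains that token}."""
    pattern = os.path.join(root, "**", "wrf2d_d01_CTRL_" + var_name + "_*")
    # recursive=True lets "**" match any number of nested directories.
    paths = sorted(glob.glob(pattern, recursive=True))
    return {
        token: [p for p in paths if token in os.path.basename(p)]
        for token in period_tokens
    }


# Build a throwaway directory tree with invented, empty files.
with tempfile.TemporaryDirectory() as root:
    for year, fname in [
        ("2001", "wrf2d_d01_CTRL_PREC_200101-200103.nc"),
        ("2001", "wrf2d_d01_CTRL_PREC_200104-200106.nc"),
        ("2002", "wrf2d_d01_CTRL_PREC_200201-200203.nc"),
    ]:
        os.makedirs(os.path.join(root, year), exist_ok=True)
        open(os.path.join(root, year, fname), "w").close()

    matches = match_files(root, "PREC", ["200101-200103", "200201-200203"])
    matched_names = {
        token: [os.path.basename(p) for p in paths]
        for token, paths in matches.items()
    }

print(matched_names)
```

Globbing once and filtering the resulting list avoids rescanning the directory tree for every flood, and matching against the base name rather than the full path keeps a token from accidentally matching a directory name.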

I am wondering how to get the actual 1015x1359 array of values for sum_floods and work around this issue.

Thanks!
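One possible workaround, sketched under the assumption that the error really does come from data being read lazily after the underlying file handle is no longer valid: read each time subset into memory while its file is still open, then close the file. The helper name `load_subset`, the variable name `PREC`, the file `example.nc`, and the small synthetic dataset are all hypothetical, and writing the test file requires a netCDF backend (for example netCDF4 or scipy) to be installed.

```python
import os
import tempfile

import numpy as np
import xarray as xr


def load_subset(path, var_name, start, end):
    """Open one file, select a time window, and read it into memory."""
    # The context manager closes the file when the block exits, so the
    # selection is loaded eagerly with .load() before that happens.
    with xr.open_dataset(path) as ds:
        return ds[var_name].sel(Time=slice(start, end)).load()


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "example.nc")

    # Six daily time steps of an invented 2 x 3 rainfall field.
    times = np.arange("2001-01-01", "2001-01-07", dtype="datetime64[D]")
    data = np.ones((times.size, 2, 3), dtype="float32")
    xr.Dataset(
        {"PREC": (("Time", "south_north", "west_east"), data)},
        coords={"Time": times.astype("datetime64[ns]")},
    ).to_netcdf(path)

    subset = load_subset(path, "PREC", "2001-01-02", "2001-01-04")

# The file is closed and deleted by now, but the values are already in memory.
accumulated = subset.sum(dim="Time")
print(accumulated.values)
```

The trade-off is memory: every subset is held in RAM instead of being read on demand, so this only suits cases where the selected time windows are small enough to fit.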
