NetCDF: Not a valid ID when trying to retrieve values from Dask array

Hi, I am attempting to pull values from an xarray dataset to accumulate rainfall at specific times over a large number of dimensions. The dataset, **concat_floods_all**, is as follows:

```
<xarray.Dataset>
Dimensions:  (south_north: 1015, west_east: 1359)
Coordinates:
    XLAT     (south_north, west_east) float32 dask.array<shape=(1015, 1359), chunksize=(1015, 1359)>
    XLONG    (south_north, west_east) float32 dask.array<shape=(1015, 1359), chunksize=(1015, 1359)>
Dimensions without coordinates: south_north, west_east
```
It has 658 variables (all accumulated rainfall at different times over the same domain):

```
Data variables:
    var0     (south_north, west_east) float32 dask.array<shape=(1015, 1359), chunksize=(1015, 1359)>
    var1     (south_north, west_east) float32 dask.array<shape=(1015, 1359), chunksize=(1015, 1359)>
...
    var658    (south_north, west_east) float32 dask.array<shape=(1015, 1359), chunksize=(1015, 1359)>
```

The issue is when I sum all the variables up, using the following:

`sum_floods = concat_floods_all.sum(skipna = True, dim='variable').compute()`

I get the following error message:

`RuntimeError: NetCDF: Not a valid ID`

Based on https://github.com/pydata/xarray/issues/1001, I believe this error comes from how I open the files: I search for the matching files, open each one, and append the selected data to a list in a for loop. (I chose this method over `open_mfdataset` because I need to combine some files and delete redundant ones.)

```
import glob
import os
import xarray as xr

var_list = []
### Find files that match start and end year/month of flood (fflood_start_end) and subset based on flood duration

for c, item in enumerate(fflood_start_end):
    for name in glob.glob(os.path.join('/glade2/collections/rda/data/ds612.0/CTRL/**/wrf2d_d01_CTRL_' + var + '_*')):
        if item in name:
            wrf_match_fflood = xr.open_dataset(name, chunks={'Time': 10})
            # pull only times of floods; use a new name so the variable-name string `var` is not overwritten
            var_sel = wrf_match_fflood[var].sel(Time=slice(date_fflood_st_dt2[c], date_fflood_end_dt2[c]))
            var_list.append(var_sel)
```
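One workaround I could try, if the error really does come from the underlying file being closed before dask reads from it (my guess from the linked issue, not something I have confirmed), is to read each flood slice into memory while its file is still open. This is only a sketch: `load_flood_slice` and its parameter names are hypothetical, and it trades the lazy dask arrays for in-memory data, so it assumes each slice fits in RAM.

```python
import xarray as xr


def load_flood_slice(path, var_name, start, end):
    """Hypothetical helper: return one variable's flood-period slice as in-memory data.

    The file is opened in a context manager, so it is closed on exit; .load()
    reads the selected values before that happens, so the returned DataArray
    no longer depends on the file handle.
    """
    with xr.open_dataset(path) as ds:
        return ds[var_name].sel(Time=slice(start, end)).load()
```

In the loop above, this would replace the `xr.open_dataset(...)` and `.sel(...)` lines, with the returned array appended to `var_list` as before.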
I am wondering how to get the actual 1015x1359 array of values for sum_floods and work around this issue. 
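For reference, this is the result I am after, sketched on a toy in-memory dataset (the two small variables and their values are made up for illustration). As far as I understand, `variable` is not a dimension of the Dataset itself, so the sketch first stacks the data variables along a new `variable` dimension with `to_array` and then sums over it.

```python
import numpy as np
import xarray as xr

dims = ("south_north", "west_east")

# Toy stand-in for concat_floods_all: two small variables on the same grid.
toy = xr.Dataset(
    {
        "var0": (dims, np.array([[1.0, 2.0], [np.nan, 4.0]], dtype="float32")),
        "var1": (dims, np.array([[10.0, 20.0], [30.0, np.nan]], dtype="float32")),
    }
)

# Stack all data variables along a new "variable" dimension, then sum over it.
stacked = toy.to_array(dim="variable")
sum_floods = stacked.sum(dim="variable", skipna=True).compute()

# Plain NumPy array with one value per grid cell.
values = sum_floods.values
```

On my real data, `values` would be the 1015x1359 array.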

Thanks! 


NetCDF: Not a valid ID when trying to retrieve values from Dask array #2305

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Uh oh!

NetCDF: Not a valid ID when trying to retrieve values from Dask array #2305

Description

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions