Description
Hi, I am attempting to pull values from an xarray dataset to accumulate rainfall at specific times over a large number of dimensions. The dataset, concat_floods_all, is as follows:
<xarray.Dataset>
Dimensions: (south_north: 1015, west_east: 1359)
Coordinates:
XLAT (south_north, west_east) float32 dask.array<shape=(1015, 1359), chunksize=(1015, 1359)>
XLONG (south_north, west_east) float32 dask.array<shape=(1015, 1359), chunksize=(1015, 1359)>
Dimensions without coordinates: south_north, west_east
With 658 variables (all accumulated rainfall at different times over the same domain):
Data variables:
var0 (south_north, west_east) float32 dask.array<shape=(1015, 1359), chunksize=(1015, 1359)>
var1 (south_north, west_east) float32 dask.array<shape=(1015, 1359), chunksize=(1015, 1359)>
...
var658 (south_north, west_east) float32 dask.array<shape=(1015, 1359), chunksize=(1015, 1359)>
The issue is when I sum all the variables up, using the following:
sum_floods = concat_floods_all.sum(skipna = True, dim='variable').compute()
I get the following error message:
RuntimeError: NetCDF: Not a valid ID
Based on #1001, I believe this error comes from how I open the files: I search for the matching files, open each one, and append the selected data to a list in a for loop (I chose this method over open_mfdataset because I need to combine some files and drop redundant ones).
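import glob
import os
import xarray as xr

var_list = []
# Find files that match the start and end year/month of each flood (fflood_start_end)
# and subset them based on the flood duration.
# var (the variable name), fflood_start_end, date_fflood_st_dt2 and date_fflood_end_dt2 are defined earlier.
for c, item in enumerate(fflood_start_end):
    for name in glob.glob(os.path.join('/glade2/collections/rda/data/ds612.0/CTRL/**/wrf2d_d01_CTRL_' + var + '_*')):
        if item in name:
            wrf_match_fflood = xr.open_dataset(name, chunks={'Time': 10})
            # pull only times of floods; use a new name so the string in var is not overwritten
            var_sel = wrf_match_fflood[var].sel(Time=slice(date_fflood_st_dt2[c], date_fflood_end_dt2[c]))
            var_list.append(var_sel)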
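To show what the time subsetting in the loop is meant to do for each file, here is a minimal sketch on a synthetic in-memory array, so no NetCDF files are involved. The array name rain, the tiny 2x3 grid, the dates, and the final sum over Time are assumptions for illustration only, not my actual data:

```python
import numpy as np
import pandas as pd
import xarray as xr

# Synthetic daily rainfall on a tiny 2x3 grid (hypothetical stand-in for one WRF file).
times = pd.date_range("2001-06-01", periods=10, freq="D")
rain = xr.DataArray(
    np.ones((10, 2, 3), dtype="float32"),
    dims=("Time", "south_north", "west_east"),
    coords={"Time": times},
    name="rain",
)

# Hypothetical flood start and end times.
flood_start = pd.Timestamp("2001-06-03")
flood_end = pd.Timestamp("2001-06-05")

# Label-based slicing on the Time coordinate includes both endpoints.
flood_rain = rain.sel(Time=slice(flood_start, flood_end))

# Accumulate over the flood period and make sure the values are in memory.
flood_total = flood_rain.sum(dim="Time", skipna=True).load()
```

Here flood_rain keeps only the time steps between the two dates, and flood_total is a single 2D field per flood, which is the kind of array that ends up as one variable in concat_floods_all.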
I am wondering how to get the actual 1015x1359 array of values for sum_floods and work around this issue.
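To make the goal concrete, here is a minimal sketch of the reduction I am after, again on a small synthetic in-memory dataset. The toy dataset, its 4x5 grid, and the use of to_array to stack the data variables along a new 'variable' dimension are assumptions for illustration; this does not reproduce the NetCDF error itself:

```python
import numpy as np
import xarray as xr

dims = ("south_north", "west_east")
ny, nx = 4, 5  # tiny stand-in for the 1015x1359 domain

# Three hypothetical accumulated-rainfall fields, one with a missing value.
a0 = np.full((ny, nx), 1.0, dtype="float32")
a1 = np.full((ny, nx), 2.0, dtype="float32")
a1[0, 0] = np.nan
a2 = np.full((ny, nx), 3.0, dtype="float32")

toy = xr.Dataset({"var0": (dims, a0), "var1": (dims, a1), "var2": (dims, a2)})

# Stack the data variables along a new 'variable' dimension, then sum over it.
stacked = toy.to_array(dim="variable")
sum_toy = stacked.sum(dim="variable", skipna=True)

# Plain NumPy array of the summed field.
values = sum_toy.values
```

With skipna=True the missing value is ignored, so values has shape (4, 5) with 4.0 in the corner that had the NaN and 6.0 everywhere else. This is the output I would like to get for sum_floods on the full domain.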
Thanks!