Closed
Description
Code Sample
import numpy as np
import xarray as xr

# Create 4 datasets containing sections of contiguous (x, y) data
for i, x in enumerate([1, 3]):
    for j, y in enumerate([10, 40]):
        ds = xr.Dataset({'foo': (('x', 'y'), np.ones((2, 3)))},
                        coords={'x': [x, x+1],
                                'y': [y, y+10, y+20]})
        ds.to_netcdf('ds.' + str(i) + str(j) + '.nc')

# Try to open them all in one go
ds_read = xr.open_mfdataset('ds.*.nc')
print(ds_read)
Problem description
Currently xr.open_mfdataset detects a single common dimension and concatenates Datasets along it. However, a common use case is a set of NetCDF files that share two or more dimensions and need to be concatenated along all of them (for example, collecting the output of a large-scale simulation that is parallelized over more than one dimension). For xr.open_mfdataset to be n-dimensional, it should automatically recognise and concatenate along all common dimensions.
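As a point of comparison, the desired result can be built by hand with two passes of xr.concat. This is only a sketch of a manual workaround, not a proposal for how open_mfdataset should be implemented: it assumes the tiles are already arranged in a nested list (outer level along x, inner level along y), and it uses in-memory datasets built by a hypothetical helper, make_tile, instead of reading the files written in the code sample.

```python
import numpy as np
import xarray as xr


def make_tile(x0, y0):
    """Build one 2x3 tile with the same layout as the files in the code sample."""
    return xr.Dataset(
        {"foo": (("x", "y"), np.ones((2, 3)))},
        coords={"x": [x0, x0 + 1], "y": [y0, y0 + 10, y0 + 20]},
    )


# Nested list of tiles: the outer level varies along x, the inner level along y.
tiles = [[make_tile(x0, y0) for y0 in (10, 40)] for x0 in (1, 3)]

# Join each row of tiles along y first, then join the resulting rows along x.
rows = [xr.concat(row, dim="y") for row in tiles]
combined = xr.concat(rows, dim="x")

print(combined)
```

The drawback is that the caller has to know the tile layout in advance, which is exactly the bookkeeping this issue asks open_mfdataset to do automatically.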
Expected Output
<xarray.Dataset>
Dimensions:  (x: 4, y: 6)
Coordinates:
  * x        (x) int64 1 2 3 4
  * y        (y) int64 10 20 30 40 50 60
Data variables:
    foo      (x, y) float64 dask.array<shape=(4, 6), chunksize=(2, 3)>
Current output of xr.open_mfdataset()
<xarray.Dataset>
Dimensions:  (x: 4, y: 12)
Coordinates:
  * x        (x) int64 1 2 3 4
  * y        (y) int64 10 20 30 40 50 60 10 20 30 40 50 60
Data variables:
    foo      (x, y) float64 dask.array<shape=(4, 12), chunksize=(4, 3)>
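The same shape can be reproduced in memory, which may help when investigating. The sketch below rests on an assumption about what is happening, not on a reading of the open_mfdataset internals: that all four tiles are joined along y only, in file-name order, with an outer join on x. The helper make_tile is hypothetical and mirrors the datasets from the code sample.

```python
import numpy as np
import xarray as xr


def make_tile(x0, y0):
    """Build one 2x3 tile with the same layout as the files in the code sample."""
    return xr.Dataset(
        {"foo": (("x", "y"), np.ones((2, 3)))},
        coords={"x": [x0, x0 + 1], "y": [y0, y0 + 10, y0 + 20]},
    )


# Flat list in the order the glob 'ds.*.nc' would sort the files:
# ds.00, ds.01, ds.10, ds.11
flat_tiles = [make_tile(x0, y0) for x0 in (1, 3) for y0 in (10, 40)]

# Concatenate along y only. The outer join pads each tile to the full x range
# with NaN, so the y coordinate ends up repeated instead of tiles being stacked.
flat = xr.concat(flat_tiles, dim="y", join="outer")

print(dict(flat.sizes))
print(flat["y"].values)
```

Under that assumption, half of the values in foo are NaN padding, and the y coordinate is no longer unique.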