Description
Firstly, I think xarray is great, and for the type of physics simulations I run, n-dimensional labelled arrays are exactly what I need. But, and I may be missing something, is there a way to merge (or concatenate/update) DataArrays with different domains on the same coordinates?
For example, consider this setup:
import xarray as xr
x1 = [100]
y1 = [1, 2, 3, 4, 5]
dat1 = [[101, 102, 103, 104, 105]]
x2 = [200]
y2 = [3, 4, 5, 6] # different size and domain
dat2 = [[203, 204, 205, 206]]
da1 = xr.DataArray(dat1, dims=['x', 'y'], coords={'x': x1, 'y': y1})
da2 = xr.DataArray(dat2, dims=['x', 'y'], coords={'x': x2, 'y': y2})
I would like to aggregate such DataArrays into a new, single DataArray with nan padding such that:
>>> merge(da1, da2, align=True) # made up syntax
<xarray.DataArray (x: 2, y: 6)>
array([[ 101., 102., 103., 104., 105., nan],
[ nan, nan, 203., 204., 205., 206.]])
Coordinates:
* x (x) int64 100 200
* y (y) int64 1 2 3 4 5 6
Here is a quick function I wrote to do this, but I would be worried about the performance of 'expanding' the new data to the old data's size on every iteration (i.e. supposing that the first argument is a large DataArray that you are adding to, but which doesn't necessarily contain the dimensions already).
def xrmerge(*das, accept_new=True):
    da = das[0]
    for new_da in das[1:]:
        # Expand both to have same dimensions, padding with NaN
        da, new_da = xr.align(da, new_da, join='outer')
        # Fill NaNs one way or the other re. accept_new
        da = new_da.fillna(da) if accept_new else da.fillna(new_da)
    return da
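To make the intended behaviour concrete, here is a minimal, self-contained sketch that runs the function on the setup above and on one extra array. The array da3 is hypothetical (it is not part of my real data); it overlaps da1 at x=100, y=5 so that the effect of accept_new is visible.

```python
import numpy as np
import xarray as xr


def xrmerge(*das, accept_new=True):
    """Outer-align the arrays one by one and fill NaN gaps from the other."""
    da = das[0]
    for new_da in das[1:]:
        # Expand both to the union of their coordinates, padding with NaN
        da, new_da = xr.align(da, new_da, join="outer")
        # On overlap, keep the new values if accept_new, else the old ones
        da = new_da.fillna(da) if accept_new else da.fillna(new_da)
    return da


da1 = xr.DataArray([[101, 102, 103, 104, 105]], dims=["x", "y"],
                   coords={"x": [100], "y": [1, 2, 3, 4, 5]})
da2 = xr.DataArray([[203, 204, 205, 206]], dims=["x", "y"],
                   coords={"x": [200], "y": [3, 4, 5, 6]})
# Hypothetical extra array that overlaps da1 at (x=100, y=5)
da3 = xr.DataArray([[305, 306]], dims=["x", "y"],
                   coords={"x": [100], "y": [5, 6]})

merged = xrmerge(da1, da2)                         # disjoint x, NaN padded
prefer_new = xrmerge(da1, da3, accept_new=True)    # overlap takes 305
prefer_old = xrmerge(da1, da3, accept_new=False)   # overlap keeps 105

print(merged)
print(prefer_new.values)
print(prefer_old.values)
```

With the first two arrays there are no overlapping points, so accept_new makes no difference; it only matters for cases like da3.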
Might this be (or is this already!) possible in a simpler form in xarray? I know Datasets have merge and update methods, but I couldn't make them work as above.
I also notice there are possible plans (#417) to introduce a merge function for DataArrays.