Description
Tested on xarray 0.15.1 with sparse 0.9.1, and on xarray 0.15.2.dev50+g3820fb77 with sparse 0.9.1+26.ga9a6de2:
import numpy as np
import pandas as pd
import xarray as xr  # sparse=True below also requires the `sparse` package

# Random sparse sample data: 500 of the 100 x 100 possible (a, b) pairs
idx = pd.MultiIndex.from_product([np.arange(100), np.arange(100)], names=['a', 'b'])
s = pd.Series(np.random.RandomState(0).random(len(idx)), index=idx).sample(n=500, random_state=0)
da_dense = xr.DataArray.from_series(s, sparse=False)
da_sparse = xr.DataArray.from_series(s, sparse=True)
key = 23
# Compare dense vs. sparse sums for each way of selecting a == key
print("Total:", da_dense.sum().values, da_sparse.sum().values)
print("loc[key]:", da_dense.loc[key].sum().values, da_sparse.loc[key].sum().values)
print("[key]:", da_dense[key].sum().values, da_sparse[key].sum().values)
print(".isel(key):", da_dense.isel({'a': key}).sum().values, da_sparse.isel({'a': key}).sum().values)
print(".sel(key):", da_dense.sel({'a': key}).sum().values, da_sparse.sel({'a': key}).sum().values)
Output:
Total: 253.0721848728631 253.07218487286306
loc[key]: 3.5885153944770103 0.0
[key]: 3.5885153944770103 0.0
.isel(key): 3.5885153944770103 0.0
.sel(key): 3.5885153944770103 0.0
It does appear that the underlying sparse.COO has the correct values:
np.nansum(da_dense.data[23])
3.5885153944770103
da_sparse.data.data[da_sparse.data.coords[0] == 23].sum()
3.5885153944770103
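For anyone who wants to see what that last check is doing without installing xarray or sparse, here is a minimal NumPy-only sketch. It assumes a COO-style layout like the one read above via `.coords` and `.data`: a `(ndim, nnz)` integer array of coordinates plus a `(nnz,)` array of values. The small 4 x 5 array and the helper `row_sum_from_coo` are made up for illustration and are not part of either library.

```python
import numpy as np

# Hypothetical toy data: a 4 x 5 array where most entries are missing (NaN).
rng = np.random.RandomState(0)
dense = np.full((4, 5), np.nan)
rows = np.array([3, 0, 3, 1, 0])  # stored entries, deliberately not sorted
cols = np.array([2, 4, 0, 1, 1])
vals = rng.random_sample(len(rows))
dense[rows, cols] = vals

# COO-style storage: coords has shape (ndim, nnz), data has shape (nnz,).
coords = np.vstack([rows, cols])
data = vals


def row_sum_from_coo(coords, data, key):
    """Sum the stored values whose first coordinate equals `key`."""
    return data[coords[0] == key].sum()


# The mask-based sum should agree with the dense row sum for every row,
# including a row with no stored entries (row 2 here).
for key in range(dense.shape[0]):
    expected = np.nansum(dense[key])
    actual = row_sum_from_coo(coords, data, key)
    assert np.isclose(expected, actual), (key, expected, actual)
```

Because the boolean mask looks at every stored entry, this check gives the same answer regardless of the order in which the entries are stored. It only confirms that the stored values are right; it says nothing about how indexing on the sparse array retrieves them.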
Happy to dig deeper, but if anyone knows off the top of their head what the issue might be, that would be very welcome 🙂
One other observation: the result isn't always 0, e.g.:
# key = 44
Total: 253.0721848728631 253.07218487286306
loc[key]: 2.868736626924726 1.1489982474345166
[key]: 2.868736626924726 1.1489982474345166
.isel(key): 2.868736626924726 1.1489982474345166
.sel(key): 2.868736626924726 1.1489982474345166