Description
After @shoyer mentioned earlier today that he had an example of dealing with the ND-histogram problem in xarray by using xarray.apply_ufunc and numpy_groupies, I made this notebook to try that approach for creating histograms in xarray.
The basic idea is that da.groupby_bins(bins).apply(count) essentially creates a histogram, and numpy_groupies can speed up the groupby_bins hugely.
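To make the "group by bin, then count" idea concrete, here is a minimal NumPy-only sketch. It is not the code from the notebook: the function name histogram_via_groupby is hypothetical, and np.bincount stands in for the grouped count that numpy_groupies would perform on the bin labels. It assumes values on the rightmost bin edge can be ignored (np.histogram includes them in the last bin; this sketch drops them).

```python
import numpy as np


def histogram_via_groupby(values, bin_edges):
    """Count values per bin by labelling each value with its bin, then counting per label.

    Hypothetical helper for illustration only; bins are half-open [left, right).
    """
    values = np.asarray(values).ravel()
    bin_edges = np.asarray(bin_edges)
    n_bins = len(bin_edges) - 1

    # Step 1 ("groupby_bins"): assign every value an integer bin label.
    # np.digitize returns 1..n_bins for in-range values, so shift to 0-based.
    labels = np.digitize(values, bin_edges) - 1

    # Drop values that fall outside the bin range.
    in_range = (labels >= 0) & (labels < n_bins)

    # Step 2 ("count"): count how many values carry each label.
    # This grouped count is the part a grouped-aggregation library would do.
    return np.bincount(labels[in_range], minlength=n_bins)


rng = np.random.default_rng(0)
data = rng.random((4, 1000))          # uniform samples in [0, 1)
edges = np.linspace(0.0, 1.0, 11)     # 10 equal-width bins

counts = histogram_via_groupby(data, edges)
expected, _ = np.histogram(data, bins=edges)
print(np.array_equal(counts, expected))
```

The sketch flattens the whole array into one histogram; keeping some dimensions (the ND case discussed here) would need a combined label per kept index and bin, which is omitted.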
I think it's pretty cool that it even works, but you'll see in the notebook that I don't think the performance compares favourably with xhistogram's dask.blockwise implementation (see #49), though I haven't managed to get the numba-powered groupies working yet. The dask task graphs are also not as nice.
@rabernat this is the sort of thing I had in mind originally.
@gjoseph92 you might find this interesting as an alternate solution to your blockwise one.
@dcherian and @max-sixty you might find this example interesting as I know you've been working on using numpy_groupies in pydata/xarray#4473.