Description
I've already brought this up in another thread (dask/dask#5879), but to summarize: I'm planning to make the sparse collections in this library lazy, which would give asymptotically better performance in certain situations. See the following research papers for details:
- https://dl.acm.org/doi/10.1145/3133901
- https://arxiv.org/abs/1802.10574
- https://arxiv.org/abs/1804.10112
- https://arxiv.org/abs/2001.00532
- https://arxiv.org/abs/2001.02609
And the following talks:
- https://www.youtube.com/watch?v=0OP8WjFyU-Q
- https://www.youtube.com/watch?v=sQOq3Ci4tB0
- https://www.youtube.com/watch?v=yAtG64qV2nM
These papers describe how to generate efficient kernels for a broad range of storage formats. The generated kernels can handle anything composed of element-wise operations (with broadcasting) and reductions, but not operations such as eigendecompositions (which we intend to handle via SciPy's wrappers for LAPACK et al.).
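To illustrate the class of expressions involved, here is a small example using the current eager API (the array shapes and densities are arbitrary): element-wise operations with broadcasting followed by a reduction. Today each step materializes an intermediate array; a lazy collection could instead compile the whole expression into a single fused kernel.

```python
import numpy as np
import sparse

A = sparse.random((1000, 1000), density=0.01)
B = sparse.random((1000, 1000), density=0.01)
v = np.random.rand(1000)

# Element-wise products (with broadcasting against ``v``), then a reduction.
# Eagerly this builds two temporary sparse arrays before the final sum;
# a fused kernel could produce ``result`` in one pass over the data.
result = (A * B * v).sum(axis=1)
```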
With this in mind, would it make sense to make sparse collections lazy, at the cost of an API break? The collections would have a Dask-like API, requiring `arr.compute()` to get the final result. As discussed in dask/dask#5879, they would also follow the protocol for Dask custom collections.
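For concreteness, here is a minimal sketch of how such a lazy collection could plug into Dask's custom-collection protocol, with `.compute()` provided by `dask.base.DaskMethodsMixin`. The name `LazySparseArray` and everything about its internals are hypothetical; this is only meant to show the shape of the protocol, not a proposed implementation.

```python
import operator
import uuid

import dask
import sparse
from dask.base import DaskMethodsMixin


class LazySparseArray(DaskMethodsMixin):
    """Hypothetical lazy wrapper whose operations build a Dask task graph."""

    def __init__(self, dsk, key):
        self._dsk = dsk   # task graph: {key: task or literal value}
        self._key = key   # key of the node holding this array's result

    @classmethod
    def from_array(cls, arr):
        key = f"sparse-{uuid.uuid4().hex}"
        return cls({key: arr}, key)

    # --- Dask collection protocol --------------------------------------
    def __dask_graph__(self):
        return self._dsk

    def __dask_keys__(self):
        return [self._key]

    def __dask_tokenize__(self):
        return self._key

    def __dask_postcompute__(self):
        # One key per collection, so the finalizer just unwraps the result.
        return (lambda results: results[0]), ()

    def __dask_postpersist__(self):
        return (lambda dsk, key: type(self)(dsk, key)), (self._key,)

    @staticmethod
    def __dask_optimize__(dsk, keys, **kwargs):
        # No graph optimizations in this sketch.
        return dsk

    __dask_scheduler__ = staticmethod(dask.get)  # synchronous scheduler

    # --- Deferred operations -------------------------------------------
    def __add__(self, other):
        key = f"add-{uuid.uuid4().hex}"
        dsk = {**self._dsk, **other._dsk,
               key: (sparse.elemwise, operator.add, self._key, other._key)}
        return type(self)(dsk, key)


# Usage: nothing is evaluated until ``.compute()`` is called.
x = LazySparseArray.from_array(sparse.random((100, 100), density=0.1))
y = LazySparseArray.from_array(sparse.random((100, 100), density=0.1))
z = (x + y).compute()  # materializes a ``sparse.COO``
```

In a real design the deferred expression graph would of course feed a kernel generator along the lines of the papers above rather than plain Dask tasks, but following the collection protocol keeps the user-facing API consistent with the rest of the Dask ecosystem.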
If we do this right, adding GPU support shouldn't be difficult either. But the question remains: is it worth breaking API compatibility to do this?