Description
Two possible approaches to query optimization in Cubed have come up recently: dask-expr and egglog.
Dask-expr will support arrays soon (dask/dask-expr#446). It would be interesting to see whether its expression system can be used in Cubed, and whether there are any changes we'd need to contribute back.
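To make the idea concrete, here is a minimal sketch of what an expression system with an optimization pass looks like, written in plain Python. It is not dask-expr's or Cubed's API: the `Expr`, `Array`, `Transpose` and `Add` classes and the `optimize` function are all hypothetical names made up for illustration, and the single rewrite rule (a double transpose cancels out) stands in for the kind of rule a real optimizer would apply.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Expr:
    """Base class for nodes in a toy array expression tree (hypothetical)."""


@dataclass(frozen=True)
class Array(Expr):
    """A leaf node: a named input array."""
    name: str


@dataclass(frozen=True)
class Transpose(Expr):
    arg: Expr


@dataclass(frozen=True)
class Add(Expr):
    left: Expr
    right: Expr


def optimize(expr: Expr) -> Expr:
    """Rewrite the tree bottom-up, cancelling pairs of nested transposes."""
    if isinstance(expr, Transpose):
        inner = optimize(expr.arg)
        if isinstance(inner, Transpose):
            # Rule: transpose(transpose(a)) == a
            return inner.arg
        return Transpose(inner)
    if isinstance(expr, Add):
        return Add(optimize(expr.left), optimize(expr.right))
    # Leaves are returned unchanged.
    return expr


x = Array("x")
y = Array("y")
expr = Add(Transpose(Transpose(x)), Transpose(y))
optimized = optimize(expr)
```

Because the computation is held as a tree of expressions rather than executed eagerly, a pass like `optimize` can simplify it before any plan is built. The question for Cubed is whether an existing expression system could play this role instead of a hand-rolled one.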
egglog
egglog "is a Python package that provides bindings to the Rust library egglog, allowing you to use e-graphs in Python for optimization". Interestingly, it includes a prototype implementation of the Array API, which might make it a good candidate for providing query optimization for Cubed. This tutorial has an example of using the Array API implementation to optimize a scikit-learn function. (@saulshanabrook told us about egglog at yesterday's Pangeo Distributed Computing Working Group meeting.)