Description
I am currently sad about the order of our optimizations, e.g how simplify, lower, combine similar and others are executed after another.
I am looking into different approaches how we can change the expression tree when we have reordered things as we want (e.g. currently after combine_similar). Adding the IO Fusion brought a nice performance improvement.
The step currently happens after combine_similar, e.g. also after lower, which is problematic because changing things like the number of partitions can have impact on other decision we would like to make, e.g. the merge algorithm if we can reduce to 1 partition for example or use a broadcast join if we can reduce the partitions on one side. Lower locks these algorithms In before we can touch the tree though.
I want to move lower to a later stage in our optimization approach, e.g. after combine_similar and also after we modify the expression as we see fit. There is no real downside as far as I can see, the only difference is that we can't run simplify again after lower happens, but I don't think that we have a use-case for this at the moment and I also can't see one coming up anytime soon.