Trackers
- enable trait-based partition optimization for more accurate leaf-stage direct exchange/shuffle ([multistage][feature] support RelDistribution trait planning #11976)
- extend current direct exchange/shuffle logic to the entire query plan ([multistage] partition assignment refactor #12079)
- revisit long-term distribution ([multistage][feature] revisit RelDistribution detection logic #12012)
Details
Trait-based optimization
Goals of the first enablement of trait-based optimization:
- ensure traits are propagated properly across the rel-tree
- ensure the leaf-stage optimization for direct exchange is correct --> previously it was applied blindly based on tableOptions
- ensure that no regression occurs for existing, correct direct exchange
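As a rough illustration of the first goal, the sketch below shows what "propagating a trait across the rel-tree" can mean for a projection node. It is a minimal model, not Pinot or Calcite code: `HashDistribution` and `propagate_through_project` are hypothetical names, and the rule that a dropped key column makes the distribution unknown is an assumption of this sketch.

```python
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple


@dataclass(frozen=True)
class HashDistribution:
    """Hypothetical stand-in for a hash distribution trait:
    the column indexes that rows are hashed on."""
    keys: Tuple[int, ...]


def propagate_through_project(
    input_dist: Optional[HashDistribution],
    projected_inputs: Sequence[Optional[int]],
) -> Optional[HashDistribution]:
    """Map a distribution across a projection.

    projected_inputs[i] is the input column index that output column i
    copies, or None when output column i is a computed expression.
    Returns None (distribution unknown) when the input distribution is
    unknown or when any key column is not passed through unchanged.
    """
    if input_dist is None:
        return None
    new_keys = []
    for key in input_dist.keys:
        if key not in projected_inputs:
            # A key column was dropped, so the trait cannot be preserved.
            return None
        # Re-express the key as an index into the projection's output.
        new_keys.append(list(projected_inputs).index(key))
    return HashDistribution(tuple(new_keys))
```

A leaf-stage direct exchange would then be chosen only when the propagated distribution is still known at the node that needs it, rather than being assumed from tableOptions alone.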
Extended partition optimization beyond leaf-stage
In addition to handling the RelDistribution trait, we also need to handle physical information passed in via hints, specifically:
- colocated join hint: ensure colocated join hints are enforced
  - when a colocated join hint exists but the tables are not colocated --> issue a warning or fail with an exception
  - when no colocated join hint is passed, colocation is best-effort, unless explicitly disabled via hint
- partition key and functions: data can be partitioned by the same key but not the same function
  - ensure partition/colocation compatibility is checked against both the partition function and the key name
- allow auto TableOptions
  - table partition key, size, and function can be read from tableConfigs
  - with RelDistribution and partition/colocated hints, tableOptions are no longer needed inside the query to describe partitioning; however, they also serve as a hint for "assigning worker/server based on table partition", and that behavior should be preserved
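The hint rules above can be sketched as a small decision function. This is only an illustration under stated assumptions: `PartitionInfo`, `can_colocate`, and `plan_colocated_join` are hypothetical names, the tri-state `hint` argument is one possible encoding of "hint present / explicitly disabled / absent", and treating "size" as the partition count is a guess.

```python
import warnings
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class PartitionInfo:
    """Hypothetical summary of a table's partitioning, e.g. as it
    might be read from the table config."""
    key: str
    function: str
    num_partitions: int  # assumption: "size" means partition count


def can_colocate(left: Optional[PartitionInfo], right: Optional[PartitionInfo],
                 left_join_key: str, right_join_key: str) -> bool:
    """Both sides must be partitioned on their join key, with the same
    partition function and the same number of partitions."""
    if left is None or right is None:
        return False
    return (left.key == left_join_key
            and right.key == right_join_key
            and left.function == right.function
            and left.num_partitions == right.num_partitions)


def plan_colocated_join(left: Optional[PartitionInfo],
                        right: Optional[PartitionInfo],
                        left_join_key: str, right_join_key: str,
                        hint: Optional[bool] = None,
                        strict: bool = True) -> bool:
    """Return True if the join should be planned as colocated.

    hint=True   colocated join requested: enforce it
    hint=False  colocation explicitly disabled
    hint=None   no hint: best effort
    """
    if hint is False:
        return False
    ok = can_colocate(left, right, left_join_key, right_join_key)
    if hint is True and not ok:
        message = "colocated join hint given but tables are not colocated"
        if strict:
            raise ValueError(message)
        warnings.warn(message)
    return ok
```

Comparing the function as well as the key matters because two tables partitioned on the same column with different functions will generally place the same key value in different partitions.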
Execute long-term optimization
Several goals are listed below; they are open to further discussion
- stage parallelism
  - direct exchanges are not currently tested against stage parallelism; they are all 1-to-1
  - should support per-partition fan-out
- relDistribution trait insertion before exchange rules
  - the relDistribution trait can be walked before exchange insertion, so an exchange is inserted only where the relDistribution produced by the input node differs from the relDistribution the current node expects
- best relDistribution preservation
  - partition compatibility optimization, e.g.
    - whether to partition by [a, b] combined or by [a] alone when doing a GROUP BY a, b --> the former is the trivial choice, but the latter could be beneficial if downstream stages join on [a]
  - push repartition later --> e.g. GROUP BY a, b can be done on a stage pre-partitioned by a, deferring the repartition until it is absolutely necessary
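The GROUP BY a, b example rests on a simple observation: rows that agree on (a, b) also agree on a, so data already hash-partitioned by [a] has every (a, b) group inside a single partition. A minimal sketch of "insert an exchange only when actual and expected distribution differ" follows; `satisfies_grouping` and `plan_group_by` are hypothetical names, and the sketch assumes a single consistent hash function and ignores skew and partition counts.

```python
from typing import FrozenSet, List, Optional, Tuple


def satisfies_grouping(actual_keys: Optional[FrozenSet[str]],
                       group_keys: FrozenSet[str]) -> bool:
    """True if data hash-partitioned on actual_keys can be grouped on
    group_keys without a shuffle.

    That holds when the partition keys are a non-empty subset of the
    grouping keys. None or an empty set means the distribution is unknown.
    """
    return bool(actual_keys) and actual_keys <= group_keys


def plan_group_by(actual_keys: Optional[FrozenSet[str]],
                  group_keys: FrozenSet[str]
                  ) -> Tuple[List[str], Optional[FrozenSet[str]]]:
    """Return (plan steps, distribution keys after the aggregate)."""
    if satisfies_grouping(actual_keys, group_keys):
        # Existing partitioning is good enough: keep it and skip the exchange.
        return ["AGGREGATE"], actual_keys
    # Otherwise repartition on the full grouping key (the trivial choice).
    exchange = "EXCHANGE(hash[" + ", ".join(sorted(group_keys)) + "])"
    return [exchange, "AGGREGATE"], group_keys
```

With input partitioned by {"a"}, `plan_group_by` keeps the [a] partitioning through GROUP BY a, b, so a later join on [a] could reuse it; with input partitioned by an unrelated key it falls back to an exchange on [a, b].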
See #12012 for further discussion and a detailed implementation plan