Open
Labels
PPL (Piped processing language)
Description
Summary
Implement an optimizer rule equivalent to Spark SQL’s “UnwrapCastInBinaryComparison” to remove redundant CAST/SAFE_CAST on the column side of binary comparisons. This allows predicates to be pushed down as normal FILTERs instead of SCRIPTs.
Motivation / Problem
During Calcite validation, type coercion for mismatched operand types can introduce SAFE_CAST on the attribute side (e.g., a SMALLINT column compared with an INTEGER literal). The predicate is then pushed down as a SCRIPT, which prevents efficient FILTER pushdown.
Concrete example
- Work item surfaced in PR "Use Calcite's validation system for type checking & coercion" #4892.
- ClickBench q2: AdvEngineID (SMALLINT) compared to 0 (INTEGER) ends up as SAFE_CAST(AdvEngineID) <> 0 which becomes a SCRIPT in pushdown rather than a FILTER.
Desired behavior (high-level)
- When a binary comparison has CAST/SAFE_CAST on the column side and a foldable literal on the other side, unwrap the cast from the column and, if needed, cast the literal to the column’s type.
- Preserve null semantics appropriately (e.g., add isNotNull(col) where required by the comparison semantics).
- Result: comparisons like SAFE_CAST(col) <> 0 become col <> SMALLINT(0), enabling FILTER pushdown (no script).
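The in-range case above can be sketched with a toy expression model. This is only an illustration under stated assumptions: the classes `Column`, `Literal`, `SafeCast`, `Comparison` and the function `unwrap_cast` are hypothetical and are not Calcite or plugin APIs, and only the SMALLINT-to-INTEGER widening from the ClickBench example is handled. Because that widening cast cannot fail, the cast result is NULL only when the column is NULL, so no extra isNotNull guard is needed here.

```python
from dataclasses import dataclass

# Hypothetical mini expression model for illustration only;
# none of these names come from Calcite or the OpenSearch SQL plugin.
SMALLINT_MIN, SMALLINT_MAX = -32768, 32767


@dataclass(frozen=True)
class Column:
    name: str
    sql_type: str  # e.g. "SMALLINT"


@dataclass(frozen=True)
class Literal:
    value: int
    sql_type: str  # e.g. "INTEGER"


@dataclass(frozen=True)
class SafeCast:
    child: object
    target_type: str


@dataclass(frozen=True)
class Comparison:
    op: str  # one of =, <>, <, <=, >, >=
    left: object
    right: object


def unwrap_cast(pred: Comparison) -> Comparison:
    """Rewrite SAFE_CAST(smallint_col AS INTEGER) <op> int_literal into
    smallint_col <op> smallint_literal when the literal fits in SMALLINT.
    Any other shape is returned unchanged."""
    left, right = pred.left, pred.right
    if (
        isinstance(left, SafeCast)
        and isinstance(left.child, Column)
        and left.child.sql_type == "SMALLINT"
        and left.target_type == "INTEGER"
        and isinstance(right, Literal)
        and right.sql_type == "INTEGER"
        and SMALLINT_MIN <= right.value <= SMALLINT_MAX
    ):
        # Move the cast to the literal side: narrow the literal instead of
        # widening the column.
        return Comparison(pred.op, left.child, Literal(right.value, "SMALLINT"))
    return pred


# ClickBench q2 shape: SAFE_CAST(AdvEngineID) <> 0
before = Comparison(
    "<>",
    SafeCast(Column("AdvEngineID", "SMALLINT"), "INTEGER"),
    Literal(0, "INTEGER"),
)
after = unwrap_cast(before)
```

In this sketch `after` compares the bare column with a SMALLINT literal, which is the shape that can be pushed down as a FILTER. A literal outside the SMALLINT range is deliberately left untouched.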
Scope (initial)
- Operators: =, <>, <, <=, >, >=, BETWEEN, IN (scalar/tuple), and relevant IP variants where applicable.
- Only when one side is CAST/SAFE_CAST of an attribute (or a simple expression derived from it) and the other side is a literal or foldable expression.
- Do not rewrite when unwrapping would alter correctness (e.g., overflow/precision edge cases without a safe equivalent).
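One possible safe equivalent for the overflow case is to fold the comparison to a constant that still respects NULLs when the literal lies outside the column type's range. The sketch below assumes a SMALLINT column widened to INTEGER and models SQL three-valued logic with Python's `None`; the function names are hypothetical, and whether the actual rule should fold these cases or simply skip them is left open by the issue.

```python
import operator
from typing import Optional

SMALLINT_MIN, SMALLINT_MAX = -32768, 32767

OPS = {
    "=": operator.eq,
    "<>": operator.ne,
    "<": operator.lt,
    "<=": operator.le,
    ">": operator.gt,
    ">=": operator.ge,
}


def fold_out_of_range(op: str, literal: int) -> Optional[bool]:
    """Return the constant result of `col <op> literal` for any non-NULL
    SMALLINT col when the literal is out of range, or None when the literal
    is in range and no folding applies."""
    if literal > SMALLINT_MAX:
        # Every SMALLINT value is strictly below the literal.
        return op in ("<>", "<", "<=")
    if literal < SMALLINT_MIN:
        # Every SMALLINT value is strictly above the literal.
        return op in ("<>", ">", ">=")
    return None


def original(op: str, col: Optional[int], literal: int) -> Optional[bool]:
    """Three-valued evaluation of SAFE_CAST(col AS INTEGER) <op> literal.
    The widening cast never fails, so the result is NULL only for NULL col."""
    if col is None:
        return None
    return OPS[op](col, literal)


def rewritten(op: str, col: Optional[int], literal: int) -> Optional[bool]:
    """Evaluation after the hypothetical rewrite: a NULL-preserving constant
    for out-of-range literals, the plain comparison otherwise."""
    const = fold_out_of_range(op, literal)
    if const is None:
        return OPS[op](col, literal) if col is not None else None
    return None if col is None else const


# Exhaustive-style check over a few representative column values.
for op in OPS:
    for lit in (40000, -40000, 32768, -32769, 0):
        for col in (None, SMALLINT_MIN, -1, 0, SMALLINT_MAX):
            assert rewritten(op, col, lit) == original(op, col, lit)
```

The NULL-preserving constant is where the null-semantics note in the desired behavior comes in: a folded "true" is only true for non-NULL column values, so in filter position it corresponds to an isNotNull(col) check rather than an unconditional TRUE.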
References
- Spark optimizer rule: UnwrapCastInBinaryComparison
https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/UnwrapCastInBinaryComparison.scala
Backlinks
- PR: Use Calcite's validation system for type checking & coercion #4892
- Comment: Use Calcite's validation system for type checking & coercion #4892 (comment)
Acceptance criteria
- Predicates that currently generate SCRIPT pushdown solely due to CAST/SAFE_CAST on the column side are rewritten to enable FILTER pushdown.
- No regressions in correctness; add integration coverage (e.g., ClickBench q2) demonstrating the switch from SCRIPT to FILTER pushdown.
- Document the rule and any corner cases (null handling, overflow behavior) in developer docs.
Requested by
@LantaoJin
Status
Not Started