Optimizer: Unwrap CAST/SAFE_CAST in binary comparisons to enable filter pushdown (Spark parity) #5083

@coderabbitai

Summary
Implement an optimizer rule equivalent to Spark SQL’s “UnwrapCastInBinaryComparison” to remove redundant CAST/SAFE_CAST on the column side of binary comparisons. This allows predicates to be pushed down as normal FILTERs instead of SCRIPTs.

Motivation / Problem
During Calcite validation, mismatched types can introduce SAFE_CAST on the attribute side (e.g., SMALLINT column vs INTEGER literal). This causes SCRIPT-based pushdown and prevents efficient filter pushdown.

Concrete example

Desired behavior (high-level)

  • When a binary comparison has CAST/SAFE_CAST on the column side and a foldable literal on the other side, unwrap the cast from the column and, if needed, cast the literal to the column’s type.
  • Preserve null semantics appropriately (e.g., add isNotNull(col) where required by the comparison semantics).
  • Result: comparisons like SAFE_CAST(col) <> 0 become col <> SMALLINT(0), enabling FILTER pushdown (no script).
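The rewrite described above can be sketched on a toy expression tree. This is only an illustration under stated assumptions: the node classes (`Column`, `Literal`, `Cast`, `Comparison`), the `INT_RANGES` table, and the `unwrap_cast` function are hypothetical names invented here, not Calcite's or this project's API, and the sketch covers only widening casts between integral types with an in-range literal.

```python
from dataclasses import dataclass

# Hypothetical, simplified expression nodes (not Calcite's RexNode API).
@dataclass(frozen=True)
class Column:
    name: str
    type: str  # e.g. "SMALLINT"

@dataclass(frozen=True)
class Literal:
    value: int
    type: str

@dataclass(frozen=True)
class Cast:
    child: object
    to_type: str
    safe: bool = False  # True models SAFE_CAST

@dataclass(frozen=True)
class Comparison:
    op: str  # one of "=", "<>", "<", "<=", ">", ">="
    left: object
    right: object

# Inclusive value ranges of the integral types handled by this sketch.
INT_RANGES = {
    "TINYINT": (-2**7, 2**7 - 1),
    "SMALLINT": (-2**15, 2**15 - 1),
    "INTEGER": (-2**31, 2**31 - 1),
    "BIGINT": (-2**63, 2**63 - 1),
}

def unwrap_cast(cmp: Comparison) -> Comparison:
    """Rewrite CAST(col) <op> literal into col <op> literal-in-column-type.

    Returns the comparison unchanged whenever the rewrite is not known
    to be safe in this simplified model.
    """
    left, right = cmp.left, cmp.right
    if not (isinstance(left, Cast) and isinstance(left.child, Column)
            and isinstance(right, Literal)):
        return cmp
    col = left.child
    if col.type not in INT_RANGES or left.to_type not in INT_RANGES:
        return cmp
    col_lo, col_hi = INT_RANGES[col.type]
    to_lo, to_hi = INT_RANGES[left.to_type]
    # Only widening casts are unwrapped: every column value survives the cast.
    if not (to_lo <= col_lo and col_hi <= to_hi):
        return cmp
    # An out-of-range literal needs separate handling; leave it alone here.
    if not (col_lo <= right.value <= col_hi):
        return cmp
    # A NULL column yields NULL before and after, so no null guard is needed.
    return Comparison(cmp.op, col, Literal(right.value, col.type))
```

For example, `unwrap_cast(Comparison("<>", Cast(Column("col", "SMALLINT"), "INTEGER", safe=True), Literal(0, "INTEGER")))` returns a comparison of the bare column against a `SMALLINT` literal, which is the shape that can be pushed down as a plain filter.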

Scope (initial)

  • Operators: =, <>, <, <=, >, >=, BETWEEN, IN (scalar/tuple), and relevant IP variants where applicable.
  • Only when one side is CAST/SAFE_CAST of an attribute (or a simple expression derived from it) and the other side is a literal/foldable.
  • Do not change behavior in cases that would alter correctness (e.g., overflow/precision edge cases without a safe equivalent).
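One overflow edge case is a literal that does not fit in the column's type, such as comparing a SMALLINT column with 40000. Casting the literal down would change the result, but the comparison can still be folded to a constant that stays NULL for NULL input. The sketch below shows one possible treatment; `fold_out_of_range`, `evaluate`, and the two marker strings are hypothetical names used only for illustration, and whether the rule should fold such cases or simply skip them is a design decision this issue leaves open.

```python
# Inclusive value ranges of the integral types handled by this sketch.
INT_RANGES = {
    "TINYINT": (-2**7, 2**7 - 1),
    "SMALLINT": (-2**15, 2**15 - 1),
    "INTEGER": (-2**31, 2**31 - 1),
    "BIGINT": (-2**63, 2**63 - 1),
}

# Markers for the folded result (hypothetical names):
#   TRUE_IF_NOT_NULL  -> TRUE for a non-null column value, NULL otherwise
#   FALSE_IF_NOT_NULL -> FALSE for a non-null column value, NULL otherwise
TRUE_IF_NOT_NULL = "TRUE_IF_NOT_NULL"
FALSE_IF_NOT_NULL = "FALSE_IF_NOT_NULL"

def fold_out_of_range(op, col_type, value):
    """Fold `CAST(col) <op> value` when value lies outside col_type's range.

    Returns None when the literal is in range, meaning the plain
    unwrap-and-cast-the-literal rewrite applies instead.
    """
    lo, hi = INT_RANGES[col_type]
    if lo <= value <= hi:
        return None
    above = value > hi  # otherwise the literal is below the minimum
    if op == "=":
        return FALSE_IF_NOT_NULL
    if op == "<>":
        return TRUE_IF_NOT_NULL
    if op in ("<", "<="):
        return TRUE_IF_NOT_NULL if above else FALSE_IF_NOT_NULL
    if op in (">", ">="):
        return FALSE_IF_NOT_NULL if above else TRUE_IF_NOT_NULL
    raise ValueError(f"unsupported operator: {op}")

def evaluate(folded, col_value):
    """Three-valued evaluation of a folded marker; None models SQL NULL."""
    if col_value is None:
        return None
    return folded == TRUE_IF_NOT_NULL
```

With these definitions, `fold_out_of_range("<", "SMALLINT", 40000)` yields `TRUE_IF_NOT_NULL`, since every non-null SMALLINT value is below 40000, while a NULL column still evaluates to NULL rather than TRUE.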

Acceptance criteria

  • Predicates that currently generate SCRIPT pushdown solely due to CAST/SAFE_CAST on the column side are rewritten to enable FILTER pushdown.
  • No regressions in correctness; add integration coverage (e.g., ClickBench q2) demonstrating the switch from SCRIPT to FILTER pushdown.
  • Document the rule and any corner cases (null handling, overflow behavior) in developer docs.

Requested by
@LantaoJin

Labels

PPL (Piped Processing Language)