-
Notifications
You must be signed in to change notification settings - Fork 28.6k
[SPARK-47511][SQL] Canonicalize With expressions by re-assigning IDs #45649
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[SPARK-47511][SQL] Canonicalize With expressions by re-assigning IDs #45649
Conversation
@cloud-fan @bersprockets can you help review this PR? |
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/With.scala
Outdated
Show resolved
Hide resolved
} | ||
} | ||
|
||
case class CommonExpressionId(id: Long = CommonExpressionId.newId, canonicalized: Boolean = false) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In QueryPlan
we have this
/**
* A private mutable variable to indicate whether this plan is the result of canonicalization.
* This is used solely for making sure we wouldn't execute a canonicalized plan.
* See [[canonicalized]] on how this is set.
*/
@transient private var _isCanonicalizedPlan: Boolean = false
protected def isCanonicalizedPlan: Boolean = _isCanonicalizedPlan
Shall we do the same in With
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what replacement are you suggesting? the canonicalized
parameter in CommonExpressionId
is used to distinguish between an ID that was assigned initially by curId.getAndIncrement()
or newly assigned by canonicalization. it is more of a property of the ID itself than the With
/CommonExpressionDef
/CommonExpressionRef
operators? also, if we haven't called .canonicalized
on the outermost With
, it is possible to have a With
expression that contains some canonicalized IDs but some non-canonicalized IDs
…essions/With.scala Co-authored-by: Wenchen Fan <cloud0fan@gmail.com>
thanks, merging to master! |
### What changes were proposed in this pull request? This PR changes the canonicalization of `With` expressions to re-assign IDs starting from 0. ### Why are the changes needed? This ensures that the canonicalization of `With` expressions is consistent. ### Does this PR introduce _any_ user-facing change? No. ### How was this patch tested? Added unit tests. ### Was this patch authored or co-authored using generative AI tooling? No. Closes apache#45649 from kelvinjian-db/SPARK-47511-with-canonicalization. Authored-by: Kelvin Jiang <kelvin.jiang@databricks.com> Signed-off-by: Wenchen Fan <wenchen@databricks.com>
…ITH_EXPR to be more general ### What changes were proposed in this pull request? This is a follow-up of - #45649 `With` is not only used by `NullIf`, but also `Between`. This PR renames the config `REPLACE_NULLIF_USING_WITH_EXPR` to be more general, and use it to control `Between` as well. ### Why are the changes needed? have a conf to control all the usages of `With`. ### Does this PR introduce _any_ user-facing change? no, not released yet ### How was this patch tested? existing tests ### Was this patch authored or co-authored using generative AI tooling? no Closes #45871 from cloud-fan/with-conf. Lead-authored-by: Wenchen Fan <wenchen@databricks.com> Co-authored-by: Wenchen Fan <cloud0fan@gmail.com> Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
### What changes were proposed in this pull request? This PR aims to add `configName` Scalastyle rule to prevent invalid config names. ### Why are the changes needed? To prevent repetitive mistake pattern - #45649 - #48149 - #49121 ### Does this PR introduce _any_ user-facing change? No. ### How was this patch tested? Currently, this PR will fail at Scalastyle test because the `master` branch is broken. We can merge this after the following PR. - #49897 ### Was this patch authored or co-authored using generative AI tooling? No. Closes #49900 from dongjoon-hyun/SPARK-51173. Authored-by: Dongjoon Hyun <dongjoon@apache.org> Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
### What changes were proposed in this pull request? This PR aims to add `configName` Scalastyle rule to prevent invalid config names. ### Why are the changes needed? To prevent repetitive mistake pattern - #45649 - #48149 - #49121 ### Does this PR introduce _any_ user-facing change? No. ### How was this patch tested? Currently, this PR will fail at Scalastyle test because the `master` branch is broken. We can merge this after the following PR. - #49897 ### Was this patch authored or co-authored using generative AI tooling? No. Closes #49900 from dongjoon-hyun/SPARK-51173. Authored-by: Dongjoon Hyun <dongjoon@apache.org> Signed-off-by: Dongjoon Hyun <dongjoon@apache.org> (cherry picked from commit 274dc5e) Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
This PR aims to add `configName` Scalastyle rule to prevent invalid config names. To prevent repetitive mistake pattern - #45649 - #48149 - #49121 No. Currently, this PR will fail at Scalastyle test because the `master` branch is broken. We can merge this after the following PR. - #49897 No. Closes #49900 from dongjoon-hyun/SPARK-51173. Authored-by: Dongjoon Hyun <dongjoon@apache.org> Signed-off-by: Dongjoon Hyun <dongjoon@apache.org> (cherry picked from commit 274dc5e) Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
What changes were proposed in this pull request?
This PR changes the canonicalization of
With
expressions to re-assign IDs starting from 0.Why are the changes needed?
This ensures that the canonicalization of
With
expressions is consistent.Does this PR introduce any user-facing change?
No.
How was this patch tested?
Added unit tests.
Was this patch authored or co-authored using generative AI tooling?
No.