-
Notifications
You must be signed in to change notification settings - Fork 1.9k
Resolve UpCast expressions introduced in Delta DMLs #1998
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM! Thanks!
@@ -107,6 +107,11 @@ class DeltaSparkSessionExtension extends (SparkSessionExtensions => Unit) { | |||
extensions.injectPostHocResolutionRule { session => | |||
new PreprocessTableDelete(session.sessionState.conf) | |||
} | |||
// Resolve new UpCast expressions that might have been introduced by [[PreprocessTableUpdate]] | |||
// and [[PreprocessTableMerge]]. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@johanl-db, in DeltaMergeBuilder we explicitly call PostHocResolveUpCast. The inject Rule only runs automatically for UPDATES and not for MERGE?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
MERGE using the Scala Delta table API doesn't go through the analyzer but instead resolves the merge command manually. UPDATE (SQL + Scala/Java/Python) and MERGE (SQL) do run through the regular path and injected rules will be applied.
I'm not very familiar with why we need to resolve the MERGE command manually when using the Scala API, this may be something we can improve in the future
Follow-up from delta-io#1998 to run the rule `PostHocResolveUpCast` multiple times until all `UpCast` expressions are resolved. With the previous change, it only ran once, failing to resolve nested UpCast expressions. Added a test covering a problematic case with multiple `UpCast` expressions. GitOrigin-RevId: de56850c6013ad2b7c4a134917778069ee2f5915
Follow-up from delta-io#1998 to run the rule `PostHocResolveUpCast` multiple times until all `UpCast` expressions are resolved. With the previous change, it only ran once, failing to resolve nested UpCast expressions. Added a test covering a problematic case with multiple `UpCast` expressions. GitOrigin-RevId: de56850c6013ad2b7c4a134917778069ee2f5915
Which Delta project/connector is this regarding?
Description
#1938 changed the casting behavior in MERGE and UPDATE to follow the value of the
storeAssignmentPolicy
config instead of theansiEnabled
one, making the behavior consistent with INSERT.This change breaks MERGE and UPDATE operations that contain a cast when
storeAssignmentPolicy
is set toSTRICT
, throwing an internal error during analysis.The cause is the
UpCast
expression added byPreprocessTableMerge
andPreprocessTableUpdate
when processing assignments.UpCast
is meant to be replaced by a regular cast after performing checks by theResolveUpCast
rule that runs during the resolution phase beforePreprocessTableMerge
andPreprocessTableUpdate
introduce the expression, leaving the cast unresolved.The fix is to run the
ResolveUpCast
rule once more afterPreprocessTableMerge
andPreprocessTableUpdate
have run.How was this patch tested?
Missing tests covering cast behavior for the different values of
storeAssignmentPolicy
for UPDATE and MERGE are added, covering: