-
Notifications
You must be signed in to change notification settings - Fork 28.9k
[SPARK-53733][SQL] Delay resolveColsLastResort
until all previous UnresolvedAlias
es are resolved
#52471
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
[SPARK-53733][SQL] Delay resolveColsLastResort
until all previous UnresolvedAlias
es are resolved
#52471
Conversation
174d1f1
to
a696d0d
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM after expanding tests a bit.
sql/core/src/test/resources/sql-tests/inputs/column-last-resort-resolution-precedence.sql
Show resolved
Hide resolved
a696d0d
to
ff85dcd
Compare
SELECT 'a' AS a, a; | ||
|
||
SELECT 'a', a FROM VALUES(1) AS t(a); | ||
SELECT 'a' AS a, a FROM VALUES(1) AS t(a); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you please use a view instead of VALUES
? The lines would get shorter.
SELECT 'a' AS a, a; | ||
|
||
SELECT 'a', a FROM VALUES(1) AS t(a); | ||
SELECT 'a' AS a, a FROM VALUES(1) AS t(a); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think arguments for CREATE FUNCTION
might also be affected - they are resolved after LCAs and before session variables
SELECT 'a', a FROM VALUES(1) AS t(a); | ||
SELECT 'a' AS a, a FROM VALUES(1) AS t(a); | ||
|
||
SELECT col1 FROM VALUES(1) WHERE EXISTS (SELECT 'col1', col1); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We have the logic for Project
, but the logic for Aggregate
is either missing or non-obvious.
We need the same tests with with GROUP BY
.
What changes were proposed in this pull request?
Delay
resolveColsLastResort
until allUnresolvedAlias
es that come before the column that is being resolved, are resolvedWhy are the changes needed?
For the follwing query:
Spark incorrectly resolves the second
a
column to the variable instead of resolving it as a lateral column alias reference to the implicit alias of literal 'a'. This is not consistent with the current intended behavior and name resolution precedence in Spark:Similarly, the fix applies to precedence of LCAs over outer references as in this query:
Does this PR introduce any user-facing change?
Yes, user now sees the correct result
How was this patch tested?
Added golden file tests for affected queries.
Was this patch authored or co-authored using generative AI tooling?
No