Skip to content

Conversation

@mihailotim-db
Copy link
Contributor

@mihailotim-db mihailotim-db commented Jul 1, 2025

What changes were proposed in this pull request?

Trim aliases before matching Sort/Having/Filter expressions with semantically equal expression from the Aggregate below in buildAggExprList

Why are the changes needed?

For a query like:

SELECT course, year, GROUPING(course) FROM courseSales GROUP BY CUBE(course, year) ORDER BY GROUPING(course)

Plan after ResolveReferences and before ResolveAggregateFunctions looks like:

!Sort [cast((shiftright(tempresolvedcolumn(spark_grouping_id#18L, spark_grouping_id, false), 1) & 1) as tinyint) AS grouping(course)#22 ASC NULLS FIRST], true                                                                                                                                                         
 +- Aggregate [course#19, year#20, spark_grouping_id#18L], [course#19, year#20, cast((shiftright(spark_grouping_id#18L, 1) & 1) as tinyint) AS grouping(course)#21 AS grouping(course)#15]                                                                                                                             
....

Because aggregate list has Alias(Alias(cast((shiftright(spark_grouping_id#18L, 1) & 1) as tinyint)) expression from SortOrder won't get matched as semantically equal and it will result in adding an unnecessary Project. By stripping inner aliases from aggregate list (that are going to get removed anyways in CleanupAliases) we can match SortOrder expression and resolve it as grouping(course)#15

Does this PR introduce any user-facing change?

No

How was this patch tested?

Existing tests

Was this patch authored or co-authored using generative AI tooling?

No

@github-actions github-actions bot added the SQL label Jul 1, 2025
@mihailotim-db mihailotim-db force-pushed the mihailotim-db/fix_inner_aliases_semi_structured branch 2 times, most recently from bcd4f44 to f5504ef Compare July 2, 2025 00:43
@mihailotim-db mihailotim-db changed the title fix [SPARK-52649][SQL] Trim aliases before matching Sort/Having/Filter expressions in buildAggExprList Jul 2, 2025
@mihailotim-db mihailotim-db force-pushed the mihailotim-db/fix_inner_aliases_semi_structured branch 4 times, most recently from 3b2f3b5 to b791c47 Compare July 2, 2025 05:34
@mihailotim-db mihailotim-db force-pushed the mihailotim-db/fix_inner_aliases_semi_structured branch from b791c47 to 5ac3f1d Compare July 2, 2025 06:57
@cloud-fan
Copy link
Contributor

thanks, merging to master!

@cloud-fan cloud-fan closed this in 17bd772 Jul 3, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants