[SPARK-36183][SQL][FOLLOWUP] Fix push down limit 1 through Aggregate #35286

Closed · wants to merge 1 commit into master from wangyum:SPARK-36183-2

Conversation

@wangyum (Member) commented Jan 23, 2022

### What changes were proposed in this pull request?

Use `Aggregate.aggregateExpressions` instead of `Aggregate.output` when pushing down limit 1 through Aggregate; a toy sketch of the difference follows the example plans below.

For example:

```scala
spark.range(10).selectExpr("id % 5 AS a", "id % 5 AS b").write.saveAsTable("t1")
spark.sql("SELECT a, b, a AS alias FROM t1 GROUP BY a, b LIMIT 1").explain(true)
```

Before this PR:

```
== Optimized Logical Plan ==
GlobalLimit 1
+- LocalLimit 1
   +- !Project [a#227L, b#228L, alias#226L]
      +- LocalLimit 1
         +- Relation default.t1[a#227L,b#228L] parquet
```

After this PR:

```
== Optimized Logical Plan ==
GlobalLimit 1
+- LocalLimit 1
   +- Project [a#227L, b#228L, a#227L AS alias#226L]
      +- LocalLimit 1
         +- Relation default.t1[a#227L,b#228L] parquet
```
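
To see why the two attribute lists differ, here is a minimal toy model (plain Scala case classes, not Spark's Catalyst API; names mirror the real ones but the structures are simplified for illustration): `output` exposes only the attributes the aggregate produces, so `a AS alias` degenerates into a bare `alias` column, while `aggregateExpressions` keeps the full alias expression that a pushed-down `Project` can still evaluate against the relation's columns.

```scala
// Toy model, NOT Spark's Catalyst API: simplified for illustration only.

sealed trait NamedExpr { def name: String }
case class Attr(name: String) extends NamedExpr                // a bare column
case class Alias(child: Attr, name: String) extends NamedExpr  // child AS name

case class Aggregate(groupBy: Seq[Attr], aggregateExpressions: Seq[NamedExpr]) {
  // `output` keeps only the produced attribute names, so `a AS alias`
  // degenerates into a bare `alias` column that no child can supply.
  def output: Seq[Attr] = aggregateExpressions.map(e => Attr(e.name))
}

object LimitOnePushDownSketch extends App {
  val (a, b) = (Attr("a"), Attr("b"))
  val agg = Aggregate(Seq(a, b), Seq(a, b, Alias(a, "alias")))

  // Before the fix: building Project(agg.output, LocalLimit(1, child)) makes the
  // Project reference `alias`, which the child relation (columns a, b) cannot supply.
  println(agg.output)               // List(Attr(a), Attr(b), Attr(alias))

  // After the fix: Project(agg.aggregateExpressions, LocalLimit(1, child)) keeps
  // `a AS alias`, which is computable from columns a and b.
  println(agg.aggregateExpressions) // List(Attr(a), Attr(b), Alias(Attr(a),alias))
}
```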

### Why are the changes needed?

Fix a bug: before this change, the pushed-down `Project` was built from `Aggregate.output`, so it references the `alias#226L` attribute, which the child relation (columns `a`, `b`) does not produce; Spark marks such an invalid node with a `!` prefix, visible as `!Project` in the before plan.

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

Unit test.
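
For an interactive sanity check (a hypothetical spark-shell snippet, not the unit test this PR actually adds), one can assert that the optimized plan no longer contains the invalid `!Project` node shown in the before plan:

```scala
// Hypothetical spark-shell check, not the PR's actual unit test.
spark.range(10).selectExpr("id % 5 AS a", "id % 5 AS b").write.saveAsTable("t1")

val plan = spark.sql("SELECT a, b, a AS alias FROM t1 GROUP BY a, b LIMIT 1")
  .queryExecution.optimizedPlan

// Spark prefixes invalid plan nodes with "!", as in the before plan above.
assert(!plan.treeString.contains("!Project"))
```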

@github-actions bot added the SQL label Jan 23, 2022
@wangyum (Member, Author) commented Jan 24, 2022

cc @cloud-fan

@cloud-fan (Contributor) commented:

thanks, merging to master!

@cloud-fan cloud-fan closed this in 9b12571 Jan 25, 2022
@wangyum wangyum deleted the SPARK-36183-2 branch January 25, 2022 02:04
wangyum added a commit that referenced this pull request May 26, 2023
…861)

* [SPARK-36183][SQL][FOLLOWUP] Fix push down limit 1 through Aggregate


Closes #35286 from wangyum/SPARK-36183-2.

Authored-by: Yuming Wang <yumwang@ebay.com>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>

(cherry picked from commit 9b12571)