[SPARK-46741][SQL] Cache Table with CTE won't work #44767

Closed
wants to merge 14 commits into from

Conversation

AngersZhuuuu
Contributor

What changes were proposed in this pull request?

CACHE TABLE with a CTE does not work, for two reasons:

  1. In the current code, the CTE in CacheTableAsSelect is inlined.
  2. CTERelationRef and CTERelationDef do not handle the CTE id in doCanonicalize.

As a result, the cached plan cannot be matched in this case.
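
To illustrate the second reason, here is a minimal toy sketch of why CTE ids need to be normalized before two plans can compare equal. The classes `Scan`, `CteRef`, `CteDef`, `WithCte` and the object `CteIdNormalizer` are hypothetical stand-ins invented for this example; they are not Spark's Catalyst classes, and the sketch only covers a single, non-nested `WithCte`.

```scala
// Toy model only: hypothetical stand-ins, not Spark's Catalyst classes.
sealed trait Plan
case class Scan(table: String) extends Plan
case class CteRef(cteId: Long) extends Plan
case class CteDef(id: Long, child: Plan)
case class WithCte(plan: Plan, defs: Seq[CteDef]) extends Plan

object CteIdNormalizer {
  // Replace every CTE id with the position of its definition, so the result
  // does not depend on the arbitrary ids assigned when the plan was built.
  def canonicalize(w: WithCte): WithCte = {
    val defIndex: Map[Long, Int] = w.defs.map(_.id).zipWithIndex.toMap

    // Only references are rewritten; other nodes are returned unchanged.
    def rewrite(p: Plan): Plan = p match {
      case CteRef(id) => CteRef(defIndex(id).toLong)
      case other      => other
    }

    val newDefs = w.defs.zipWithIndex.map { case (d, i) =>
      CteDef(i.toLong, rewrite(d.child))
    }
    WithCte(rewrite(w.plan), newDefs)
  }
}

object Demo {
  def main(args: Array[String]): Unit = {
    // The same query planned twice, with different generated ids.
    val first  = WithCte(CteRef(7L), Seq(CteDef(7L, Scan("t"))))
    val second = WithCte(CteRef(42L), Seq(CteDef(42L, Scan("t"))))

    println(first == second) // false: the raw ids differ
    println(CteIdNormalizer.canonicalize(first) ==
      CteIdNormalizer.canonicalize(second)) // true: ids are now positional
  }
}
```

In this toy model, the two plans only become equal once the ids are rewritten to definition positions, which is the same idea as the `defIndex` lookup discussed in the review below.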

Why are the changes needed?

Fix bug

Does this PR introduce any user-facing change?

Yes. CACHE TABLE with a CTE works after this PR.

How was this patch tested?

Added unit tests.

Was this patch authored or co-authored using generative AI tooling?

No

@AngersZhuuuu
Contributor Author

ping @cloud-fan

@github-actions github-actions bot added the SQL label Jan 17, 2024
@AngersZhuuuu AngersZhuuuu changed the title [SPARK-46741][SQL] Cache Table with CET won't work [SPARK-46741][SQL] Cache Table with CTE won't work Jan 17, 2024
@AngersZhuuuu
Contributor Author

How about the current version? @cloud-fan

_.containsAnyPattern(CTE, PLAN_EXPRESSION)) {
case ref: CTERelationRef =>
ref.copy(cteId = defIndex(ref.cteId).toLong)

Contributor

can we make sure there is no nested WithCTE?

Contributor Author

> can we make sure there is no nested WithCTE?

From the CTESubstitution code and its PR (#37751), there won't be a nested WithCTE. cc @maryannxue

Contributor

@cloud-fan cloud-fan Jan 19, 2024

Shall we add an assert to guarantee this assumption?

Contributor Author

Done

Contributor Author

@cloud-fan I found a nested WithCTE case and added it to cte.sql. I also changed the method here to support nested CTEs. Please take another look.

@AngersZhuuuu
Contributor Author

ping @cloud-fan @yaooqinn

@AngersZhuuuu
Contributor Author

@HyukjinKwon
Member

What was the behaviour before? It would be great to show the result before and after.

@AngersZhuuuu
Contributor Author

> What was the behaviour before? It would be great to show the result before and after.

For the query in cache.sql:

EXPLAIN EXTENDED SELECT * FROM cache_nested_cte_table

Before this PR, the cached table cache_nested_cte_table is not matched, so the query is executed again. After this PR, it matches the InMemoryRelation.
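
A rough way to check this locally is sketched below. It assumes a Spark SQL dependency on the classpath; the table name `cached_cte_demo` and its query are hypothetical and are not the ones defined in cache.sql. Whether the printed plans contain an InMemoryRelation depends on the Spark build being run, so no expected output is shown.

```scala
import org.apache.spark.sql.SparkSession

object CacheCteCheck {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("cache-cte-check")
      .getOrCreate()

    // Hypothetical table and query: cache the result of a SELECT that uses a CTE.
    spark.sql(
      """CACHE TABLE cached_cte_demo AS
        |WITH t AS (SELECT 1 AS c1)
        |SELECT * FROM t""".stripMargin)

    // Equivalent of EXPLAIN EXTENDED: prints the logical and physical plans.
    // Check the output for InMemoryRelation to see whether the cache was matched.
    spark.sql("SELECT * FROM cached_cte_demo").explain(true)

    spark.stop()
  }
}
```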

We're closing this PR because it hasn't been updated in a while. This isn't a judgement on the merit of the PR in any way. It's just a way of keeping the PR queue manageable.
If you'd like to revive this PR, please reopen it and ask a committer to remove the Stale tag!

@github-actions github-actions bot added the Stale label Oct 12, 2024
@github-actions github-actions bot closed this Oct 13, 2024