[SPARK-32237][SQL][3.0] Resolve hint in CTE #29201
Test build #126399 has finished for PR 29201 at commit
retest this please
Test build #126410 has finished for PR 29201 at commit
Could you resolve conflicts, @LantaoJin?
Also, cc @HyukjinKwon for
@dongjoon-hyun, I will backport #29117. Thanks for letting me know.
Thank you for resolving conflicts, @LantaoJin.
thanks, merging to 3.0!
### What changes were proposed in this pull request?

The backport of #29062

This PR moves the `Substitution` rule before the `Hints` rule in `Analyzer` so that a hint inside a CTE works.

### Why are the changes needed?

The SQL below throws an `AnalysisException` in Spark 3.0, but it works in Spark 2.x:

```sql
WITH cte AS (SELECT /*+ REPARTITION(3) */ T.id, T.data FROM $t1 T)
SELECT cte.id, cte.data FROM cte
```

```
Failed to analyze query: org.apache.spark.sql.AnalysisException: cannot resolve '`cte.id`' given input columns: [cte.data, cte.id]; line 3 pos 7;
'Project ['cte.id, 'cte.data]
+- SubqueryAlias cte
   +- Project [id#21L, data#22]
      +- SubqueryAlias T
         +- SubqueryAlias testcat.ns1.ns2.tbl
            +- RelationV2[id#21L, data#22] testcat.ns1.ns2.tbl

'Project ['cte.id, 'cte.data]
+- SubqueryAlias cte
   +- Project [id#21L, data#22]
      +- SubqueryAlias T
         +- SubqueryAlias testcat.ns1.ns2.tbl
            +- RelationV2[id#21L, data#22] testcat.ns1.ns2.tbl
```

### Does this PR introduce _any_ user-facing change?

No

### How was this patch tested?

Add a unit test

Closes #29201 from LantaoJin/SPARK-32237_branch-3.0.

Lead-authored-by: LantaoJin <jinlantao@gmail.com>
Co-authored-by: Alan Jin <jinlantao@gmail.com>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
Test FAILed.
…g in Jenkins

### What changes were proposed in this pull request?

This PR backports #29117 to branch-3.0 as the flakiness was found in branch-3.0 too: #29201 (comment) and #29201 (comment)

This PR proposes:

- ~~Don't use `--user` in pip packaging test~~
- ~~Pull `source` out of the subshell, and place it first.~~
- Exclude user sitepackages in Python path during pip installation test

to address the flakiness of the pip packaging test in Jenkins.

~~(I think) #29116 caused this flakiness given my observation in the Jenkins log. I had to work around by specifying `--user` but it turned out that it does not properly work in old Conda on Jenkins for some reasons. Therefore, reverting this change back.~~

(I think) the installation at user site-packages affects other environments created by Conda in the old Conda version that Jenkins has. Seems it fails to isolate the environments for some reasons. So, it excludes user sitepackages in the Python path during the test.

~~In addition, #29116 also added some fallback logics of `conda (de)activate` and `source (de)activate` because Conda prefers to use `conda (de)activate` now per the official documentation and `source (de)activate` doesn't work for some reasons in certain environments (see also conda/conda#7980). The problem was that `source` loads things to the current shell so does not affect the current shell. Therefore, this PR pulls `source` out of the subshell.~~

Disclaimer: I made the analysis purely based on Jenkins machine's log in this PR. It may have a different reason I missed during my observation.

### Why are the changes needed?

To make the build and tests pass in Jenkins.

### Does this PR introduce _any_ user-facing change?

No, dev-only.

### How was this patch tested?

Jenkins tests should test it out.

Closes #29215 from HyukjinKwon/SPARK-32363-3.0.

Authored-by: HyukjinKwon <gurwls223@apache.org>
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
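For readers unfamiliar with how user site-packages can be excluded from the Python path, here is a minimal sketch. The actual change to the pip packaging test script is not shown on this page, so this only demonstrates two standard CPython mechanisms (the `-s` flag and the `PYTHONNOUSERSITE` environment variable). The helper name `user_site_disabled` is hypothetical.

```python
import os
import subprocess
import sys


def user_site_disabled(extra_args=(), extra_env=None):
    """Start a child interpreter and report whether it skips user site-packages."""
    env = dict(os.environ)
    # Drop any inherited setting so each call starts from the same baseline.
    env.pop("PYTHONNOUSERSITE", None)
    env.update(extra_env or {})
    result = subprocess.run(
        [sys.executable, *extra_args, "-c", "import sys; print(sys.flags.no_user_site)"],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip() == "1"


# Two ways to keep ~/.local site-packages off sys.path for a child process.
with_flag = user_site_disabled(extra_args=["-s"])
with_env = user_site_disabled(extra_env={"PYTHONNOUSERSITE": "1"})
```

Either mechanism keeps packages installed with `pip install --user` from leaking into an otherwise isolated environment, which is the isolation problem the commit message suspects.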
### What changes were proposed in this pull request?

The backport of #29062

This PR moves the `Substitution` rule before the `Hints` rule in `Analyzer` so that a hint inside a CTE works.

### Why are the changes needed?

A query with a hint inside a CTE throws an `AnalysisException` in Spark 3.0, but it works in Spark 2.x.

### Does this PR introduce any user-facing change?

No

### How was this patch tested?

Add a unit test
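Why the order of the two rules matters can be illustrated with a toy model in Python. This is not Spark's code: every class and function name below is hypothetical, and the model rests on an assumption not stated in this PR, namely that the hint rule cannot see a CTE definition until the definition has been inlined into the main plan.

```python
from dataclasses import dataclass
from typing import Tuple


# Toy plan nodes. The names are illustrative and are not Spark's classes.
@dataclass(frozen=True)
class Relation:
    name: str


@dataclass(frozen=True)
class CteRef:
    name: str


@dataclass(frozen=True)
class UnresolvedHint:
    name: str
    args: Tuple[int, ...]
    child: object


@dataclass(frozen=True)
class Repartition:
    partitions: int
    child: object


@dataclass(frozen=True)
class With:
    child: object
    # (name, plan) pairs. Assumption of this model: rules do not descend into these.
    definitions: Tuple[Tuple[str, object], ...]


def resolve_hints(plan):
    """Turn REPARTITION hints reachable through `child` links into Repartition nodes."""
    if isinstance(plan, UnresolvedHint):
        child = resolve_hints(plan.child)
        if plan.name == "REPARTITION":
            return Repartition(plan.args[0], child)
        return UnresolvedHint(plan.name, plan.args, child)
    if isinstance(plan, Repartition):
        return Repartition(plan.partitions, resolve_hints(plan.child))
    if isinstance(plan, With):
        # CTE definitions are deliberately left untouched here.
        return With(resolve_hints(plan.child), plan.definitions)
    return plan


def substitute_cte(plan, definitions=None):
    """Inline CTE definitions at every CteRef and drop the With node."""
    definitions = definitions or {}
    if isinstance(plan, With):
        return substitute_cte(plan.child, {**definitions, **dict(plan.definitions)})
    if isinstance(plan, CteRef):
        return definitions.get(plan.name, plan)
    if isinstance(plan, UnresolvedHint):
        return UnresolvedHint(plan.name, plan.args, substitute_cte(plan.child, definitions))
    if isinstance(plan, Repartition):
        return Repartition(plan.partitions, substitute_cte(plan.child, definitions))
    return plan


def analyze(plan, rules):
    """Apply each rule once, in order."""
    for rule in rules:
        plan = rule(plan)
    return plan


# WITH cte AS (SELECT /*+ REPARTITION(3) */ ... FROM t1) SELECT ... FROM cte
query = With(CteRef("cte"), (("cte", UnresolvedHint("REPARTITION", (3,), Relation("t1"))),))

hints_first = analyze(query, [resolve_hints, substitute_cte])
substitution_first = analyze(query, [substitute_cte, resolve_hints])
```

In this model, `hints_first` still contains an `UnresolvedHint` node after both rules have run, while `substitution_first` ends up as `Repartition(3, Relation("t1"))`. The sketch omits projections, aliases, and nested CTEs, so it only shows the ordering effect, not the exact failure in the error message.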