[SPARK-32577][SQL][TEST] Fix the config value for shuffled hash join in test in-joins.sql #33236

Closed
wants to merge 1 commit into from

Conversation

Contributor

@c21 c21 commented Jul 6, 2021

What changes were proposed in this pull request?

We found in https://issues.apache.org/jira/browse/SPARK-32577 that in-joins.sql does not test shuffled hash join properly, but we didn't find a good way to fix it at the time. Now that #33182 has added a test config to enforce shuffled hash join, we can fix the test here as well.

Why are the changes needed?

Fix the test so that it has better coverage of shuffled hash join.

Does this PR introduce any user-facing change?

No.

How was this patch tested?

Reran the test to compare the output, and verified the query plan manually to make sure shuffled hash join is being used.
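
For reference, the manual plan check described above could be scripted roughly as follows. This is only a sketch and not part of this PR: the helper names and the sample plan text are hypothetical, and it assumes the plan is available as plain EXPLAIN-style text in which join operators appear under their usual names (`ShuffledHashJoin`, `SortMergeJoin`, and so on).

```python
import re

# Join operator names as they usually appear in Spark physical plan text.
JOIN_OPERATORS = (
    "ShuffledHashJoin",
    "SortMergeJoin",
    "BroadcastHashJoin",
    "BroadcastNestedLoopJoin",
)

_JOIN_PATTERN = re.compile(r"\b(" + "|".join(JOIN_OPERATORS) + r")\b")


def join_operators_in_plan(plan_text):
    """Return the join operator names found in plan_text, in order of appearance."""
    return _JOIN_PATTERN.findall(plan_text)


def uses_only_shuffled_hash_join(plan_text):
    """True if the plan has at least one join and every join is a ShuffledHashJoin."""
    ops = join_operators_in_plan(plan_text)
    return bool(ops) and all(op == "ShuffledHashJoin" for op in ops)


# Hypothetical, abbreviated plan text used only for illustration.
SAMPLE_PLAN = """\
== Physical Plan ==
*(3) ShuffledHashJoin [t1a#1], [t2a#2], Inner, BuildRight
:- Exchange hashpartitioning(t1a#1, 200)
+- Exchange hashpartitioning(t2a#2, 200)
"""

if __name__ == "__main__":
    print(join_operators_in_plan(SAMPLE_PLAN))
    print(uses_only_shuffled_hash_join(SAMPLE_PLAN))
```

A plan that contains no join at all is treated as a failure on purpose, so a query that was optimized down to no join does not count as covering shuffled hash join.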

@github-actions github-actions bot added the SQL label Jul 6, 2021
@c21
Contributor Author

c21 commented Jul 6, 2021

cc @cloud-fan, thanks.

@SparkQA

SparkQA commented Jul 6, 2021

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/45231/

@SparkQA

SparkQA commented Jul 7, 2021

Kubernetes integration test status success
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/45231/

@SparkQA

SparkQA commented Jul 7, 2021

Test build #140720 has finished for PR 33236 at commit 1ceff15.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@cloud-fan
Contributor

Are there more places that need to be fixed?

@c21
Contributor Author

c21 commented Jul 7, 2021

It doesn't seem to be searchable in the repo, so let me try to find all the places tomorrow. Thanks.

@HyukjinKwon
Member

Let me just merge it in then.

HyukjinKwon pushed a commit that referenced this pull request Jul 7, 2021
…in test in-joins.sql

Closes #33236 from c21/join-test.

Authored-by: Cheng Su <chengsu@fb.com>
Signed-off-by: Hyukjin Kwon <gurwls223@apache.org>
(cherry picked from commit f3c1159)
Signed-off-by: Hyukjin Kwon <gurwls223@apache.org>
@HyukjinKwon
Member

Merged to master and branch-3.2.

@c21 c21 deleted the join-test branch July 7, 2021 22:13
@c21
Contributor Author

c21 commented Jul 7, 2021

Thank you @HyukjinKwon and @cloud-fan for the review! I will submit another PR for the other changes.

HyukjinKwon pushed a commit that referenced this pull request Jul 8, 2021
…ash join for all other test queries

### What changes were proposed in this pull request?

This is a follow-up to #33236 (comment) that fixes the config value for shuffled hash join in all other test queries. All affected configs were found by searching https://github.com/apache/spark/search?q=spark.sql.join.preferSortMergeJoin.
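
The same search can be approximated on a local checkout. The sketch below is not part of this change: the function names, the default file suffixes, and the idea of scanning from a checkout root are all assumptions made for illustration. Only the config key `spark.sql.join.preferSortMergeJoin` comes from the search above.

```python
from pathlib import Path

# Config key taken from the GitHub search used in this follow-up.
CONFIG_KEY = "spark.sql.join.preferSortMergeJoin"


def lines_mentioning(text, key=CONFIG_KEY):
    """Return (line_number, stripped_line) pairs for each line of text containing key."""
    return [
        (number, line.strip())
        for number, line in enumerate(text.splitlines(), start=1)
        if key in line
    ]


def scan_tree(root, key=CONFIG_KEY, suffixes=(".sql", ".scala")):
    """Walk root and map each matching file path to the lines that mention key.

    The suffix filter is an assumption; widen it if configs live in other file types.
    """
    hits = {}
    for path in sorted(Path(root).rglob("*")):
        if path.is_file() and path.suffix in suffixes:
            text = path.read_text(encoding="utf-8", errors="replace")
            found = lines_mentioning(text, key)
            if found:
                hits[str(path)] = found
    return hits


if __name__ == "__main__":
    # Assumes the script is run from the root of a local Spark checkout.
    for file_name, matches in scan_tree(".").items():
        for number, line in matches:
            print(f"{file_name}:{number}: {line}")
```

Each reported line is a candidate to review by hand, since a textual match alone does not say whether the surrounding test actually exercises shuffled hash join.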

### Why are the changes needed?

Fix test to have better test coverage.

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

Existing tests.

Closes #33249 from c21/join-test.

Authored-by: Cheng Su <chengsu@fb.com>
Signed-off-by: Hyukjin Kwon <gurwls223@apache.org>
HyukjinKwon pushed a commit that referenced this pull request Jul 8, 2021
(cherry picked from commit 23943e5)
Signed-off-by: Hyukjin Kwon <gurwls223@apache.org>