Skip to content

[SPARK-39437][SQL][TEST] Normalize plan id separately in PlanStabilitySuite #36827

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 2 commits into from

Conversation

cloud-fan
Copy link
Contributor

What changes were proposed in this pull request?

In PlanStabilitySuite, we normalize expression IDs by matching #\d+ in the explain string. However, this regex can match plan id in Exchange node as well, which will mess up the normalization if expression IDs and plan IDs overlap.

This PR normalizes plan id separately in PlanStabilitySuite.

Why are the changes needed?

Make the plan golden file more stable.

Does this PR introduce any user-facing change?

no

How was this patch tested?

N/A

@github-actions github-actions bot added the SQL label Jun 10, 2022
@cloud-fan
Copy link
Contributor Author

cc @dongjoon-hyun

@@ -37,7 +37,7 @@ abstract class Exchange extends UnaryExecNode {
override def output: Seq[Attribute] = child.output
final override val nodePatterns: Seq[TreePattern] = Seq(EXCHANGE)

override def stringArgs: Iterator[Any] = super.stringArgs ++ Iterator(s"[id=#$id]")
override def stringArgs: Iterator[Any] = super.stringArgs ++ Iterator(s"[plan_id=$id]")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you so much!

Copy link
Member

@dongjoon-hyun dongjoon-hyun left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1, LGTM. Yes, I expected this kind of massive PR. :)

@dongjoon-hyun
Copy link
Member

Could you fix more test failures?

24881 tests run, 682 skipped, 7 failed.

@cloud-fan
Copy link
Contributor Author

More golden files need to be re-generated. Working on it now.

cloud-fan added a commit that referenced this pull request Jun 10, 2022
…bilitySuite

backport #36827

### What changes were proposed in this pull request?

In `PlanStabilitySuite`, we normalize expression IDs by matching `#\d+` in the explain string. However, this regex can match plan id in `Exchange` node as well, which will mess up the normalization if expression IDs and plan IDs overlap.

This PR normalizes plan id separately in `PlanStabilitySuite`.

### Why are the changes needed?

Make the plan golden file more stable.

### Does this PR introduce _any_ user-facing change?

no

### How was this patch tested?

N/A

Closes #36828 from cloud-fan/test2.

Authored-by: Wenchen Fan <wenchen@databricks.com>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
@dongjoon-hyun
Copy link
Member

Merged to master. Thank you, @cloud-fan and @HyukjinKwon .

@dongjoon-hyun
Copy link
Member

Oh, q83 becomes flaky in GitHub Action environment on master branch.

[info] - check simplified (tpcds-v1.4/q83) *** FAILED *** (182 milliseconds)
...
[info] - check simplified sf100 (tpcds-v1.4/q83) *** FAILED *** (183 milliseconds)
...
[info] *** 2 TESTS FAILED ***
[error] Failed: Total 1721, Failed 2, Errors 0, Passed 1719, Ignored 2
[error] Failed tests:
[error] 	org.apache.spark.sql.TPCDSV1_4_PlanStabilitySuite
[error] 	org.apache.spark.sql.TPCDSV1_4_PlanStabilityWithStatsSuite

I verified that it's stable locally.

$ SPARK_GENERATE_GOLDEN_FILES=1 build/sbt "sql/testOnly *PlanStabilitySuite"
$ git status
On branch master
Your branch is up to date with 'apache/master'.

nothing to commit, working tree clean

$ SPARK_GENERATE_GOLDEN_FILES=1 build/sbt "sql/testOnly *PlanStabilitySuite -- -z (tpcds-v1.4/q83)"
$ git status
On branch master
Your branch is up to date with 'apache/master'.

nothing to commit, working tree clean

@dongjoon-hyun
Copy link
Member

Ah, I found that it is ANSI mode test. Let me make a follow-up.

dongjoon-hyun added a commit that referenced this pull request Jun 10, 2022
### What changes were proposed in this pull request?

This is a follow-up of #36827 to update `q83.ansi` result.

### Why are the changes needed?

ANSI mode unit tests are not covered on GitHub Action on PRs.
- https://github.com/apache/spark/commits/master

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

This should be tested manually because `GitHub Action` doesn't test this.

```
$ SPARK_ANSI_SQL_MODE=true build/sbt "sql/testOnly *PlanStabilitySuite"
```

Closes #36836 from dongjoon-hyun/SPARK-39437.

Authored-by: Dongjoon Hyun <dongjoon@apache.org>
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
dongjoon-hyun added a commit that referenced this pull request Jun 10, 2022
### What changes were proposed in this pull request?

This is a follow-up of #36827 to update `q83.sf100.ansi` result.

### Why are the changes needed?

ANSI mode unit tests are not covered on GitHub Action on PRs.
- https://github.com/apache/spark/commits/master

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

This should be tested manually because `GitHub Action` doesn't test this.

```
$ SPARK_ANSI_SQL_MODE=true build/sbt "sql/testOnly *PlanStabilityWithStatsSuite"
```

Closes #36838 from dongjoon-hyun/SPARK-39437-2.

Authored-by: Dongjoon Hyun <dongjoon@apache.org>
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
cloud-fan added a commit to cloud-fan/spark that referenced this pull request Jun 13, 2022
…ySuite

In `PlanStabilitySuite`, we normalize expression IDs by matching `#\d+` in the explain string. However, this regex can match plan id in `Exchange` node as well, which will mess up the normalization if expression IDs and plan IDs overlap.

This PR normalizes plan id separately in `PlanStabilitySuite`.

Make the plan golden file more stable.

no

N/A

Closes apache#36827 from cloud-fan/test.

Authored-by: Wenchen Fan <wenchen@databricks.com>
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
dongjoon-hyun pushed a commit that referenced this pull request Jun 13, 2022
…bilitySuite

### What changes were proposed in this pull request?

In `PlanStabilitySuite`, we normalize expression IDs by matching `#\d+` in the explain string. However, this regex can match plan id in `Exchange` node as well, which will mess up the normalization if expression IDs and plan IDs overlap.

This PR normalizes plan id separately in `PlanStabilitySuite`.

### Why are the changes needed?

Make the plan golden file more stable.

### Does this PR introduce _any_ user-facing change?

no

### How was this patch tested?

N/A

backport #36827

Closes #36854 from cloud-fan/test2.

Authored-by: Wenchen Fan <wenchen@databricks.com>
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
sunchao pushed a commit to sunchao/spark that referenced this pull request Jun 2, 2023
…bilitySuite

backport apache#36827

### What changes were proposed in this pull request?

In `PlanStabilitySuite`, we normalize expression IDs by matching `#\d+` in the explain string. However, this regex can match plan id in `Exchange` node as well, which will mess up the normalization if expression IDs and plan IDs overlap.

This PR normalizes plan id separately in `PlanStabilitySuite`.

### Why are the changes needed?

Make the plan golden file more stable.

### Does this PR introduce _any_ user-facing change?

no

### How was this patch tested?

N/A

Closes apache#36828 from cloud-fan/test2.

Authored-by: Wenchen Fan <wenchen@databricks.com>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants