Skip to content

[SPARK-16489][SQL] Guard against variable reuse mistakes in expression code generation #14146

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 3 commits into from

Conversation

rxin
Copy link
Contributor

@rxin rxin commented Jul 12, 2016

What changes were proposed in this pull request?

In code generation, it is incorrect for expressions to reuse variable names across different instances of itself. As an example, SPARK-16488 reports a bug in which pmod expression reuses variable name "r".

This patch updates ExpressionEvalHelper test harness to always project two instances of the same expression, which will help us catch variable reuse problems in expression unit tests. This patch also fixes the bug in crc32 expression.

How was this patch tested?

This is a test harness change, but I also created a new test suite for testing the test harness.

@rxin
Copy link
Contributor Author

rxin commented Jul 12, 2016

This pr will fail until #14144 makes it in.

I wonder how many unit tests would fail ...

@rxin rxin changed the title [SPARK-16489][SQL] checkEvaluation should fail if expression reuses variable names [SPARK-16489][SQL] Guard against variable reuse mistakes in expression code generation Jul 12, 2016
@SparkQA
Copy link

SparkQA commented Jul 12, 2016

Test build #3183 has finished for PR 14146 at commit 0df6a99.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • class ExpressionEvalHelperSuite extends SparkFunSuite with ExpressionEvalHelper
    • case class BadCodegenExpression() extends LeafExpression

@SparkQA
Copy link

SparkQA commented Jul 12, 2016

Test build #3184 has finished for PR 14146 at commit 0df6a99.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • class ExpressionEvalHelperSuite extends SparkFunSuite with ExpressionEvalHelper
    • case class BadCodegenExpression() extends LeafExpression

@rxin
Copy link
Contributor Author

rxin commented Jul 12, 2016

cc @sameeragarwal

@SparkQA
Copy link

SparkQA commented Jul 12, 2016

Test build #62139 has finished for PR 14146 at commit 222e868.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@sameeragarwal
Copy link
Member

LGTM

@rxin
Copy link
Contributor Author

rxin commented Jul 12, 2016

Merging in master/2.0.

asfgit pushed a commit that referenced this pull request Jul 12, 2016
…n code generation

In code generation, it is incorrect for expressions to reuse variable names across different instances of itself. As an example, SPARK-16488 reports a bug in which pmod expression reuses variable name "r".

This patch updates ExpressionEvalHelper test harness to always project two instances of the same expression, which will help us catch variable reuse problems in expression unit tests. This patch also fixes the bug in crc32 expression.

This is a test harness change, but I also created a new test suite for testing the test harness.

Author: Reynold Xin <rxin@databricks.com>

Closes #14146 from rxin/SPARK-16489.

(cherry picked from commit c377e49)
Signed-off-by: Reynold Xin <rxin@databricks.com>
@asfgit asfgit closed this in c377e49 Jul 12, 2016
asfgit pushed a commit that referenced this pull request Jul 13, 2016
…n code generation

In code generation, it is incorrect for expressions to reuse variable names across different instances of itself. As an example, SPARK-16488 reports a bug in which pmod expression reuses variable name "r".

This patch updates ExpressionEvalHelper test harness to always project two instances of the same expression, which will help us catch variable reuse problems in expression unit tests. This patch also fixes the bug in crc32 expression.

This is a test harness change, but I also created a new test suite for testing the test harness.

Author: Reynold Xin <rxin@databricks.com>

Closes #14146 from rxin/SPARK-16489.

(cherry picked from commit c377e49)
Signed-off-by: Reynold Xin <rxin@databricks.com>
zzcclp pushed a commit to zzcclp/spark that referenced this pull request Jul 13, 2016
…n code generation

In code generation, it is incorrect for expressions to reuse variable names across different instances of itself. As an example, SPARK-16488 reports a bug in which pmod expression reuses variable name "r".

This patch updates ExpressionEvalHelper test harness to always project two instances of the same expression, which will help us catch variable reuse problems in expression unit tests. This patch also fixes the bug in crc32 expression.

This is a test harness change, but I also created a new test suite for testing the test harness.

Author: Reynold Xin <rxin@databricks.com>

Closes apache#14146 from rxin/SPARK-16489.

(cherry picked from commit c377e49)
Signed-off-by: Reynold Xin <rxin@databricks.com>
(cherry picked from commit 7c8a399)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants