[SPARK-16409] [SQL] regexp_extract with optional groups causes NPE #14504

Closed
wants to merge 4 commits into from

Conversation

srowen
Member

@srowen srowen commented Aug 5, 2016

What changes were proposed in this pull request?

regexp_extract incorrectly returns null when the regex matches but the requested optional group does not participate in the match. This change makes it return an empty string in that case, as apparently designed.
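
To illustrate the behavior described above, here is a minimal sketch using plain `java.util.regex` rather than Spark itself. The object name `OptionalGroupSketch`, the helper `extractGroup`, and the pattern `(a+)(b)?(c)` are assumptions made for illustration and are not taken from the patch; only the input string `aaaac` appears elsewhere in this conversation. The sketch shows that `Matcher.group` yields null for an optional group that did not participate in an otherwise successful match, and how that null can be mapped to an empty string. Returning an empty string when the regex does not match at all is likewise an assumption here.

```scala
import java.util.regex.Pattern

object OptionalGroupSketch {
  // Hypothetical helper, not Spark's implementation: returns the requested
  // group and maps an unmatched optional group to "" instead of null.
  def extractGroup(input: String, regex: String, idx: Int): String = {
    val m = Pattern.compile(regex).matcher(input)
    if (m.find()) {
      // group(idx) is null when the group did not participate in the match
      val g = m.group(idx)
      if (g == null) "" else g
    } else {
      // Assumption: no match at all also yields an empty string
      ""
    }
  }

  def main(args: Array[String]): Unit = {
    val m = Pattern.compile("(a+)(b)?(c)").matcher("aaaac")
    m.find()
    // The raw matcher reports null for the optional group (b)?
    println(m.group(2) == null)
    // The helper turns that null into an empty string
    println(extractGroup("aaaac", "(a+)(b)?(c)", 2).isEmpty)
    // Groups that did match are returned unchanged
    println(extractGroup("aaaac", "(a+)(b)?(c)", 1))
  }
}
```

The null check sits directly at the point where the group is read, so callers of the helper never have to handle a null result themselves.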

How was this patch tested?

Additional unit test

@SparkQA

SparkQA commented Aug 5, 2016

Test build #63258 has finished for PR 14504 at commit 75d62ae.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Aug 6, 2016

Test build #63310 has finished for PR 14504 at commit 545c8de.

  • This patch fails PySpark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Aug 6, 2016

Test build #63313 has finished for PR 14504 at commit b835bd3.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

asfgit pushed a commit that referenced this pull request Aug 7, 2016
## What changes were proposed in this pull request?

regexp_extract actually returns null when it shouldn't when a regex matches but the requested optional group did not. This makes it return an empty string, as apparently designed.

## How was this patch tested?

Additional unit test

Author: Sean Owen <sowen@cloudera.com>

Closes #14504 from srowen/SPARK-16409.

(cherry picked from commit 8d87252)
Signed-off-by: Sean Owen <sowen@cloudera.com>
@srowen
Member Author

srowen commented Aug 7, 2016

Merged to master/2.0/1.6

srowen added a commit to srowen/spark that referenced this pull request Aug 7, 2016
## What changes were proposed in this pull request?

regexp_extract actually returns null when it shouldn't when a regex matches but the requested optional group did not. This makes it return an empty string, as apparently designed.

## How was this patch tested?

Additional unit test

Author: Sean Owen <sowen@cloudera.com>

Closes apache#14504 from srowen/SPARK-16409.
@srowen srowen closed this Aug 7, 2016
@srowen srowen deleted the SPARK-16409 branch August 7, 2016 11:21
@ericl
Contributor

ericl commented Aug 7, 2016

I think this broke the build:

[error] /home/jenkins/workspace/spark-branch-1.6-compile-maven-with-yarn-2.4/sql/core/src/test/scala/org/apache/spark/sql/StringFunctionsSuite.scala:82: value toDF is not a member of Seq[String]
[error] val df = Seq("aaaac").toDF("s")
[error] ^
[warn] two warnings found
[error] one error found

@srowen
Member Author

srowen commented Aug 7, 2016

Yes, for 1.6: #14526
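
For context on the compile error quoted above: "value toDF is not a member of Seq[String]" is the message Scala typically produces when the implicit conversion that adds `toDF` is not in scope at the call site. The sketch below reproduces that mechanism in plain Scala without Spark. `FakeFrame`, `FakeImplicits`, and `SeqToFrame` are hypothetical stand-ins invented for illustration, and this conversation does not show what #14526 actually changed.

```scala
object ToDfScopeSketch {
  // Hypothetical stand-in for a DataFrame; it only records a column name and rows.
  final case class FakeFrame(column: String, rows: Seq[String])

  // Hypothetical stand-in for an implicits holder; the name and shape are
  // assumptions for illustration, not Spark's API.
  object FakeImplicits {
    implicit class SeqToFrame(val rows: Seq[String]) {
      def toDF(column: String): FakeFrame = FakeFrame(column, rows)
    }
  }

  def build(): FakeFrame = {
    // Without this import, the next line fails to compile with
    // "value toDF is not a member of Seq[String]".
    import FakeImplicits._
    Seq("aaaac").toDF("s")
  }

  def main(args: Array[String]): Unit = {
    println(build())
  }
}
```

Removing the `import FakeImplicits._` line is enough to make the sketch fail to compile in the same way as the reported error.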

zzcclp pushed a commit to zzcclp/spark that referenced this pull request Aug 8, 2016
## What changes were proposed in this pull request?

regexp_extract actually returns null when it shouldn't when a regex matches but the requested optional group did not. This makes it return an empty string, as apparently designed.

## How was this patch tested?

Additional unit test

Author: Sean Owen <sowen@cloudera.com>

Closes apache#14504 from srowen/SPARK-16409.

(cherry picked from commit 8d87252)
Signed-off-by: Sean Owen <sowen@cloudera.com>
(cherry picked from commit 1a5e762)
@rxin
Contributor

rxin commented Aug 8, 2016

@srowen did anybody review this?

@srowen
Member Author

srowen commented Aug 8, 2016

I proceeded for lack of comments, and because this reproduces Max's test case from the JIRA.
