[2.4][SPARK-32167][SQL] Fix GetArrayStructFields to respect inner field's nullability together #29019

Closed
wants to merge 1 commit into from

cloud-fan
Contributor

@cloud-fan cloud-fan commented Jul 7, 2020

What changes were proposed in this pull request?

Backport #28992 to 2.4

Fix nullability of GetArrayStructFields. It should consider both the original array's containsNull and the inner field's nullability.
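To make the rule concrete, here is a minimal sketch in plain Python, not Spark code. The classes `FieldModel`, `StructModel`, `ArrayModel` and the function `extract_field_from_array` are hypothetical names invented for illustration. The sketch assumes that the extracted array may contain nulls when either the original array has `containsNull = true` or the selected inner field is nullable, which is how "consider both" reads here.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class FieldModel:
    """Hypothetical stand-in for a struct field: name, type and nullability."""
    name: str
    data_type: object
    nullable: bool


@dataclass(frozen=True)
class StructModel:
    """Hypothetical stand-in for a struct type."""
    fields: Tuple[FieldModel, ...]


@dataclass(frozen=True)
class ArrayModel:
    """Hypothetical stand-in for an array type with a containsNull flag."""
    element_type: object
    contains_null: bool


def extract_field_from_array(array: ArrayModel, field_name: str) -> ArrayModel:
    """Model the result type of selecting one struct field from array<struct>.

    Assumption: an output element is null if the struct element itself is
    null (array.contains_null) or if the selected field's value is null
    (field.nullable), so the two flags are OR-ed together.
    """
    field = next(f for f in array.element_type.fields if f.name == field_name)
    return ArrayModel(field.data_type, array.contains_null or field.nullable)


# The array has no null elements, but the inner field "a" is nullable,
# so the extracted array of "a" values can still contain nulls.
arr = ArrayModel(StructModel((FieldModel("a", "int", True),)), False)
result = extract_field_from_array(arr, "a")
```

In this model, looking only at the array's own flag would report `contains_null = False` for `result`, which is the kind of mismatch the fix is meant to prevent.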

Why are the changes needed?

Fix a correctness issue.

Does this PR introduce any user-facing change?

Yes. See the added test.

How was this patch tested?

A new unit test and an end-to-end test.

@cloud-fan
Contributor Author

cc @dongjoon-hyun @viirya

@SparkQA

SparkQA commented Jul 7, 2020

Test build #125183 has finished for PR 29019 at commit 1ca65b4.

  • This patch fails due to an unknown error code, -9.
  • This patch merges cleanly.
  • This patch adds no public classes.

@cloud-fan
Contributor Author

retest this please

@SparkQA

SparkQA commented Jul 7, 2020

Test build #125191 has finished for PR 29019 at commit 1ca65b4.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@HyukjinKwon
Member

retest this please

@SparkQA

SparkQA commented Jul 7, 2020

Test build #125207 has finished for PR 29019 at commit 1ca65b4.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@cloud-fan
Contributor Author

retest this please

Member

@dongjoon-hyun dongjoon-hyun left a comment

Thank you, @cloud-fan.
+1, LGTM (Pending Jenkins).

@SparkQA

SparkQA commented Jul 7, 2020

Test build #125223 has finished for PR 29019 at commit 1ca65b4.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@dongjoon-hyun
Member

Hmm, @cloud-fan. It seems that something is missing in SelectedFieldSuite.scala:

[info] - SELECT "col3.field1.subfield1" should select the schema
[info]    root
[info]     |-- col3: array (nullable = false)
[info]     |    |-- element: struct (containsNull = true)
[info]     |    |    |-- field1: struct (nullable = true)
[info]     |    |    |    |-- subfield1: integer (nullable = false)
[info]  *** FAILED *** (32 milliseconds)
[info]   Expected StructField(col3,ArrayType(StructType(StructField(field1,StructType(StructField(subfield1,IntegerType,false)),true)),true),false), but got StructField(col3,ArrayType(StructType(StructField(field1,StructType(StructField(subfield1,IntegerType,false)),true)),false),false) (SelectedFieldSuite.scala:400)

@dongjoon-hyun
Member

Gentle ping, @cloud-fan ~

…nullability together

Fix nullability of `GetArrayStructFields`. It should consider both the original array's `containsNull` and the inner field's nullability.

Fix a correctness issue.

Yes. See the added test.

a new UT and end-to-end test

Closes apache#28992 from cloud-fan/bug.

Authored-by: Wenchen Fan <wenchen@databricks.com>
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
(cherry picked from commit 5d296ed)
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
@SparkQA

SparkQA commented Jul 8, 2020

Test build #125306 has finished for PR 29019 at commit d22707a.

  • This patch fails to generate documentation.
  • This patch merges cleanly.
  • This patch adds no public classes.

@cloud-fan
Contributor Author

retest this please

@SparkQA

SparkQA commented Jul 8, 2020

Test build #125321 has finished for PR 29019 at commit d22707a.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@cloud-fan
Contributor Author

retest this please

@SparkQA

SparkQA commented Jul 8, 2020

Test build #125354 has finished for PR 29019 at commit d22707a.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@HyukjinKwon
Member

retest this please

@SparkQA

SparkQA commented Jul 8, 2020

Test build #125366 has finished for PR 29019 at commit d22707a.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@HyukjinKwon
Member

retest this please

@dongjoon-hyun
Member

Merged to branch-2.4. Thank you, @cloud-fan and all.
All unit tests (including R) passed. The R installation test, which is unrelated to this PR, is still running.

dongjoon-hyun pushed a commit that referenced this pull request Jul 8, 2020
…ld's nullability together

### What changes were proposed in this pull request?

Backport #28992 to 2.4

Fix nullability of `GetArrayStructFields`. It should consider both the original array's `containsNull` and the inner field's nullability.

### Why are the changes needed?

Fix a correctness issue.

### Does this PR introduce _any_ user-facing change?

Yes. See the added test.

### How was this patch tested?

a new UT and end-to-end test

Closes #29019 from cloud-fan/port.

Authored-by: Wenchen Fan <wenchen@databricks.com>
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
@SparkQA

SparkQA commented Jul 8, 2020

Test build #125376 has finished for PR 29019 at commit d22707a.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

otterc pushed a commit to linkedin/spark that referenced this pull request Mar 22, 2023
…nullability together

Ref: LIHADOOP-56842
(cherry picked from commit 146062d)

Backport apache#28992 to 2.4

Fix nullability of `GetArrayStructFields`. It should consider both the original array's `containsNull` and the inner field's nullability.

Fix a correctness issue.

Yes. See the added test.

a new UT and end-to-end test

Closes apache#29019 from cloud-fan/port.

Authored-by: Wenchen Fan <wenchen@databricks.com>
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>

RB=2459030
BUG=LIHADOOP-56842
G=spark-reviewers
R=zolin,ekrogen
A=ekrogen