[SPARK-4502][SQL] Support parquet nested struct pruning and add relevant test #14957
@@ -259,8 +259,23 @@ case class StructType(fields: Array[StructField]) extends DataType with Seq[StructField] {
   * @throws IllegalArgumentException if a field with the given name does not exist
   */
  def apply(name: String): StructField = {
    nameToField.getOrElse(name,
      throw new IllegalArgumentException(s"""Field "$name" does not exist."""))
    if (name.contains('.')) {
IIUC, this will drop the support for accessing field names that contain "." (e.g. "a.b"), which can currently be accessed via backticks (`a.b`). Could you confirm this please?
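For reference, a minimal sketch of the behavior being discussed (hypothetical data; assumes a SparkSession named spark and import spark.implicits._):

// A flat column literally named "a.b" (not a nested field) is reachable
// today via backtick quoting, which this change could break.
val df = Seq((1, 2)).toDF("a.b", "c")
df.select("`a.b`").show()               // resolves the single column named "a.b"
df.createOrReplaceTempView("t")
spark.sql("SELECT `a.b` FROM t").show() // same lookup via SQL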
@HyukjinKwon Thanks for your review. Mixing the recursive lookup with the default apply has this problem; I fixed it in the next patch by using ',' as the separator, which is an invalid character in a Parquet schema.
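A hedged sketch of the approach described above (illustrative helper, not the exact patch): nested lookups are keyed on ',', which cannot appear in a Parquet schema, so field names containing '.' still take the default path:

import org.apache.spark.sql.types.{StructField, StructType}

def applyNested(schema: StructType, name: String): StructField = {
  if (name.contains(',')) {
    // Split off the first path segment and recurse into its struct type.
    val head = name.takeWhile(_ != ',')
    val rest = name.drop(head.length + 1)
    schema(head).dataType match {
      case t: StructType => applyNested(t, rest)
      case _ => throw new IllegalArgumentException(s"""Field "$head" is not a struct.""")
    }
  } else {
    schema(name) // the default apply; names containing '.' land here unchanged
  }
}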
@@ -97,7 +98,16 @@ object FileSourceStrategy extends Strategy with Logging {
      dataColumns
        .filter(requiredAttributes.contains)
        .filterNot(partitionColumns.contains)
      val outputSchema = readDataColumns.toStructType
      val outputSchema = if (fsRelation.sqlContext.conf.isParquetNestColumnPruning) {
It will affect all other data sources. I am pretty sure any tests related to this will not pass.
I ran all the data source tests with
./build/sbt "test-only org.apache.spark.sql.execution.datasources.*"
Three tests failed, but when I ran the failed suites separately, all tests passed.
All three tests failed because of a datetime check error: the correct answer is like '2015-05-23' but the Spark answer is '2015-05-22'. I don't think this error is caused by my patch.
Has anybody seen the same problem before? It really confuses me that each suite passes when run separately!
Oh, I meant enabling/disabling affects the other data sources. I see it's disabled by default. I ran, for example, JsonSuite after manually enabling this option (after leaving the comment above) and saw some failures related to nested structures.
Also, do you mind if I ask which tests failed? I will try to reproduce it myself.
It's my mistake here. I can only make sure this patch works for Parquet, so I should check the fileFormat here; as the config namespace (spark.sql.parquet.nestColumnPruning) suggests, it should only apply to Parquet. I added a patch to fix this.
I ran the command
./build/sbt "test-only org.apache.spark.sql.execution.datasources.*"
locally and three test suites failed:
[error] Failed tests:
[error] org.apache.spark.sql.execution.datasources.csv.CSVSuite
[error] org.apache.spark.sql.execution.datasources.json.JsonSuite
[error] org.apache.spark.sql.execution.datasources.parquet.ParquetPartitionDiscoverySuite
But when I ran them separately, all three passed. Also, I ran
./build/sbt "test-only org.apache.spark.sql.execution.csv.*"
./build/sbt "test-only org.apache.spark.sql.execution.json.*"
and all tests passed.
Could you please check whether the related tests pass locally? It seems this affects all other data sources.
Also, it seems you might need to update your PR description; the last commit you just pushed acts differently from your PR description. In addition, maybe you would need to fix the title of this PR to be complete (without …).
(BTW, I just left some comments because I am interested in some code in this path. You can wait for a committer's review.)
(Thanks for your comments :) )
This would be good to include in Spark 2.1... cc @davies and @liancheng
test this please
add to whitelist
Test build #67316 has finished for PR 14957 at commit
Sorry for the late reply and thanks for the contribution!
I've been expecting this feature for a long time and had once come up with basically the same idea but didn't get time to implement it. Thanks!
I haven't finished reviewing everything but would like to post my comments for the first round of review so that we can iterate.
// Merge schema in same StructType and merge with filterAttributes
prunedSchema.fields.map(f => StructType(Array(f))).reduceLeft(_ merge _)
  .merge(filterAttributes.toSeq.toStructType)
} else readDataColumns.toStructType
Please re-format the above change to the following format:
if (
... &&
...
) {
...
} else {
...
}
fix done
@@ -126,4 +136,52 @@ object FileSourceStrategy extends Strategy with Logging {

    case _ => Nil
  }

  private def generateStructFieldsContainsNesting(projects: Seq[Expression],
      totalSchema: StructType) : Seq[StructField] = {
Please check the Spark code style guide and re-format this one.
Would you please add comments and test cases for testing this method, which is basically the essential part of this PR?
fix code style done.
No problem, I'll add tests for the private func generateStructFieldsContainsNesting in the next patch; this patch fixes all the code style and naming problems.
private def generateStructFieldsContainsNesting(projects: Seq[Expression],
    totalSchema: StructType) : Seq[StructField] = {
  def generateStructField(curField: List[String],
      node: Expression) : Seq[StructField] = {
And this one.
fix done
  }

  def getFieldRecursively(totalSchema: StructType,
      name: List[String]): StructField = {
And this.
fix done
@@ -212,6 +212,11 @@ object SQLConf {
    .booleanConf
    .createWithDefault(true)

  val PARQUET_NEST_COLUMN_PRUNING = SQLConfigBuilder("spark.sql.parquet.nestColumnPruning")
    .doc("When set this to true, we will tell parquet only read the nest column`s leaf fields ")
Please reword this doc string to:
When true, Parquet column pruning also works for nested fields.
reword done
@@ -212,6 +212,11 @@ object SQLConf {
    .booleanConf
    .createWithDefault(true)

  val PARQUET_NEST_COLUMN_PRUNING = SQLConfigBuilder("spark.sql.parquet.nestColumnPruning")
Please rename to PARQUET_NESTED_COLUMN_PRUNING and spark.sql.parquet.nestedColumnPruning respectively.
rename done
@@ -661,6 +666,8 @@ private[sql] class SQLConf extends Serializable with CatalystConf with Logging {

  def isParquetINT96AsTimestamp: Boolean = getConf(PARQUET_INT96_AS_TIMESTAMP)

  def isParquetNestColumnPruning: Boolean = getConf(PARQUET_NEST_COLUMN_PRUNING)
parquetNestedColumnPruningEnabled
rename done
// |-- num: long (nullable = true)
// |-- str: string (nullable = true)
val df = readResourceParquetFile("test-data/nested-struct.snappy.parquet")
df.createOrReplaceTempView("tmp_table")
You may use SQLTestUtils.withTempView to wrap this test so that you don't need to drop the temporary view manually.
fix done
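For reference, a rough sketch of the suggested pattern (assumes the suite mixes in SQLTestUtils; assertions abbreviated):

// The temporary view is dropped automatically when the block exits,
// even if an assertion inside fails.
withTempView("tmp_table") {
  val df = readResourceParquetFile("test-data/nested-struct.snappy.parquet")
  df.createOrReplaceTempView("tmp_table")
  // ... run the nested-pruning queries and checkAnswer assertions here ...
}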
val outputSchema = readDataColumns.toStructType
val outputSchema = if (fsRelation.sqlContext.conf.isParquetNestColumnPruning
    && fsRelation.fileFormat.isInstanceOf[ParquetFileFormat]) {
  val totalSchema = readDataColumns.toStructType
Maybe fullSchema?
fix done
    } else {
      totalSchema(name.head)
    }
  }
Actually this function can be simplified to:
def getNestedField(schema: StructType, path: Seq[String]): StructField = {
  require(path.nonEmpty, "<error message>")
  path.tail.foldLeft(schema(path.head)) { (field, name) =>
    field.dataType match {
      case t: StructType => t(name)
      case _ => ??? // Throw exception here
    }
  }
}
The func getFieldRecursively here needs to return a StructField that contains the whole nested relation of the path. For example:
The fullSchema is:
root
|-- col: struct (nullable = true)
| |-- s1: struct (nullable = true)
| | |-- s1_1: long (nullable = true)
| | |-- s1_2: long (nullable = true)
| |-- str: string (nullable = true)
|-- num: long (nullable = true)
|-- str: string (nullable = true)
and when we want to get col.s1.s1_1, the func should return:
StructField(col,StructType(StructField(s1,StructType(StructField(s1_1,LongType,true)),true)),true)
So maybe I can't use the simplified func getNestedField because it returns only the last StructField:
StructField(s1_1,LongType,true)
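For comparison, a hedged sketch of a recursive variant that does keep the nesting (hypothetical helper name; error message mirrors the one in the patch):

import org.apache.spark.sql.types.{StructField, StructType}

// Re-wraps each level on the way out, so the result for List("col", "s1", "s1_1")
// is StructField(col, StructType(StructField(s1, StructType(StructField(s1_1, ...)))))
// rather than just the leaf field.
def getFieldKeepingNesting(schema: StructType, path: List[String]): StructField = {
  val field = schema(path.head)
  path.tail match {
    case Nil => field
    case tail => field.dataType match {
      case t: StructType =>
        field.copy(dataType = StructType(Array(getFieldKeepingNesting(t, tail))))
      case _ =>
        throw new IllegalArgumentException(s"""Field "${path.head}" is not struct field.""")
    }
  }
}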
Together with this one, we should have an optimizer rule that could 1) extract GetStructField (and others) and push it down closer to the data source, or 2) flatten all the nested fields in the data source, then replace the GetStructFields with the flattened ones and prune the unused ones.
Test build #67410 has finished for PR 14957 at commit
Test build #67515 has finished for PR 14957 at commit
Test build #67550 has finished for PR 14957 at commit
Test build #67564 has finished for PR 14957 at commit
case attr: AttributeReference =>
  Seq(getFieldRecursively(totalSchema, attr.name :: curField))
case sf: GetStructField =>
  generateStructField(sf.name.get :: curField, sf.child)
This name is optional and might not be set. We should retrieve the actual field name using the ordinal of sf.
fix done.
A little question here: all projects parsed from SQL must have the name, while projects from the DataFrame API may not, right?
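A small sketch of the ordinal fallback being suggested (assumes sf.child's dataType is known to be a StructType at this point):

// Prefer the optional name; otherwise look the field up by ordinal in the
// child's schema, which is always available for a resolved GetStructField.
val fieldName = sf.name.getOrElse(
  sf.child.dataType.asInstanceOf[StructType](sf.ordinal).name)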
val outputSchema = readDataColumns.toStructType
val outputSchema = if (
  fsRelation.sqlContext.conf.parquetNestedColumnPruningEnabled &&
  fsRelation.fileFormat.isInstanceOf[ParquetFileFormat]
Use two space indentation here.
fix done
val result = FileSourceStrategy.invokePrivate[Seq[StructField]](testFunc(projects,
  fullSchema))
assert(result == expextResult)
}
It would be nice to split this method into several test cases that test some typical but minimal cases.
BTW, I tried the following test code:
test("foo") {
val schema = new StructType()
.add("f0", IntegerType)
.add("f1", new StructType()
.add("f10", IntegerType))
val expr = GetStructField(
CreateNamedStruct(Seq(
Literal("f10"),
AttributeReference("f0", IntegerType)()
)),
0,
Some("f10")
)
StructType(
FileSourceStrategy.generateStructFieldsContainsNesting(expr :: Nil, schema)
).printTreeString()
}
and it fails with the following exception:
[info] - foo *** FAILED *** (37 milliseconds)
[info] java.lang.IllegalArgumentException: Field "f0" is not struct field.
[info] at org.apache.spark.sql.execution.datasources.FileSourceStrategy$.getFieldRecursively$1(FileSourceStrategy.scala:188)
[info] at org.apache.spark.sql.execution.datasources.FileSourceStrategy$.org$apache$spark$sql$execution$datasources$FileSourceStrategy$$generateStructField$1(FileSourceStrategy.scala:166)
[info] at org.apache.spark.sql.execution.datasources.FileSourceStrategy$$anonfun$org$apache$spark$sql$execution$datasources$FileSourceStrategy$$generateStructField$1$1.apply(FileSourceStrategy.scala:171)
[info] at org.apache.spark.sql.execution.datasources.FileSourceStrategy$$anonfun$org$apache$spark$sql$execution$datasources$FileSourceStrategy$$generateStructField$1$1.apply(FileSourceStrategy.scala:171)
[info] at scala.collection.TraversableLike$$anonfun$flatMap$1.apply(TraversableLike.scala:241)
[info] at scala.collection.TraversableLike$$anonfun$flatMap$1.apply(TraversableLike.scala:241)
[info] at scala.collection.immutable.List.foreach(List.scala:381)
[info] at scala.collection.TraversableLike$class.flatMap(TraversableLike.scala:241)
[info] at scala.collection.immutable.List.flatMap(List.scala:344)
[info] at org.apache.spark.sql.execution.datasources.FileSourceStrategy$.org$apache$spark$sql$execution$datasources$FileSourceStrategy$$generateStructField$1(FileSourceStrategy.scala:171)
[info] at org.apache.spark.sql.execution.datasources.FileSourceStrategy$$anonfun$generateStructFieldsContainsNesting$1.apply(FileSourceStrategy.scala:195)
[info] at org.apache.spark.sql.execution.datasources.FileSourceStrategy$$anonfun$generateStructFieldsContainsNesting$1.apply(FileSourceStrategy.scala:195)
[info] at scala.collection.TraversableLike$$anonfun$flatMap$1.apply(TraversableLike.scala:241)
[info] at scala.collection.TraversableLike$$anonfun$flatMap$1.apply(TraversableLike.scala:241)
[info] at scala.collection.immutable.List.foreach(List.scala:381)
[info] at scala.collection.TraversableLike$class.flatMap(TraversableLike.scala:241)
[info] at scala.collection.immutable.List.flatMap(List.scala:344)
[info] at org.apache.spark.sql.execution.datasources.FileSourceStrategy$.generateStructFieldsContainsNesting(FileSourceStrategy.scala:195)
[info] at org.apache.spark.sql.execution.datasources.FileSourceStrategySuite$$anonfun$16.apply$mcV$sp(FileSourceStrategySuite.scala:462)
[info] at org.apache.spark.sql.execution.datasources.FileSourceStrategySuite$$anonfun$16.apply(FileSourceStrategySuite.scala:446)
[info] at org.apache.spark.sql.execution.datasources.FileSourceStrategySuite$$anonfun$16.apply(FileSourceStrategySuite.scala:446)
[info] at org.scalatest.Transformer$$anonfun$apply$1.apply$mcV$sp(Transformer.scala:22)
[info] at org.scalatest.OutcomeOf$class.outcomeOf(OutcomeOf.scala:85)
[info] at org.scalatest.OutcomeOf$.outcomeOf(OutcomeOf.scala:104)
[info] at org.scalatest.Transformer.apply(Transformer.scala:22)
[info] at org.scalatest.Transformer.apply(Transformer.scala:20)
[info] at org.scalatest.FunSuiteLike$$anon$1.apply(FunSuiteLike.scala:166)
[info] at org.apache.spark.SparkFunSuite.withFixture(SparkFunSuite.scala:68)
[info] at org.scalatest.FunSuiteLike$class.invokeWithFixture$1(FunSuiteLike.scala:163)
[info] at org.scalatest.FunSuiteLike$$anonfun$runTest$1.apply(FunSuiteLike.scala:175)
[info] at org.scalatest.FunSuiteLike$$anonfun$runTest$1.apply(FunSuiteLike.scala:175)
[info] at org.scalatest.SuperEngine.runTestImpl(Engine.scala:306)
[info] at org.scalatest.FunSuiteLike$class.runTest(FunSuiteLike.scala:175)
[info] at org.apache.spark.sql.execution.datasources.FileSourceStrategySuite.org$scalatest$BeforeAndAfterEach$$super$runTest(FileSourceStrategySuite.scala:42)
[info] at org.scalatest.BeforeAndAfterEach$class.runTest(BeforeAndAfterEach.scala:255)
[info] at org.apache.spark.sql.execution.datasources.FileSourceStrategySuite.runTest(FileSourceStrategySuite.scala:42)
[info] at org.scalatest.FunSuiteLike$$anonfun$runTests$1.apply(FunSuiteLike.scala:208)
[info] at org.scalatest.FunSuiteLike$$anonfun$runTests$1.apply(FunSuiteLike.scala:208)
[info] at org.scalatest.SuperEngine$$anonfun$traverseSubNodes$1$1.apply(Engine.scala:413)
[info] at org.scalatest.SuperEngine$$anonfun$traverseSubNodes$1$1.apply(Engine.scala:401)
[info] at scala.collection.immutable.List.foreach(List.scala:381)
[info] at org.scalatest.SuperEngine.traverseSubNodes$1(Engine.scala:401)
[info] at org.scalatest.SuperEngine.org$scalatest$SuperEngine$$runTestsInBranch(Engine.scala:396)
[info] at org.scalatest.SuperEngine.runTestsImpl(Engine.scala:483)
[info] at org.scalatest.FunSuiteLike$class.runTests(FunSuiteLike.scala:208)
[info] at org.scalatest.FunSuite.runTests(FunSuite.scala:1555)
[info] at org.scalatest.Suite$class.run(Suite.scala:1424)
[info] at org.scalatest.FunSuite.org$scalatest$FunSuiteLike$$super$run(FunSuite.scala:1555)
[info] at org.scalatest.FunSuiteLike$$anonfun$run$1.apply(FunSuiteLike.scala:212)
[info] at org.scalatest.FunSuiteLike$$anonfun$run$1.apply(FunSuiteLike.scala:212)
[info] at org.scalatest.SuperEngine.runImpl(Engine.scala:545)
[info] at org.scalatest.FunSuiteLike$class.run(FunSuiteLike.scala:212)
[info] at org.apache.spark.SparkFunSuite.org$scalatest$BeforeAndAfterAll$$super$run(SparkFunSuite.scala:31)
[info] at org.scalatest.BeforeAndAfterAll$class.liftedTree1$1(BeforeAndAfterAll.scala:257)
[info] at org.scalatest.BeforeAndAfterAll$class.run(BeforeAndAfterAll.scala:256)
[info] at org.apache.spark.SparkFunSuite.run(SparkFunSuite.scala:31)
[info] at org.scalatest.tools.Framework.org$scalatest$tools$Framework$$runSuite(Framework.scala:357)
[info] at org.scalatest.tools.Framework$ScalaTestTask.execute(Framework.scala:502)
[info] at sbt.ForkMain$Run$2.call(ForkMain.java:296)
[info] at sbt.ForkMain$Run$2.call(ForkMain.java:286)
[info] at java.util.concurrent.FutureTask.run(FutureTask.java:266)
[info] at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
[info] at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
[info] at java.lang.Thread.run(Thread.java:745)
Basically, we also need to consider named_struct and struct expressions to get corner cases correct.
fix done. Thanks for liancheng's reminder.
Here I considered CreateStruct(Unsafe) and CreateNamedStruct(Unsafe); the other expressions in complexTypeCreator (CreateArray, CreateMap) are just ignored.
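A hedged sketch of the kind of case this adds (the exact pattern in the patch may differ): when the struct is built by the query itself, the traversal recurses into the accessed value expression instead of the file schema:

// CreateNamedStruct children alternate (name literal, value expression),
// so valExprs(ordinal) is the expression this GetStructField actually reads.
case GetStructField(cns: CreateNamedStruct, ordinal, _) =>
  generateStructField(curField, cns.valExprs(ordinal))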
@@ -126,4 +140,59 @@ object FileSourceStrategy extends Strategy with Logging {

    case _ => Nil
  }

  private def generateStructFieldsContainsNesting(
You may make this method private[sql] so that you don't need to rely on the PrivateMethod ScalaTest trick to test it.
fix done
Test build #67781 has finished for PR 14957 at commit
- Add support for CreateStruct(Unsafe) and CreateNamedStruct(Unsafe)
- Split origin test case to minimal test cases
- Add test case for named_struct
@xuanyuanking Have you determined if the functionality provided here is superseded by #16578? I am trying to figure out which PR to help out on since I need this feature as well.
@saulshanabrook looks like #16578 is a superset, trying to invest in that pull request.
@xuanyuanking, let's close this and help review #16578 if you agree on the comments above.
OK, I'll close this and just use it in our internal env. Thanks for everyone's suggestions and review work. Next we may try more complex scenarios of this.
What changes were proposed in this pull request?
Like the description in SPARK-4502, we have the same problem in Baidu: our users' Parquet files have complex nested structs (400+ fields, 4 layers of nesting), so this problem brings unnecessary data reads and time spent. This PR fixes the problem; the main ideas are as follows:
1. Add a config spark.sql.parquet.nestColumnPruning; when it is off, the logic is the same as before.
2. In FileSourceStrategy, traverse projects[NamedExpression] and generate the access path of each nested struct field. For example, in the query select people.addr.city from table, the project getStructField('city', getStructField('addr', AttributeReference('people'))) will have the access path ['people', 'addr', 'city'].
3. Generate a pruned schema for each access path and merge them. For example, the JSON format of struct type … and … will merge to …, as illustrated below.
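As an illustration (using the nested schema from the review discussion above, shown as tree strings rather than the original JSON):

Pruned schema for col.s1.s1_1:
root
 |-- col: struct
 |    |-- s1: struct
 |    |    |-- s1_1: long

Pruned schema for col.str:
root
 |-- col: struct
 |    |-- str: string

Merged:
root
 |-- col: struct
 |    |-- s1: struct
 |    |    |-- s1_1: long
 |    |-- str: string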
How was this patch tested?
Added a new test in ParquetQuerySuite.