
Conversation

@parthchandra
Contributor

Changes in CometScanExec to support the full native reader caused regressions in the original reader. This fixes those issues so that all Comet unit tests pass with the original native scan. The only tests that do not pass are the plan stability tests, where the 'expected' plans contain CometNativeScan; when full native is not turned on, the plan produces a CometScan instead (which is expected).

@andygrove andygrove changed the title fix: fix regressions in original Comet native scan implementation fix: [comet-parquet-exec] fix regressions in original Comet native scan implementation Dec 13, 2024
@andygrove andygrove merged commit 8563edf into apache:comet-parquet-exec Dec 13, 2024
16 of 72 checks passed
Preconditions.checkState(
    t.isPrimitive() && !t.isRepetition(Type.Repetition.REPEATED),
    "Complex type is not supported");
// Preconditions.checkState(
Contributor Author


This is for the second implementation. Will remove this later.

true
case t: DataType if t.typeName == "timestamp_ntz" => true
case _: StructType => true
case _: StructType
Contributor Author


With only the original scan enabled, this caused CometScan to be used for Struct types (which it did not support).
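For illustration, the supported-types check being adjusted here could be sketched roughly as below (Scala). The names isTypeSupported and fullNativeScanEnabled are assumptions for the example, not the actual Comet identifiers; the point is only that StructType should not be advertised as supported unless the full native scan is the one that will handle it.

import org.apache.spark.sql.types._

// Hypothetical sketch: decide whether a type can be read, gating struct support
// on whether the full native scan is enabled.
def isTypeSupported(dt: DataType, fullNativeScanEnabled: Boolean): Boolean = dt match {
  case _: BooleanType | _: ByteType | _: ShortType | _: IntegerType | _: LongType |
      _: FloatType | _: DoubleType | _: StringType | _: BinaryType | _: DateType |
      _: TimestampType | _: DecimalType =>
    true
  case t: DataType if t.typeName == "timestamp_ntz" => true
  // Struct types are only handled by the full native scan; advertising support for
  // them unconditionally caused CometScan (which does not support structs) to be chosen.
  case s: StructType if fullNativeScanEnabled =>
    s.fields.forall(f => isTypeSupported(f.dataType, fullNativeScanEnabled))
  case _ => false
}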


override lazy val (outputPartitioning, outputOrdering): (Partitioning, Seq[SortOrder]) =
(wrapped.outputPartitioning, wrapped.outputOrdering)
override lazy val (outputPartitioning, outputOrdering): (Partitioning, Seq[SortOrder]) = {
Contributor Author

@parthchandra Dec 13, 2024


@viirya - The previous fix to address outputPartitioning (using inputRDD) was not correct and caused multiple test failures. This is a different attempt (and at least all the tests pass).
If this is a bucketed scan, we fall back to the wrapped FileSourceScanLike implementation. For the non-bucketed case, since FileSourceScanLike always returned 0 partitions, we override the behaviour and set the number of partitions to the number of files.
I'm not entirely sure this covers all cases, so please advise.
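A rough standalone sketch of that logic is below (Scala). In the actual change this lives in the CometScanExec override shown above; partitioningAndOrdering and numScannedFiles are illustrative names, and the real code derives the file count from the relation's FileIndex.

import org.apache.spark.sql.catalyst.expressions.SortOrder
import org.apache.spark.sql.catalyst.plans.physical.{Partitioning, UnknownPartitioning}
import org.apache.spark.sql.execution.FileSourceScanExec

// Illustrative helper only: `wrapped` is the underlying FileSourceScanExec and
// `numScannedFiles` stands in for the number of files the scan will read.
def partitioningAndOrdering(
    wrapped: FileSourceScanExec,
    bucketedScan: Boolean,
    numScannedFiles: Int): (Partitioning, Seq[SortOrder]) = {
  if (bucketedScan) {
    // Bucketed case: fall back to the wrapped FileSourceScanLike implementation,
    // which derives the partitioning and ordering from the bucket spec.
    (wrapped.outputPartitioning, wrapped.outputOrdering)
  } else {
    // Non-bucketed case: the wrapped scan reports 0 partitions, so override it
    // with the number of files that will be scanned.
    (UnknownPartitioning(numScannedFiles), wrapped.outputOrdering)
  }
}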

