[EPIC] Spark SQL test failures when Comet JVM shuffle is used

### What is the problem the feature request solves?

https://github.com/apache/datafusion-comet/pull/1209 enabled Comet to accelerate more queries, but it also exposed some new bugs. A `COMET_SHUFFLE_FALLBACK_TO_COLUMNAR` config was added as a temporary workaround to keep the Spark SQL tests from failing.

We should remove this config once the bugs are fixed so that we test more of the Spark SQL queries with Comet shuffle.

Issues:

Failing tests:

**core1**

- SPARK-32038: NormalizeFloatingNumbers should work on distinct aggregate *** FAILED *** (551 milliseconds)
  - https://github.com/apache/datafusion-comet/issues/1824
- update nested struct fields *** FAILED *** (98 milliseconds)
  - https://github.com/apache/datafusion-comet/issues/1823
- merge with updates to nested struct fields in NOT MATCHED BY SOURCE clauses *** FAILED *** (94 milliseconds)
  - https://github.com/apache/datafusion-comet/issues/1823
- WholeStageCodeGenSuite
  - https://github.com/apache/datafusion-comet/issues/1852
- https://github.com/apache/datafusion-comet/issues/1252

**core2**

- subquery/exists-subquery/exists-orderby-limit.sql *** FAILED *** (4 seconds, 72 milliseconds)
  - https://github.com/apache/datafusion-comet/issues/1251
- typeCoercion/native/windowFrameCoercion.sql *** FAILED *** (1 second, 540 milliseconds)
  - https://github.com/apache/datafusion-comet/issues/1246

**core3**

- SPARK-32649: Optimize BHJ/SHJ inner/semi join with empty hashed relation *** FAILED *** (145 milliseconds)
  - assertion fails on number of shuffled hash joins in plan
- SPARK-38237: require all cluster keys for child required distribution for window query *** FAILED *** (183 milliseconds)
  - "shuffleByRequirement was false Can't find desired shuffle node from the query plan" (test needs updating)


### Describe the potential solution

_No response_

### Additional context

_No response_

[EPIC] Spark SQL test failures when Comet JVM shuffle is used #1254

What is the problem the feature request solves?

Describe the potential solution

Additional context

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Uh oh!

[EPIC] Spark SQL test failures when Comet JVM shuffle is used #1254

Description

What is the problem the feature request solves?

Describe the potential solution

Additional context

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions