[EPIC] Spark SQL test failures when Comet JVM shuffle is used #1254

@andygrove

What is the problem the feature request solves?

#1209 enabled Comet to accelerate more queries but also exposed some new bugs, so a COMET_SHUFFLE_FALLBACK_TO_COLUMNAR config was added as a temporary workaround to keep the Spark SQL tests from failing.

We should remove this config once the bugs are fixed so that we test more of the Spark SQL queries with Comet shuffle.

Issues:

Failing tests:

core1

core2

core3

  • SPARK-32649: Optimize BHJ/SHJ inner/semi join with empty hashed relation *** FAILED *** (145 milliseconds)
    • assertion fails on number of shuffled hash joins in plan
  • SPARK-38237: require all cluster keys for child required distribution for window query *** FAILED *** (183 milliseconds)
    • shuffleByRequirement was false Can't find desired shuffle node from the query plan [test needs updating]

Describe the potential solution

No response

Additional context

No response
