Comet Internal Error: Output column count mismatch: expected 0, got 1 #1251

@andygrove

Describe the bug

Fixing #1248 exposes a new bug that causes the following Spark SQL test to fail:

2025-06-02T17:20:49.4109714Z [info] - subquery/exists-subquery/exists-orderby-limit.sql *** FAILED *** (4 seconds, 39 milliseconds)
2025-06-02T17:20:49.4117435Z [info]   subquery/exists-subquery/exists-orderby-limit.sql
2025-06-02T17:20:49.4118504Z [info]   Expected Some("struct<id:int,emp_name:string,hiredate:date,salary:double,dept_id:int>"), but got Some("struct<>") Schema did not match for query #19
2025-06-02T17:20:49.4119489Z [info]   SELECT *
2025-06-02T17:20:49.4119782Z [info]   FROM   emp
2025-06-02T17:20:49.4120140Z [info]   WHERE  EXISTS (SELECT max(dept.dept_id)
2025-06-02T17:20:49.4120775Z [info]                  FROM   dept
2025-06-02T17:20:49.4121180Z [info]                  GROUP  BY state
2025-06-02T17:20:49.4121871Z [info]                  LIMIT  1
2025-06-02T17:20:49.4123859Z [info]                  OFFSET 2): -- !query
2025-06-02T17:20:49.4124345Z [info]   SELECT *
2025-06-02T17:20:49.4125150Z [info]   FROM   emp
2025-06-02T17:20:49.4125753Z [info]   WHERE  EXISTS (SELECT max(dept.dept_id)
2025-06-02T17:20:49.4126359Z [info]                  FROM   dept
2025-06-02T17:20:49.4126942Z [info]                  GROUP  BY state
2025-06-02T17:20:49.4137110Z [info]                  LIMIT  1
2025-06-02T17:20:49.4141050Z [info]                  OFFSET 2)
2025-06-02T17:20:49.4142949Z [info]   -- !query schema
2025-06-02T17:20:49.4143313Z [info]   struct<>
2025-06-02T17:20:49.4143655Z [info]   -- !query output
2025-06-02T17:20:49.4144088Z [info]   java.util.concurrent.ExecutionException
2025-06-02T17:20:49.4146087Z [info]   org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 23265.0 failed 1 times, most recent failure: Lost task 0.0 in stage 23265.0 (TID 22042) (a982f938b356 executor driver): org.apache.comet.CometNativeException: Comet Internal Error: Output column count mismatch: expected 0, got 1

Steps to reproduce

Add this test to CometExecSuite:

  // repro for https://github.com/apache/datafusion-comet/issues/1251
  test("subquery/exists-subquery/exists-orderby-limit.sql") {
    withSQLConf(CometConf.COMET_SHUFFLE_MODE.key -> "jvm",
      CometConf.COMET_EXPLAIN_NATIVE_ENABLED.key -> "true") {
      val table = "src"
      withTable(table) {
        sql(s"CREATE TABLE $table (key INT, value STRING) USING PARQUET")
        sql(s"INSERT INTO $table VALUES(238, 'val_238')")

        // this query works correctly if the GROUP BY is removed
        checkSparkAnswerAndOperator(
          s"""SELECT * FROM $table
             |WHERE EXISTS (SELECT MAX(key)
             |FROM $table
             |GROUP BY value
             |LIMIT 1
             |OFFSET 2)""".stripMargin)
      }
    }
  }
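
For reference, the result this repro should produce can be worked out by modeling the query with plain Scala collections. This is only a sketch of the SQL semantics, assuming `OFFSET` is applied before `LIMIT`; it does not use any Spark or Comet API, and the names `ExistsOffsetSketch`, `subqueryRows`, and `outerQuery` are hypothetical.

```scala
// Plain-Scala model of the repro query. No Spark or Comet code is involved;
// all names here are hypothetical and exist only for illustration.
object ExistsOffsetSketch {
  type Row = (Int, String) // (key, value), mirroring the `src` table

  // SELECT MAX(key) FROM src GROUP BY value LIMIT 1 OFFSET 2
  def subqueryRows(src: Seq[Row]): Seq[Int] =
    src
      .groupBy { case (_, value) => value }                 // GROUP BY value
      .values
      .map(rows => rows.map { case (key, _) => key }.max)   // MAX(key) per group
      .toSeq
      .drop(2)                                              // OFFSET 2
      .take(1)                                              // LIMIT 1

  // SELECT * FROM src WHERE EXISTS (<subquery>)
  // The subquery is uncorrelated, so it keeps either every row or none.
  def outerQuery(src: Seq[Row]): Seq[Row] =
    if (subqueryRows(src).nonEmpty) src else Seq.empty

  def main(args: Array[String]): Unit = {
    val src = Seq((238, "val_238")) // the single row inserted by the repro
    println(s"subquery rows = ${subqueryRows(src).size}")
    println(s"result rows   = ${outerQuery(src).size}")
  }
}
```

With the single inserted row there is only one group, so `OFFSET 2` skips it, the subquery is empty, and `EXISTS` is false. Under these assumptions the repro query should return zero rows with the table's `key` and `value` columns rather than raising the column count mismatch.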

Expected behavior

No response

Additional context

No response

Labels

bug (Something isn't working)
