Skip to content

[BUG] Calcite PPL doesn't handle array value columns if codegen triggered #5065

@qianheng-aws

Description

@qianheng-aws

What is the bug?
If there is codegen triggered, i.e. not fully pushed down into DSL, PPL will fail when performing any array operation on an array value field.

How can one reproduce the bug?
Steps to reproduce the behavior:

###
POST localhost:9200/test_idx/_doc
Content-Type: application/json

{ "nums": [1, 2, 3] }

###
POST localhost:9200/_plugins/_ppl
Content-Type: application/json

{
  "query": "source = test_idx | expand nums"
}

Actual Result:

{
  "error": {
    "reason": "There was internal problem at backend",
    "details": "UNNEST argument must be a collection",
    "type": "CalciteException"
  },
  "status": 500
}

Exception and stack-trace in cluster:

[2026-01-23T13:16:27,185][ERROR][o.o.s.p.r.RestPPLQueryAction] [integTest-0] Error happened during query handling
org.apache.calcite.runtime.CalciteException: UNNEST argument must be a collection
        at java.base/jdk.internal.reflect.DirectConstructorHandleAccessor.newInstance(DirectConstructorHandleAccessor.java:62)
        at java.base/java.lang.reflect.Constructor.newInstanceWithCaller(Constructor.java:502)
        at java.base/java.lang.reflect.Constructor.newInstance(Constructor.java:486)
        at org.apache.calcite.runtime.Resources$ExInstWithCause.ex(Resources.java:511)
        at org.apache.calcite.runtime.Resources$ExInst.ex(Resources.java:605)
        at org.apache.calcite.rel.core.Uncollect.deriveUncollectRowType(Uncollect.java:177)
        at org.apache.calcite.rel.core.Uncollect.deriveRowType(Uncollect.java:131)
        at org.apache.calcite.rel.core.Uncollect.<init>(Uncollect.java:82)
        at org.apache.calcite.tools.RelBuilder.uncollect(RelBuilder.java:2368)
        at org.opensearch.sql.calcite.CalciteRelNodeVisitor.buildExpandRelNode(CalciteRelNodeVisitor.java:3354)
        at org.opensearch.sql.calcite.CalciteRelNodeVisitor.visitExpand(CalciteRelNodeVisitor.java:3084)
        at org.opensearch.sql.calcite.CalciteRelNodeVisitor.visitExpand(CalciteRelNodeVisitor.java:168)
        at org.opensearch.sql.ast.tree.Expand.accept(Expand.java:41)
        at org.opensearch.sql.ast.AbstractNodeVisitor.visitChildren(AbstractNodeVisitor.java:111)
        at org.opensearch.sql.calcite.CalciteRelNodeVisitor.visitProject(CalciteRelNodeVisitor.java:370)
        at org.opensearch.sql.calcite.CalciteRelNodeVisitor.visitProject(CalciteRelNodeVisitor.java:168)
        at org.opensearch.sql.ast.tree.Project.accept(Project.java:65)
        at org.opensearch.sql.calcite.CalciteRelNodeVisitor.analyze(CalciteRelNodeVisitor.java:181)
        at org.opensearch.sql.executor.QueryService.analyze(QueryService.java:256)
        at org.opensearch.sql.executor.QueryService.lambda$executeWithCalcite$0(QueryService.java:106)
        at org.opensearch.sql.calcite.CalcitePlanContext.run(CalcitePlanContext.java:161)
        at org.opensearch.sql.executor.QueryService.executeWithCalcite(QueryService.java:96)
        at org.opensearch.sql.executor.QueryService.execute(QueryService.java:73)
        at org.opensearch.sql.executor.execution.QueryPlan.execute(QueryPlan.java:66)
        at org.opensearch.sql.opensearch.executor.OpenSearchQueryManager.lambda$schedule$1(OpenSearchQueryManager.java:64)
        at org.opensearch.sql.opensearch.executor.OpenSearchQueryManager.lambda$withCurrentContext$2(OpenSearchQueryManager.java:89)
        at org.opensearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:916)
        at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144)
        at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642)
        at java.base/java.lang.Thread.run(Thread.java:1583)

Root Cause Analysis

OpenSearch index doesn't has type of ARRAY, and the dynamic mapping parses the field nums to be type of long. So our type system translate that into BasicSqlType(SqlTypeName.BIGINT) in Calcite while Calcite's Uncollect operator only accepts ArraySqlType

What is the expected behavior?
The query should succeed with nums exploded.

What is your host/environment?

Do you have any screenshots?
If applicable, add screenshots to help explain your problem.

Do you have any additional context?
Related issues: #4173, it has fixed by #5015 while that PR cannot handle the above case.

Metadata

Metadata

Assignees

No one assigned

    Labels

    PPLPiped processing languagebugSomething isn't working

    Type

    No type

    Projects

    Status

    Not Started

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions