Skip to content

ProjectionPushdown / ExprBoundaries col_index out of bounds when running multi-stage query against CubeStore pre-aggregation #10439

@draymondxie

Description

@draymondxie

Describe the Bug:

When executing a query that includes a multi_stage: true YoY measure (using time_shift) against an external pre-aggregation stored in CubeStore, the /v1/load endpoint returns an internal DataFusion error from CubeStore's ProjectionPushdown optimizer.

Error Message:

{
"error": "Error: Internal: ProjectionPushdown\ncaused by\nInternal error: Could not create ExprBoundaries: in try_from_column col_index \n has gone out of bounds with a value of 6, the schema has 6 columns..\nThis was likely caused by a bug in DataFusion's code and we would welcome that you file an bug report in our issue tracker"
}

Environment:

Component Version
@cubejs-backend/server 1.6.16
CubeStore 1.6.16
Node.js 24.x
Database Trino (Hive catalog)
Platform Linux (Red Hat UBI 9)
CUBEJS_TESSERACT_SQL_PLANNER true
CUBEJS_TESSERACT_PRE_AGGREGATIONS true

Steps to Reproduce:

  1. Cube Model (simplified)
cubes:
  - name: biz_mkt_volume
    sql_table: biz_mkt_volume
    data_source: default
    dimensions:
      - name: src
        sql: src
        type: string
      - name: por_country_cde
        sql: por_country_cde
        type: string
      - name: por_country_nme
        sql: por_country_nme
        type: string
      - name: hs2_code
        sql: hs2_code
        type: string
      - name: hs2_desc
        sql: hs2_desc
        type: string
      - name: dep_arr_date_ts
        sql: "date_parse(CAST(dep_arr_date AS VARCHAR), '%Y%m%d')"
        type: time
    measures:
      - name: teu
        sql: teu
        type: sum
      - name: teu_py
        type: number
        multi_stage: true
        sql: "{teu}"
        time_shift:
          - time_series: dep_arr_date_ts
            interval: 1 year
      - name: teu_yoy_pct
        type: number
        multi_stage: true
        sql: >
          CASE WHEN {teu_py} > 0
            THEN ({teu} - {teu_py}) * 100.0 / {teu_py}
            ELSE NULL
          END
    pre_aggregations:
      - name: rollup_yoy
        type: rollup
        measures:
          - teu
        dimensions:
          - src
          - por_country_cde
          - por_country_nme
          - hs2_code
          - hs2_desc
        time_dimension: dep_arr_date_ts
        granularity: year
        partition_granularity: year
        refresh_key:
          every: 1 day

views:
  - name: market_volume_fixed_period_comparison
    cubes:
      - join_path: biz_mkt_volume
        includes: "*"
  1. Environment Variables:
CUBEJS_TESSERACT_SQL_PLANNER=true
CUBEJS_TESSERACT_PRE_AGGREGATIONS=true
CUBEJS_CUBESTORE_HOST=localhost
CUBEJS_CUBESTORE_PORT=3030
  1. Query sent to POST /cubejs-api/v1/load:
{
  query: {
    dimensions: [
      market_volume_fixed_period_comparison.por_country_cde,
      market_volume_fixed_period_comparison.por_country_nme,
      market_volume_fixed_period_comparison.hs2_code,
      market_volume_fixed_period_comparison.hs2_desc
    ],
    measures: [
      market_volume_fixed_period_comparison.teu,
      market_volume_fixed_period_comparison.teu_yoy_pct
    ],
    timeDimensions: [
      {
        dimension: market_volume_fixed_period_comparison.dep_arr_date_ts,
        dateRange: [2024-01-01, 2025-12-31],
        granularity: year
      }
    ],
    filters: [
      {
        member: market_volume_fixed_period_comparison.src,
        operator: equals,
        values: [OB]
      },
      {
        member: market_volume_fixed_period_comparison.teu,
        operator: gte,
        values: [1000]
      }
    ],
    limit: 20
  }
}

Expected Behavior:
The query returns YoY comparison data (TEU and TEU YoY % change) grouped by country and HS2 code for the selected year range.

Actual Behavior:
CubeStore returns an internal DataFusion ProjectionPushdown error (see above). The query never completes.


Generated SQL (from POST /cubejs-api/v1/sql):

Click to expand full generated SQL WITH cte_0 AS ( SELECT "biz_mkt_volume__por_country_cde" "market_volume_fixed_period_comparison__por_country_cde", "biz_mkt_volume__por_country_nme" "market_volume_fixed_period_comparison__por_country_nme", "biz_mkt_volume__hs2_code" "market_volume_fixed_period_comparison__hs2_code", "biz_mkt_volume__hs2_desc" "market_volume_fixed_period_comparison__hs2_desc", date_trunc('year', ("biz_mkt_volume__dep_arr_date_ts_year" + interval '1 year')) "market_volume_fixed_period_comparison__dep_arr_date_ts_year", sum("biz_mkt_volume__teu") "biz_mkt_volume__teu" FROM dev_pre_aggregations.biz_mkt_volume_rollup_yoy WHERE ("biz_mkt_volume__dep_arr_date_ts_year" + interval '1 year') >= CAST(? AS TIMESTAMP) AND ("biz_mkt_volume__dep_arr_date_ts_year" + interval '1 year') <= CAST(? AS TIMESTAMP) AND "biz_mkt_volume__src" = ? GROUP BY 1, 2, 3, 4, 5 ORDER BY 5 ASC ), cte_1 AS ( SELECT "fk_aggregate"."market_volume_fixed_period_comparison__por_country_cde", "fk_aggregate"."market_volume_fixed_period_comparison__por_country_nme", "fk_aggregate"."market_volume_fixed_period_comparison__hs2_code", "fk_aggregate"."market_volume_fixed_period_comparison__hs2_desc", "fk_aggregate"."market_volume_fixed_period_comparison__dep_arr_date_ts_year", "fk_aggregate"."biz_mkt_volume__teu" "biz_mkt_volume__teu_py" FROM (SELECT * FROM cte_0) AS "fk_aggregate" ), cte_2 AS ( SELECT "biz_mkt_volume__por_country_cde" "market_volume_fixed_period_comparison__por_country_cde", "biz_mkt_volume__por_country_nme" "market_volume_fixed_period_comparison__por_country_nme", "biz_mkt_volume__hs2_code" "market_volume_fixed_period_comparison__hs2_code", "biz_mkt_volume__hs2_desc" "market_volume_fixed_period_comparison__hs2_desc", date_trunc('year', "biz_mkt_volume__dep_arr_date_ts_year") "market_volume_fixed_period_comparison__dep_arr_date_ts_year", sum("biz_mkt_volume__teu") "biz_mkt_volume__teu" FROM dev_pre_aggregations.biz_mkt_volume_rollup_yoy WHERE "biz_mkt_volume__dep_arr_date_ts_year" >= CAST(? AS TIMESTAMP) AND "biz_mkt_volume__dep_arr_date_ts_year" <= CAST(? AS TIMESTAMP) AND "biz_mkt_volume__src" = ? GROUP BY 1, 2, 3, 4, 5 ORDER BY 5 ASC ), cte_3 AS ( SELECT "fk_aggregate_keys"."market_volume_fixed_period_comparison__por_country_cde", "fk_aggregate_keys"."market_volume_fixed_period_comparison__por_country_nme", "fk_aggregate_keys"."market_volume_fixed_period_comparison__hs2_code", "fk_aggregate_keys"."market_volume_fixed_period_comparison__hs2_desc", "fk_aggregate_keys"."market_volume_fixed_period_comparison__dep_arr_date_ts_year", CASE WHEN "q_0"."biz_mkt_volume__teu_py" > 0 THEN ("q_1"."biz_mkt_volume__teu" - "q_0"."biz_mkt_volume__teu_py") * 100.0 / "q_0"."biz_mkt_volume__teu_py" ELSE NULL END "biz_mkt_volume__teu_yoy_pct" FROM ( SELECT DISTINCT "market_volume_fixed_period_comparison__por_country_cde", "market_volume_fixed_period_comparison__por_country_nme", "market_volume_fixed_period_comparison__hs2_code", "market_volume_fixed_period_comparison__hs2_desc", "market_volume_fixed_period_comparison__dep_arr_date_ts_year" FROM ( SELECT DISTINCT ... FROM cte_1 UNION ALL SELECT DISTINCT ... FROM cte_2 ) AS "pk_aggregate_keys_source" ) AS "fk_aggregate_keys" LEFT JOIN (SELECT * FROM cte_1) AS "q_0" ON <5-column key join> LEFT JOIN (SELECT * FROM cte_2) AS "q_1" ON <5-column key join> ) SELECT "fk_aggregate_keys"."market_volume_fixed_period_comparison__por_country_cde", "fk_aggregate_keys"."market_volume_fixed_period_comparison__por_country_nme", "fk_aggregate_keys"."market_volume_fixed_period_comparison__hs2_code", "fk_aggregate_keys"."market_volume_fixed_period_comparison__hs2_desc", "fk_aggregate_keys"."market_volume_fixed_period_comparison__dep_arr_date_ts_year", "q_0"."biz_mkt_volume__teu" "market_volume_fixed_period_comparison__teu", "q_1"."biz_mkt_volume__teu_yoy_pct" "market_volume_fixed_period_comparison__teu_yoy_pct" FROM ( ... UNION ALL SELECT DISTINCT ... FROM cte_3 ) AS "fk_aggregate_keys" LEFT JOIN ( ... HAVING sum(teu) >= ? ) AS "q_0" ON <5-column key join> LEFT JOIN ( SELECT * FROM cte_3 ) AS "q_1" ON <5-column key join> WHERE "q_0"."biz_mkt_volume__teu" >= ? ORDER BY 5 ASC LIMIT 20 Parameters (in order): 2024-01-01T00:00:00.000, 2025-12-31T23:59:59.999, OB, 2024-01-01T00:00:00.000, 2025-12-31T23:59:59.999, OB, 2024-01-01T00:00:00.000, 2025-12-31T23:59:59.999, OB, 1000, 2024-01-01T00:00:00.000, 2025-12-31T23:59:59.999, OB, 1000, 1000

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions