
pipeline function in QueryCache fails with ECONNRESET for lambda rollup pre-aggregations #9566

@viktordebulat


Describe the bug
We are encountering an issue with the pipeline call in QueryCache.ts when processing lambda rollup pre-aggregations for ClickHouse. Specifically, the error occurs while streaming data from tableData.rowStream to the writer. It is intermittent overall but consistently happens for certain request types after approximately 2 seconds:

Error: aborted
    at TLSSocket.socketCloseListener (node:_http_client:478:19)
    at TLSSocket.emit (node:events:530:35)
    at node:net:351:12
    at TCP.done (node:_tls_wrap:650:7) {
  code: 'ECONNRESET'
}
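
For reference, the failing code path presumably reduces to something like the sketch below: Node's stream pipeline wiring tableData.rowStream into the writer (the actual QueryCache.ts implementation may differ in details):

import { pipeline } from 'stream/promises';
import type { Readable, Writable } from 'stream';

// Assumed shape of the affected code path: rows read from the source database
// are piped into the writer for the external pre-aggregation. Per this report,
// the call rejects with ECONNRESET after ~2 seconds for lambda rollup requests.
async function writeRows(rowStream: Readable, writer: Writable): Promise<void> {
  await pipeline(rowStream, writer);
}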

The problem goes away when the affected code is replaced with direct iterator processing:

const iterator = tableData.rowStream[Symbol.asyncIterator]();
let result = await iterator.next();
while (!result.done) {
  writer.write(result.value);
  result = await iterator.next();
}
writer.end();

This workaround resolves the issue, but it bypasses the pipeline utility, which is designed to handle stream piping and error propagation.
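
One note if the workaround stays in place for now: the manual loop above ignores backpressure from the writer. A sketch that keeps the direct-iterator approach but waits for 'drain' whenever write() returns false (same rowStream/writer names as above, not the exact code we run):

import { once } from 'events';
import type { Readable, Writable } from 'stream';

// Backpressure-aware variant of the workaround: pause reading until the
// writer emits 'drain' whenever its internal buffer is full.
async function copyRows(rowStream: Readable, writer: Writable): Promise<void> {
  for await (const row of rowStream) {
    if (!writer.write(row)) {
      await once(writer, 'drain');
    }
  }
  writer.end();
}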

To Reproduce
Steps to reproduce the behavior:

  1. Create a cube with ClickHouse as the data source and declare rollup and rollup_lambda pre-aggregations in the pre_aggregations section.
  2. Trigger a request that processes a large dataset or hits the lambda rollup path (see the example query below).
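
For illustration, a query of roughly this shape is enough to hit the lambda rollup path, because the requested date range extends past the last built rollup partition (measure and dimension names are taken from the schema below; the actual requests in our setup differ):

// Hypothetical request body sent to the Cube REST API load endpoint.
const query = {
  measures: ['cube_total.total_amount', 'cube_total.total_transactions'],
  dimensions: ['cube_total.currency'],
  timeDimensions: [
    {
      dimension: 'cube_total.at',
      granularity: 'quarter',
      dateRange: ['2024-01-01', '2025-06-30'],
    },
  ],
};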

Expected behavior
The pipeline function should handle the streaming of data without prematurely aborting due to ECONNRESET.

Minimal reproducible Cube schema

cubes:
  - name: cube_total
    sql: >
       some select sql here

    measures:
      - name: total_transactions
        sql: transaction_id
        type: count

      - name: total_amount
        sql: amount
        type: sum

      - name: total_payout
        sql: payout
        type: sum

    dimensions:
      - name: user_id
        sql: user_id
        type: string

      - name: currency
        sql: currency
        type: string

      - name: at
        sql: at
        type: time

    pre_aggregations:
      - name: cube_total_rollup_lambda
        type: rollup_lambda
        union_with_source_data: true
        rollups:
          - CUBE.cube_total_rollup

      - name: cube_total_rollup
        type: rollup
        measures:
          - cube_total.total_transactions
          - cube_total.total_amount
          - cube_total.total_payout
        dimensions:
          - cube_total.user_id
          - cube_total.currency
        indexes:
          - name: user_rollup_user_id_index
            columns:
              - cube_total.user_id
        time_dimension: cube_total.at
        granularity: quarter
        external: true
        partition_granularity: quarter
        refresh_key:
          every: 1 day

Version:
Cube: 1.3.10, 1.3.11, 1.3.12...
ClickHouse: 25.3

Additional context
The issue occurs specifically for lambda rollup pre-aggregations. Other request types using the same pipeline function do not exhibit this behavior. This suggests the issue may be related to the characteristics of the tableData.rowStream for these specific requests.

I suggest improving the error handling around pipeline and possibly adding retries.
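
As a rough sketch of the retry idea (a hypothetical pipelineWithRetry helper, not existing Cube code), assuming the caller can recreate both streams for each attempt:

import { pipeline } from 'stream/promises';
import type { Readable, Writable } from 'stream';

// Hypothetical helper: retry the stream copy a few times when the source
// connection resets. streamFactory must return fresh streams on every
// attempt, since a failed stream cannot be reused.
async function pipelineWithRetry(
  streamFactory: () => Promise<{ source: Readable; sink: Writable }>,
  maxAttempts = 3,
): Promise<void> {
  for (let attempt = 1; ; attempt++) {
    const { source, sink } = await streamFactory();
    try {
      await pipeline(source, sink);
      return;
    } catch (err) {
      if ((err as NodeJS.ErrnoException)?.code !== 'ECONNRESET' || attempt >= maxAttempts) {
        throw err;
      }
      // Otherwise fall through and retry with fresh streams.
    }
  }
}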

Labels: driver:clickhouse (Issues related to the ClickHouse driver)
