[FLINK-39030][Table SQL/API] Support emit-only-on-update mode for CUMULATE window TVF to reduce unnecessary outputs #27533

Open
Myracle wants to merge 1 commit into apache:master from Myracle:FLINK-39030-Support-TVF-emit-only-on-update

Myracle commented Feb 5, 2026

What is the purpose of the change

This pull request adds support for emit-only-on-update mode in the CUMULATE window Table-Valued Function (TVF). When enabled, the CUMULATE window will skip emitting results for step intervals where no new data has arrived, reducing unnecessary duplicate outputs in sparse data streams. This optimization is particularly useful for scenarios where data arrives intermittently, as it avoids repeatedly outputting the same aggregation results.
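The effect described above can be illustrated with a small standalone simulation. This is only a sketch of the intended semantics, not the code in this pull request: the class `CumulateEmitSketch` and the method `emittedWindows` are hypothetical names, the aggregate is a plain count, a single max-size window starting at 0 is considered, and it is assumed that windows containing no records at all are never emitted in either mode.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not part of the Flink code base.
final class CumulateEmitSketch {

    /**
     * Lists the cumulate windows "[0,end)=count" emitted for one max-size
     * window starting at 0, with or without emit-only-on-update.
     */
    static List<String> emittedWindows(
            long[] eventTimes, long step, long maxSize, boolean emitOnlyOnUpdate) {
        List<String> out = new ArrayList<>();
        for (long end = step; end <= maxSize; end += step) {
            long total = 0;      // records in [0, end)
            long inLastStep = 0; // records in [end - step, end)
            for (long t : eventTimes) {
                if (t >= 0 && t < end) {
                    total++;
                    if (t >= end - step) {
                        inLastStep++;
                    }
                }
            }
            if (total == 0) {
                continue; // assumption: empty windows are never emitted
            }
            if (emitOnlyOnUpdate && inLastStep == 0) {
                continue; // no new data in this step interval: skip the repeat
            }
            out.add("[0," + end + ")=" + total);
        }
        return out;
    }

    public static void main(String[] args) {
        long[] events = {1, 2, 7};
        // Default mode: [[0,2)=1, [0,4)=2, [0,6)=2, [0,8)=3, [0,10)=3]
        System.out.println(emittedWindows(events, 2, 10, false));
        // Emit-only-on-update: [[0,2)=1, [0,4)=2, [0,8)=3]
        System.out.println(emittedWindows(events, 2, 10, true));
    }
}
```

With records at times 1, 2, and 7, a step of 2, and a max size of 10, the windows ending at 6 and 10 repeat the previous count, so they are the ones dropped when the flag is on.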

Brief change log

  • Added a new optional boolean parameter emitOnlyOnUpdate (6th parameter) to the CUMULATE window TVF SQL function
  • Extended CumulativeWindowSpec to support the new emitOnlyOnUpdate field with JSON serialization/deserialization
  • Modified CumulativeSliceAssigner in SliceAssigners to track and expose the emitOnlyOnUpdate configuration
  • Updated SliceSharedSyncStateWindowAggProcessor and AsyncStateSliceSharedWindowAggProcessor to implement the emit-only-on-update logic by tracking whether new data arrived in the current step interval
  • Changed SqlWindowTableFunction.checkIntervalOperands() to skip boolean parameters dynamically instead of relying on a hardcoded parameter-count check
  • Added integration tests for the new feature with various scenarios (enabled/disabled comparison, multi-key grouping)
  • Updated documentation (both English and Chinese) with parameter description and usage examples
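The checkIntervalOperands() item in the change log can be pictured with the simplified sketch below. It does not use the planner's real types: the `Kind` enum and the method `intervalOperandsValid` are hypothetical stand-ins for the validated operand types, and the actual method in this pull request may be structured differently. The idea shown is only that boolean flags are skipped while every other trailing operand must still be an interval.

```java
import java.util.List;

// Hypothetical simplification; the real check works on validated SQL operand types.
final class IntervalOperandCheckSketch {

    enum Kind { TABLE, DESCRIPTOR, INTERVAL, BOOLEAN, OTHER }

    /**
     * Returns true when every operand from index {@code from} onward is an
     * INTERVAL, ignoring BOOLEAN flags wherever they appear.
     */
    static boolean intervalOperandsValid(List<Kind> operands, int from) {
        for (int i = from; i < operands.size(); i++) {
            Kind kind = operands.get(i);
            if (kind == Kind.BOOLEAN) {
                continue; // flags such as emitOnlyOnUpdate are not intervals
            }
            if (kind != Kind.INTERVAL) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // TABLE, DESCRIPTOR, step, size, offset, emitOnlyOnUpdate
        List<Kind> call = List.of(
                Kind.TABLE, Kind.DESCRIPTOR,
                Kind.INTERVAL, Kind.INTERVAL, Kind.INTERVAL, Kind.BOOLEAN);
        System.out.println(intervalOperandsValid(call, 2));
    }
}
```

Skipping by operand kind rather than by position means the check does not need to know how many optional parameters precede the flag.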

Verifying this change

This change added tests and can be verified as follows:

  • Added testCumulateWindowWithEmitOnlyOnUpdate() integration test that verifies windows are skipped when no new data arrives in a step interval
  • Added testCumulateWindowWithEmitOnlyOnUpdateDisabled() integration test that verifies the default behavior (emitOnlyOnUpdate=false) still emits all cumulating windows
  • Added testCumulateWindowWithEmitOnlyOnUpdateAndMultiKey() integration test that verifies the feature works correctly with multiple grouping keys
  • Manually verified the change by comparing output row counts between emitOnlyOnUpdate=true and emitOnlyOnUpdate=false modes

Does this pull request potentially affect one of the following parts:

  • Dependencies (does it add or upgrade a dependency): no
  • The public API, i.e., is any changed class annotated with @Public(Evolving): no
  • The serializers: no (only added new optional field with backward-compatible JSON deserialization)
  • The runtime per-record code paths (performance sensitive): yes (added a boolean check in the window aggregation output path, but the overhead is minimal)
  • Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Kubernetes/Yarn, ZooKeeper: no
  • The S3 file system connector: no

Documentation

  • Does this pull request introduce a new feature? yes
  • If yes, how is the feature documented? docs / JavaDocs
    • Updated docs/content/docs/dev/table/sql/queries/window-tvf.md (English)
    • Updated docs/content.zh/docs/dev/table/sql/queries/window-tvf.md (Chinese)
    • Added JavaDoc comments for the new parameter in SqlCumulateTableFunction.java

flinkbot commented Feb 5, 2026

CI report:

Bot commands

The @flinkbot bot supports the following commands:

  • @flinkbot run azure: re-run the last Azure build
