Skip to content

[Refactor] Remove non-pipeline dead code from exec nodes#70133

Open
Copilot wants to merge 8 commits intomainfrom
copilot/remove-non-pipeline-exec-code
Open

[Refactor] Remove non-pipeline dead code from exec nodes#70133
Copilot wants to merge 8 commits intomainfrom
copilot/remove-non-pipeline-exec-code

Conversation

Copy link
Contributor

Copilot AI commented Mar 11, 2026

Non-pipeline prepare/open/get_next implementations in be/src/exec/*_node.cpp have been dead code since the pipeline engine became the sole execution path. This PR formalizes that by introducing a PipelineNode base class and removing the dead code tree.

Why I'm doing:

Non-pipeline prepare/open/get_next implementations in be/src/exec/*_node.cpp have been dead code since the pipeline engine became the sole execution path. Creating a PipelineNode base class formalizes this and removes the dead code.

What I'm doing:

Introduces PipelineNode (be/src/exec/pipeline_node.h / pipeline_node.cpp) as an abstract base class between ExecNode and all pipeline-only nodes. It provides default NotSupported implementations for prepare, open, and get_next. Migrates 19 node classes to extend PipelineNode and removes all dead non-pipeline helper functions and member variables (~1,580 lines removed).

What type of PR is this:

  • BugFix
  • Feature
  • Enhancement
  • Refactor
  • UT
  • Doc
  • Tool

Does this PR entail a change in behavior?

  • Yes, this PR will result in a change in behavior.
  • No, this PR will not result in a change in behavior.

If yes, please specify the type of change:

  • Interface/UI changes: syntax, type conversion, expression evaluation, display information
  • Parameter changes: default values, similar parameters but with different default values
  • Policy changes: use new policy to replace old one, functionality automatically enabled
  • Feature removed
  • Miscellaneous: upgrade & downgrade compatibility, etc.

Checklist:

  • I have added test cases for my bug fix or my new feature
  • This pr needs user documentation (for new or modified features or behaviors)
    • I have added documentation for my new feature or new function
    • This pr needs auto generate documentation
  • This is a backport pr

Bugfix cherry-pick branch check:

  • I have checked the version labels which the pr will be auto-backported to the target branch
    • 4.1
    • 4.0
    • 3.5
    • 3.4
Original prompt

Goal

Remove legacy non-pipeline (exec-only) code paths from be/src/exec/*_node.cpp for prepare/open/get_next, keeping only the nodes that are still needed:

  • Reserve only *scan_node and *sink nodes.
  • For other exec nodes, remove/disable the non-pipeline execution engine code so they are effectively pipeline-only.

The request from the user: "remove no-pipeline code be/src/exec/*_node.cpp prepare/open/get_next. Note: only reserve *scan_node and *sink" and follow-up "exec-only".

Scope / Files

Primarily affects C++ backend execution nodes under:

  • be/src/exec/*_node.cpp

Likely includes (non-exhaustive):

  • be/src/exec/topn_node.cpp
  • be/src/exec/fetch_node.cpp
  • be/src/exec/lookup_node.cpp
  • be/src/exec/select_node.cpp
  • be/src/exec/repeat_node.cpp
  • be/src/exec/analytic_node.cpp
  • be/src/exec/exchange_node.cpp
  • be/src/exec/union_node.cpp
  • be/src/exec/except_node.cpp
  • be/src/exec/project_node.cpp
  • plus any other *_node.cpp that still contains exec-engine implementations in prepare/open/get_next.

Nodes to keep exec-only implementations (do not remove their non-pipeline prepare/open/get_next):

  • *scan_node* (e.g., olap_scan_node.cpp, file_scan_node.cpp, etc.)
  • *sink* nodes (note: these may not be in *_node.cpp, but preserve any relevant sink classes if present).

Expected Changes

  1. For non-scan and non-sink nodes in be/src/exec/*_node.cpp:

    • Remove or guard (compile-time or runtime) the non-pipeline logic in prepare(), open(), and get_next().
    • Ensure these methods either:
      • return Status::NotSupported("...") with a clear message like "non-pipeline execution is removed"; or
      • delegate to pipeline-only mechanisms (e.g., decompose_to_pipeline) and keep no exec-engine logic.
  2. Ensure build succeeds:

    • Update headers if needed (method declarations or overrides).
    • Remove now-unused includes.
    • Fix compilation errors due to removed members only used by exec-only path.
  3. Ensure runtime behavior is clear:

    • If any exec-only entrypoint can still call these methods, they should fail fast with explicit error messages.
  4. Add/adjust tests if the project has suitable unit/integration test coverage for plan execution selection.

Notes

  • The repository PR template is at .github/PULL_REQUEST_TEMPLATE.md and the eventual PR description must strictly follow it.
  • Keep pipeline engine code intact (e.g., decompose_to_pipeline and pipeline operators).

Acceptance Criteria

  • All non-scan/non-sink *_node.cpp files no longer contain functional exec-only implementations in prepare/open/get_next.
  • Scan nodes and sink-related code paths remain intact.
  • Code builds cleanly.
  • Any attempted use of removed exec-only paths returns a clear NotSupported/InternalError indicating non-pipeline execution has been removed.

The following is the prior conversation context from the user's chat exploration (may be truncated):

User: remove no-pipeline code be/src/exec/*_node.cpp prepare/open/get_next
Note: only reserve *scan_node and *sink

Assistant: I can do this, but I need one clarification before I start making code changes:

  • When you say “remove no-pipeline code be/src/exec/*_node.cpp prepare/open/get_next”, do you mean:
    1. Delete the non-pipeline implementations (prepare/open/get_next) from all *_node.cpp (except *scan_node* and *sink*) and replace them with something like return Status::NotSupported("non-pipeline ...");, while keeping pipeline decomposition (decompose_to_pipeline) working, or
    2. Remove non-pipeline branches inside those methods (e.g., if (!state->enable_pipeline_engine()) ...) but keep the methods otherwise functional?

From a quick scan, the repo doesn’t literally use the string "no-pipeline" in those files; it uses terms like “non-pipeline” (example: ConnectorScanNode comment “for non-pipeline APIs”). citeturn1search0

What I found so far (partial; GitHub search results are capped)

I located some *_node.cpp files under be/src/exec/ (this is not guaranteed complete due to search limits), including:

  • be/src/exec/file_scan_node.cpp (SCAN — should be preserved)
  • be/src/exec/topn_node.cpp
  • be/src/exec/exchange_node.cpp
  • be/src/exec/union_node.cpp
  • be/src/exec/select_node.cpp
  • be/src/exec/repeat_node.cpp
  • be/src/exec/fetch_node.cpp
  • be/src/exec/lookup_node.cpp

To view/expand the list in GitHub UI, use a repo search like:

repo:StarRocks/starrocks path:be/src/exec filename:_node.cpp

Scan/Sink exclusions you requested

  • Confirmed a scan node example: file_scan_node.cpp (inherits ScanNode). citeturn2search0
  • For sinks, note they are often under be/src/runtime/*sink*.cpp (e.g. result_sink.cpp) rather than be/src/exec/*_node.cpp. citeturn0search3
    Please confirm whether your “only reserve *sink” exception applies outside be/src/exec/*_node.cpp too (i.e., don’t touc...

This pull request was created from Copilot chat.


🔒 GitHub Advanced Security automatically protects Copilot coding agent pull requests. You can protect all pull requests by enabling Advanced Security for your repositories. Learn more about Advanced Security.

Copilot AI and others added 2 commits March 11, 2026 09:26
Co-authored-by: stdpain <34912776+stdpain@users.noreply.github.com>
Co-authored-by: stdpain <34912776+stdpain@users.noreply.github.com>
Copilot AI changed the title [WIP] Remove legacy non-pipeline code from execution nodes [Refactor] Remove non-pipeline exec-only code from be/src/exec/*_node.cpp prepare/open/get_next Mar 11, 2026
Copilot AI and others added 3 commits March 11, 2026 10:05
Remove stub prepare/open/get_next implementations and associated
dead helper functions and member variables from nodes that inherit
PipelineNode, which provides NotSupported defaults for these methods.

Files changed:
- hash_join_node: Remove stubs, _init_hash_table_param, and all
  dead helper methods (_has_null, _build, _probe, _probe_remain,
  _evaluate_build_keys, _calc_filter_for_other_conjunct,
  _process_*_conjunct, _push_down_in_filter, _do_publish_runtime_filters,
  _create_implicit_local_join_runtime_filters). Remove dead member
  variables and profiling counters.
- cross_join_node: Remove dead helper methods (_get_next_probe_chunk,
  6x _copy_* methods, get_next_internal, _init_row_desc, _build,
  _init_chunk) and associated member variables.
- topn_node: Remove stubs and _consume_chunks with _sort_timer.
- table_function_node: Remove stubs, build_chunk, get_next_input_chunk,
  and dead member variables (_fn_result_slots, _outer_slots, _param_slots,
  _input_chunk_ptr, _input_chunk_seek_rows, _outer_column_remain_repeat_times,
  _table_function_result, _table_function_exec_timer).
- repeat_node: Remove stubs, extend_and_update_columns, and timer members.
- select_node: Remove stubs and _conjunct_evaluate_timer.
- aggregate_streaming_node: Remove stubs and _output_chunk_from_hash_map.
- distinct_streaming_node: Remove stubs and _output_chunk_from_hash_set.
- intersect_node: Remove stubs and comment block.
- aggregate_base_node, aggregate_blocking_node, distinct_blocking_node:
  Remove stubs.
- lookup_node, raw_values_node: Remove stubs.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: stdpain <34912776+stdpain@users.noreply.github.com>
Co-authored-by: stdpain <34912776+stdpain@users.noreply.github.com>
Copilot AI changed the title [Refactor] Remove non-pipeline exec-only code from be/src/exec/*_node.cpp prepare/open/get_next [Refactor] Introduce PipelineNode base class; remove non-pipeline dead code from exec nodes Mar 11, 2026
@stdpain stdpain changed the title [Refactor] Introduce PipelineNode base class; remove non-pipeline dead code from exec nodes [Refactor] Remove non-pipeline dead code from exec nodes Mar 11, 2026
@stdpain stdpain marked this pull request as ready for review March 11, 2026 10:42
@stdpain stdpain requested a review from a team as a code owner March 11, 2026 10:42
@stdpain
Copy link
Contributor

stdpain commented Mar 11, 2026

@copilot The PR description must strictly follow the project's GitHub PR template.

Copy link
Contributor Author

Copilot AI commented Mar 11, 2026

@copilot The PR description must strictly follow the project's GitHub PR template.

Updated the PR description to follow the project template.

trueeyu
trueeyu previously approved these changes Mar 11, 2026
Signed-off-by: stdpain <drfeng08@gmail.com>
@github-actions
Copy link
Contributor

[Java-Extensions Incremental Coverage Report]

pass : 0 / 0 (0%)

@github-actions
Copy link
Contributor

[FE Incremental Coverage Report]

pass : 0 / 0 (0%)

@github-actions
Copy link
Contributor

[BE Incremental Coverage Report]

fail : 13 / 21 (61.90%)

file detail

path covered_line new_line coverage not_covered_line_detail
🔵 be/src/exec/empty_set_node.h 0 1 00.00% [43]
🔵 be/src/exec/pipeline_node.cpp 0 6 00.00% [19, 20, 23, 24, 27, 28]
🔵 be/src/exec/repeat_node.h 0 1 00.00% [27]
🔵 be/src/exec/topn_node.cpp 1 1 100.00% []
🔵 be/src/exec/empty_set_node.cpp 1 1 100.00% []
🔵 be/src/exec/intersect_node.cpp 1 1 100.00% []
🔵 be/src/exec/raw_values_node.cpp 1 1 100.00% []
🔵 be/src/exec/dict_decode_node.cpp 1 1 100.00% []
🔵 be/src/exec/cross_join_node.cpp 1 1 100.00% []
🔵 be/src/exec/lookup_node.cpp 1 1 100.00% []
🔵 be/src/exec/except_node.cpp 1 1 100.00% []
🔵 be/src/exec/aggregate/aggregate_base_node.cpp 1 1 100.00% []
🔵 be/src/exec/select_node.cpp 1 1 100.00% []
🔵 be/src/exec/table_function_node.cpp 1 1 100.00% []
🔵 be/src/exec/pipeline_node.h 2 2 100.00% []

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants