GrigoriyPA
Collaborator

Changelog entry

Support passing the checkpoint when a streaming query is manually restarted (previously, the checkpoint was always lost when ALTER STREAMING QUERY was called)

Changelog category

  • Improvement

Description for reviewers

  • Also fixed SLJ checkpoint waiting

GrigoriyPA requested review from a team as code owners on October 10, 2025 15:49

🟢 2025-10-10 15:49:29 UTC The validation of the Pull Request description is successful.


github-actions bot commented Oct 10, 2025

2025-10-10 15:50:29 UTC Pre-commit check linux-x86_64-release-asan for 2205664 has started.
2025-10-10 15:50:32 UTC Artifacts will be uploaded here
2025-10-10 15:54:36 UTC ya make is running...
🔴 2025-10-10 16:46:59 UTC Build failed, see the logs. Also see fail summary
🟡 2025-10-10 16:47:16 UTC ydbd size 3.7 GiB changed* by +1.1 MiB, which is >= 100.0 KiB vs main: Warning

| ydbd size dash | main: a446a72 | merge: 2205664 | diff | diff % |
|---|---|---|---|---|
| ydbd size | 4 023 586 624 Bytes | 4 024 728 360 Bytes | +1.1 MiB | +0.028% |
| ydbd stripped size | 1 494 478 880 Bytes | 1 494 946 176 Bytes | +456.3 KiB | +0.031% |

*Please be aware that the difference is based on comparing your commit with the last completed post-commit build; check the comparison.

Contributor

Copilot AI left a comment

Pull Request Overview

This PR adds support for preserving checkpoints when streaming queries are manually restarted via ALTER STREAMING QUERY. Previously, checkpoints were always lost when the query was restarted, but now the checkpoint state can be maintained across restarts.

Key changes include:

  • Enhanced checkpoint management to support passing checkpoint IDs through query restarts
  • Fixed checkpoint waiting logic for Stream Lookup Join (SLJ) operations
  • Added comprehensive test coverage for checkpoint recovery scenarios
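
As a rough illustration of the behavior change described above, the following minimal sketch contrasts a restart that drops the checkpoint with one that forwards it to the new run. All names here (`RunSettings`, `StreamingQuery`, `RestartWithoutCheckpoint`, `RestartWithCheckpoint`) are hypothetical and do not come from the YDB code base; the real change threads a `CheckpointId` through several actors and protobuf messages, which this sketch does not attempt to model.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>
#include <string>

// Hypothetical, simplified model; not the actual YDB types.
struct RunSettings {
    std::string QueryText;
    std::optional<std::uint64_t> CheckpointId; // checkpoint to restore from, if any
};

struct StreamingQuery {
    std::string QueryText;
    std::optional<std::uint64_t> LastCheckpointId; // last completed checkpoint, if any
};

// Behavior before the PR, as described: a manual restart always loses the checkpoint.
RunSettings RestartWithoutCheckpoint(const StreamingQuery& query) {
    return RunSettings{query.QueryText, std::nullopt};
}

// Behavior after the PR, as described: the checkpoint is passed to the new run.
RunSettings RestartWithCheckpoint(const StreamingQuery& query) {
    return RunSettings{query.QueryText, query.LastCheckpointId};
}

// Usage example: a query that has completed checkpoint 42 is restarted both ways.
void Example() {
    StreamingQuery query{"hypothetical query text", std::uint64_t{42}};
    assert(!RestartWithoutCheckpoint(query).CheckpointId.has_value());
    assert(RestartWithCheckpoint(query).CheckpointId == std::uint64_t{42});
}
```

Using `std::optional` keeps the "no checkpoint yet" case explicit, so a query that has never completed a checkpoint restarts from scratch in both variants.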

Reviewed Changes

Copilot reviewed 25 out of 25 changed files in this pull request and generated 3 comments.

| File | Description |
|---|---|
| ydb/library/yql/providers/pq/task_meta/task_meta.h | Added new function declaration and namespace restructuring |
| ydb/library/yql/providers/pq/task_meta/task_meta.cpp | Implemented new function to get topic partitions sets from DQ tasks |
| ydb/library/yql/providers/generic/connector/libcpp/ut_helpers/connector_client_mock.h | Added reading lock/unlock functionality for testing |
| ydb/library/yql/dq/tasks/dq_tasks_graph.h | Added comments explaining checkpoint injection logic |
| ydb/library/yql/dq/state/dq_state_load_plan.cpp | Updated to handle multiple partition sets instead of single set |
| ydb/library/yql/dq/actors/compute/dq_sync_compute_actor_base.h | Enhanced checkpoint readiness check to include sources and transforms |
| ydb/library/yql/dq/actors/compute/dq_compute_actor_impl.h | Removed unnecessary parentheses in conditional |
| ydb/library/yql/dq/actors/compute/dq_compute_actor_checkpoints.cpp | Added comments explaining checkpoint injection logic |
| ydb/core/protos/kqp.proto | Added CheckpointId and QueryTextRevision fields to protobuf messages |
| ydb/core/kqp/ut/federated_query/datastreams/datastreams_ut.cpp | Added comprehensive test cases for checkpoint recovery scenarios |
| ydb/core/kqp/run_script_actor/ya.make | Added dependency for checkpointing events |
| ydb/core/kqp/run_script_actor/kqp_run_script_actor.h | Added CheckpointId field to settings |
| ydb/core/kqp/run_script_actor/kqp_run_script_actor.cpp | Implemented checkpoint handling logic in run script actor |
| ydb/core/kqp/proxy_service/kqp_script_executions.h | Updated function signature for saving physical graph |
| ydb/core/kqp/proxy_service/kqp_script_executions.cpp | Enhanced script execution with checkpoint and generation handling |
| ydb/core/kqp/gateway/behaviour/streaming_query/ya.make | Added proxy service dependency |
| ydb/core/kqp/gateway/behaviour/streaming_query/queries.cpp | Implemented checkpoint recovery logic for streaming queries |
| ydb/core/kqp/gateway/behaviour/streaming_query/common/utils.h | Added QueryTextRevision field to streaming query settings |
| ydb/core/kqp/gateway/behaviour/streaming_query/common/utils.cpp | Implemented parsing for QueryTextRevision |
| ydb/core/kqp/executer_actor/kqp_data_executer.cpp | Enhanced checkpoint coordinator with proper state loading and checkpoint ID handling |
| ydb/core/kqp/common/kqp_user_request_context.h | Added CheckpointId field to user request context |
| ydb/core/kqp/common/kqp_user_request_context.cpp | Implemented CheckpointId serialization in user context |
| ydb/core/kqp/common/events/script_executions.h | Updated event signatures to support generation tracking |
| ydb/core/kqp/common/events/events.h | Added Generation and CheckpointId fields to script events |
| ydb/core/fq/libs/checkpointing/checkpoint_coordinator.cpp | Added logging for checkpoint restoration process |


github-actions bot commented Oct 10, 2025

2025-10-10 15:58:16 UTC Pre-commit check linux-x86_64-relwithdebinfo for 2205664 has started.
2025-10-10 15:58:31 UTC Artifacts will be uploaded here
2025-10-10 16:02:46 UTC ya make is running...
🔴 2025-10-10 16:56:18 UTC Build failed, see the logs. Also see fail summary
