Skip to content

Pipe/IoTV2: Persist progress index locally before shutdown to accurate recovery after restart #15779

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 3 commits into from
Jun 19, 2025

Conversation

SteveYurongSu
Copy link
Member

No description provided.

@SteveYurongSu SteveYurongSu requested review from Copilot, Pengzna and Caideyipi and removed request for Copilot June 18, 2025 11:03
@SteveYurongSu SteveYurongSu self-assigned this Jun 18, 2025
@SteveYurongSu SteveYurongSu requested a review from Copilot June 18, 2025 11:03
Copy link
Contributor

@Copilot Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR adds local persistence of pipe progress indexes during shutdown to enable accurate recovery after restarts.

  • Exposes and propagates persistProgressIndex from task to runtime level
  • Hooks into the shutdown sequence to trigger persistence via the task agent
  • Implements persistAllProgressIndexLocally with locking and logging in the task agent

Reviewed Changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 1 comment.

File Description
PipeTaskMeta.java Changed persistProgressIndex visibility from private to public to allow external calls
PipeRuntimeMeta.java Added persistProgressIndex to iterate over all task metas and persist progress
DataNodeShutdownHook.java Calls persistAllProgressIndexLocally before shutting down the node
PipeDataNodeTaskAgent.java Introduced persistAllProgressIndexLocally with a read lock and informational logging
Comments suppressed due to low confidence (5)

iotdb-core/datanode/src/main/java/org/apache/iotdb/db/pipe/agent/task/PipeDataNodeTaskAgent.java:847

  • [nitpick] Consider pluralizing 'progress index' to 'progress indexes' and rephrasing to 'Failed to persist progress indexes locally due to timeout.' for clarity.
      LOGGER.info("Failed to persist all progress index locally because of timeout.");

iotdb-core/datanode/src/main/java/org/apache/iotdb/db/pipe/agent/task/PipeDataNodeTaskAgent.java:854

  • [nitpick] Rephrase to 'Successfully persisted progress indexes locally.' to match plural usage and improve readability.
      LOGGER.info("Persist all progress index locally successfully.");

iotdb-core/node-commons/src/main/java/org/apache/iotdb/commons/pipe/agent/task/meta/PipeRuntimeMeta.java:143

  • Add a JavaDoc comment explaining the purpose and thread-safety guarantees of this public method.
  public void persistProgressIndex() {

iotdb-core/datanode/src/main/java/org/apache/iotdb/db/pipe/agent/task/PipeDataNodeTaskAgent.java:845

  • Provide JavaDoc for this shutdown method describing its behavior, locking semantics, and any exceptions it may throw.
  public void persistAllProgressIndexLocally() {

iotdb-core/datanode/src/main/java/org/apache/iotdb/db/service/DataNodeShutdownHook.java:121

  • Fix grammar in the comment: change 'to accurate recovery' to 'to ensure accurate recovery' for clarity.
    // Persist progress index before shutdown to accurate recovery after restart

Copy link

Copy link

codecov bot commented Jun 18, 2025

Codecov Report

Attention: Patch coverage is 0% with 17 lines in your changes missing coverage. Please review.

Project coverage is 39.11%. Comparing base (bbe031a) to head (766c8c1).
Report is 2 commits behind head on master.

Files with missing lines Patch % Lines
...otdb/db/pipe/agent/task/PipeDataNodeTaskAgent.java 0.00% 11 Missing ⚠️
.../commons/pipe/agent/task/meta/PipeRuntimeMeta.java 0.00% 5 Missing ⚠️
.../apache/iotdb/db/service/DataNodeShutdownHook.java 0.00% 1 Missing ⚠️
Additional details and impacted files
@@             Coverage Diff              @@
##             master   #15779      +/-   ##
============================================
- Coverage     39.12%   39.11%   -0.01%     
  Complexity      198      198              
============================================
  Files          4837     4837              
  Lines        314486   314506      +20     
  Branches      39449    39452       +3     
============================================
- Hits         123033   123021      -12     
- Misses       191453   191485      +32     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@SteveYurongSu SteveYurongSu merged commit e227a53 into master Jun 19, 2025
54 of 57 checks passed
@SteveYurongSu SteveYurongSu deleted the record-progress-when-shutdown branch June 19, 2025 04:05
Caideyipi pushed a commit to Caideyipi/iotdb that referenced this pull request Jun 26, 2025
Caideyipi added a commit that referenced this pull request Jun 26, 2025
…e load conversion in parallel commit & Avoid lock acquisition in pipe task agent for too long & Used cached thread pool + caller runs policy for schema parallel task & Removed WAL usage for pipe and added insertNode mem-ctrl & Lengthened the start pipe reload interval & Enabled waiting to send + progressIndex persist at shutdown hook (#15826)

* partial

Update PipeDataNodeRuntimeAgent.java

Update PipeTerminateEvent.java

Update PipeTerminateEvent.java

partially

Update AbstractOperatePipeProcedureV2.java

partial

Update PipeTaskAgent.java

Update DataNodeInternalRPCServiceImpl.java

schema pool

PR Revert

report-interval

* wal deletion

* continue deletion

* Update CheckpointManager.java

* Release memory

* Parital

* update PipeInsertNodeTabletInsertionEvent

* Remove useless

* Update PipeRealtimeDataRegionHybridExtractor.java

* Update Metric.java

* setup

* Fix

* modify pipeStuckRestartMinIntervalMs

* Pipe/IoTV2: Persist progress index locally before shutdown to accurate recovery after restart (#15779)

* add MaxWaitFinishTime

* update PipeConfig

* Fix

* Update WalDeleteOutdatedNewTest.java

* Update PipeMemoryManager.java

* Update CommonConfig.java

---------

Co-authored-by: luoluoyuyu <zhenyu@apache.org>
Co-authored-by: VGalaxies <vgalaxies@apache.org>
Co-authored-by: Steve Yurong Su <rong@apache.org>
SteveYurongSu added a commit that referenced this pull request Jul 8, 2025
…e recovery after restart (#15779)

(cherry picked from commit e227a53)
Caideyipi pushed a commit that referenced this pull request Jul 9, 2025
…e recovery after restart (#15779) (#15887)

(cherry picked from commit e227a53)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants