Conversation

@ask-kamal-nayan

@ask-kamal-nayan ask-kamal-nayan commented May 19, 2025

Description

This PR changes the runAfterRefreshWithPermit execution to run asynchronously using the thread pool scheduler. This prevents the refresh thread from blocking while segments are uploaded to remote storage. Primaries become searchable as soon as the refresh completes, without waiting for segments to be uploaded to remote storage.

Changes

  • Modified the runAfterRefreshWithPermit execution to run asynchronously using threadPool.schedule.
  • The operation is scheduled with zero delay, so it still starts immediately while running asynchronously.
  • Uses the retry thread pool for scheduling the task.
  • Updated the code to skip uploading the segmentN file to the remote store.
  • Updated the snapshot restore code to use only the segment metadata file when restoring, because the segmentN file is no longer uploaded.
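
As a rough illustration of the zero-delay scheduling described above, the sketch below uses the JDK's `ScheduledExecutorService` as a stand-in for the OpenSearch thread pool. The class name `AsyncAfterRefreshSketch`, the method `afterRefresh`, and the simulated upload task are hypothetical and are not taken from this PR's diff; it only shows that the calling thread returns before the scheduled work finishes.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;

// Hypothetical stand-in: a JDK scheduler plays the role of the OpenSearch thread pool.
public class AsyncAfterRefreshSketch {
    private final ScheduledExecutorService scheduler;

    public AsyncAfterRefreshSketch(ScheduledExecutorService scheduler) {
        this.scheduler = scheduler;
    }

    // Called on the (simulated) refresh thread. The task is handed to the
    // scheduler with zero delay, so this method returns without waiting for it.
    public ScheduledFuture<?> afterRefresh(Runnable uploadTask) {
        return scheduler.schedule(uploadTask, 0, TimeUnit.MILLISECONDS);
    }

    public static void main(String[] args) throws Exception {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        AsyncAfterRefreshSketch listener = new AsyncAfterRefreshSketch(scheduler);
        CountDownLatch uploadMayFinish = new CountDownLatch(1);

        // Simulated slow upload: it blocks until the latch is released.
        ScheduledFuture<?> upload = listener.afterRefresh(() -> {
            try {
                uploadMayFinish.await();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });

        // The "refresh thread" gets here while the upload is still pending.
        System.out.println("refresh returned, upload done: " + upload.isDone());

        uploadMayFinish.countDown();
        upload.get(5, TimeUnit.SECONDS);
        System.out.println("upload done: " + upload.isDone());
        scheduler.shutdown();
    }
}
```

The trade-off in this pattern is that failures inside the scheduled task no longer surface on the calling thread, so they have to be observed through the returned future or logged inside the task.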

Check List

  • [Yes] Functionality includes testing.
    • Tested by ingesting docs and checking that they remain searchable from both primary and replica shards.
    • Tested by checking that the primary and replica sizes become the same after doc ingestion.
  • [No] API changes companion pull request
    • Not applicable: this change is internal implementation only and doesn't affect public APIs.
  • [Yes] Public documentation issue/PR [created](ToDo: block fetch doc?)

Potential Risks

  • Monitor scenarios where multiple calls hit runAfterRefreshWithPermit within a very short span of time.
  • Monitor the segment upload metrics for any anomalies caused by this change.
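
To make the first risk more concrete, here is a minimal sketch of what can happen when a second call arrives while an earlier one still holds a permit. It assumes a single-permit `java.util.concurrent.Semaphore` guard; the class `PermitGuardSketch` and its methods are hypothetical and do not reflect how runAfterRefreshWithPermit is actually implemented. The overlapping call is simulated by calling the guard from inside a running task.

```java
import java.util.concurrent.Semaphore;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical guard: at most one task runs at a time, overlapping calls are counted and skipped.
public class PermitGuardSketch {
    private final Semaphore permit = new Semaphore(1);
    private final AtomicInteger skipped = new AtomicInteger();

    // Returns true if the task ran, false if another task already held the permit.
    public boolean runWithPermit(Runnable task) {
        if (!permit.tryAcquire()) {
            skipped.incrementAndGet();
            return false;
        }
        try {
            task.run();
            return true;
        } finally {
            permit.release();
        }
    }

    public int skippedCount() {
        return skipped.get();
    }

    public static void main(String[] args) {
        PermitGuardSketch guard = new PermitGuardSketch();
        boolean[] nestedRan = new boolean[1];

        // The inner call arrives while the outer task still holds the permit.
        boolean outerRan = guard.runWithPermit(() -> nestedRan[0] = guard.runWithPermit(() -> {}));

        System.out.println("outer ran: " + outerRan);
        System.out.println("overlapping call ran: " + nestedRan[0]);
        System.out.println("skipped calls: " + guard.skippedCount());
    }
}
```

A counter like `skippedCount` is one possible signal to watch when checking whether bursts of calls are being dropped or delayed.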

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

@github-actions
Contributor

❌ Gradle check result for d09228f: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

@ashking94
Member

❌ Gradle check result for d09228f: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

I see 285 test failures. Can we fix them, please?

@ashking94
Member

Once the tests are fixed, let's rerun the previously failing ones locally for around 1k iterations to confirm.

@github-actions
Contributor

❌ Gradle check result for d09228f: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

Signed-off-by: Kamal Nayan <askkamal@amazon.com>
@github-actions
Contributor

github-actions bot commented Aug 7, 2025

❌ Gradle check result for 73f116a: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

Signed-off-by: Kamal Nayan <askkamal@amazon.com>
@github-actions
Contributor

github-actions bot commented Aug 7, 2025

❌ Gradle check result for 5c6edd1: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

Signed-off-by: Kamal Nayan <askkamal@amazon.com>
@github-actions
Contributor

github-actions bot commented Aug 7, 2025

❌ Gradle check result for 15c0b0c: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

Signed-off-by: Kamal Nayan <askkamal@amazon.com>
@github-actions
Contributor

github-actions bot commented Aug 7, 2025

❌ Gradle check result for f3c4ff5: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

Signed-off-by: Kamal Nayan <askkamal@amazon.com>
@github-actions
Contributor

github-actions bot commented Aug 8, 2025

❌ Gradle check result for 4d4cb08: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

Signed-off-by: Kamal Nayan <askkamal@amazon.com>
@github-actions
Contributor

❌ Gradle check result for 1b24166: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

Signed-off-by: Kamal Nayan <askkamal@amazon.com>
@github-actions
Contributor

❌ Gradle check result for bdd0e36: null

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

Signed-off-by: Kamal Nayan <askkamal@amazon.com>
@github-actions
Contributor

❌ Gradle check result for 369e97a: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

Signed-off-by: Kamal Nayan <askkamal@amazon.com>
@github-actions
Contributor

❌ Gradle check result for 4424311: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

@opensearch-trigger-bot
Contributor

This PR is stalled because it has been open for 30 days with no activity.

@opensearch-trigger-bot opensearch-trigger-bot bot added the stalled Issues that have stalled label Sep 14, 2025

Labels

stalled Issues that have stalled

2 participants