[Segment Replication] Add new segrep settings #6523

Merged

Conversation

dreamer-89
Member

@dreamer-89 dreamer-89 commented Mar 1, 2023

Description

Updates PrimaryShardReplicationSource to use a 30-minute timeout, which is the running time allowed for a segrep activity. This replaces the existing hard-coded constant that defines how many bytes per minute a host can process (the default was calculated for an m5 machine). The other option is to expose this value as a dynamic setting, but choosing a default is then challenging and puts the onus on the end user. For simplicity, this change uses the maximum available timeout value for fetching the files.

This change introduces two new settings for SegmentReplication inside RecoverySettings, keeping existing defaults.
1. Use a new dynamic setting for the maximum bytes the source can process per minute. This is used in PrimaryShardReplicationSource to set the transport request timeout for getting segment files from the source. A setting is preferable to a hard-coded constant because it allows the user to choose the right value for their hardware type.
2. Use a new dynamic segment replication activity timeout setting. When a replication activity is idle for this time period, ReplicationMonitor removes the stale replication and fails the shard. The setting keeps the existing default value of 30 minutes.
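
To make the two settings concrete, here is a minimal sketch of the arithmetic they imply. It is not the code from this PR: the class `SegrepTimeoutSketch` and the methods `fileFetchTimeout` and `isStale` are hypothetical names chosen for illustration, and the code assumes only that the request timeout scales with the bytes to copy and that an activity counts as stale once it has been idle longer than the activity timeout.

```java
import java.time.Duration;

// Hypothetical helper for illustration only; not the OpenSearch implementation.
final class SegrepTimeoutSketch {

    // Timeout for fetching files, scaled by how many bytes the source is
    // assumed to process per minute. Partial minutes round up, and the
    // result is never shorter than one minute.
    static Duration fileFetchTimeout(long totalBytes, long maxBytesPerMinute) {
        if (totalBytes < 0) {
            throw new IllegalArgumentException("totalBytes must be >= 0");
        }
        if (maxBytesPerMinute <= 0) {
            throw new IllegalArgumentException("maxBytesPerMinute must be > 0");
        }
        long minutes = totalBytes / maxBytesPerMinute;
        if (totalBytes % maxBytesPerMinute != 0) {
            minutes++;
        }
        return Duration.ofMinutes(Math.max(1L, minutes));
    }

    // True when an activity has been idle for longer than the activity timeout.
    // Both timestamps are expected to come from the same monotonic clock,
    // for example System.nanoTime().
    static boolean isStale(long lastAccessNanos, long nowNanos, Duration activityTimeout) {
        return nowNanos - lastAccessNanos > activityTimeout.toNanos();
    }
}
```

With a scaled timeout, the value chosen for `maxBytesPerMinute` decides whether a large copy on slow hardware times out early, which is why a sensible default is hard to pick.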

Issues Resolved

#6027

Check List

  • New functionality includes testing.
    • All tests pass
  • New functionality has been documented.
    • New functionality has javadoc added
  • Commits are signed per the DCO using --signoff
  • Commit changes are listed out in CHANGELOG.md file (See: Changelog)

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

@github-actions
Contributor

github-actions bot commented Mar 1, 2023

Gradle Check (Jenkins) Run Completed with:

@mch2
Member

mch2 commented Mar 1, 2023

@dreamer-89 After thinking on this more, do we really need new settings here? I think we may be able to simplify this by reusing the Recovery setting INDICES_RECOVERY_INTERNAL_LONG_ACTION_TIMEOUT_SETTING that sets timeout to 30m by default.

SEGREP_MAX_BYTES_PROCESSED_PER_MINUTE_SETTING - This is a bit odd to me as it will depend on the hardware a user has, which makes it difficult to set a default. I know this was intended to scale the timeout with the amount of bytes being sent, but I think simplifying to a single hard timeout is easier to reason about.
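
As a rough illustration of the "single hard timeout" idea, the sketch below keeps one updatable timeout with a 30-minute default and returns it regardless of transfer size. The class `FixedTimeoutSketch` and its methods are hypothetical; the sketch does not use OpenSearch's `Setting` API or the actual INDICES_RECOVERY_INTERNAL_LONG_ACTION_TIMEOUT_SETTING wiring.

```java
import java.time.Duration;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical holder for illustration only; not the OpenSearch implementation.
final class FixedTimeoutSketch {

    // Default taken from the 30m figure mentioned in the comment above.
    static final Duration DEFAULT_TIMEOUT = Duration.ofMinutes(30);

    private final AtomicReference<Duration> timeout = new AtomicReference<>(DEFAULT_TIMEOUT);

    // Stands in for a dynamic settings update at runtime.
    void update(Duration newTimeout) {
        if (newTimeout == null || newTimeout.isZero() || newTimeout.isNegative()) {
            throw new IllegalArgumentException("timeout must be positive");
        }
        timeout.set(newTimeout);
    }

    // The same timeout applies to every request; totalBytes is deliberately
    // ignored, so no per-hardware throughput estimate is needed.
    Duration timeoutFor(long totalBytes) {
        return timeout.get();
    }
}
```

The trade-off is that a very small transfer gets the same generous limit as a large one, in exchange for not having to estimate throughput per hardware type.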

Signed-off-by: Suraj Singh <surajrider@gmail.com>
@github-actions
Contributor

github-actions bot commented Mar 1, 2023

Gradle Check (Jenkins) Run Completed with:

@codecov-commenter

Codecov Report

Merging #6523 (08ce58b) into main (fa8937b) will increase coverage by 0.07%.
The diff coverage is 100.00%.

@@             Coverage Diff              @@
##               main    #6523      +/-   ##
============================================
+ Coverage     70.67%   70.74%   +0.07%     
- Complexity    59047    59058      +11     
============================================
  Files          4802     4802              
  Lines        282970   282973       +3     
  Branches      40793    40793              
============================================
+ Hits         199982   200185     +203     
+ Misses        66565    66340     -225     
- Partials      16423    16448      +25     
Impacted Files Coverage Δ
...ces/replication/PrimaryShardReplicationSource.java 96.00% <100.00%> (-0.30%) ⬇️
.../opensearch/test/transport/CapturingTransport.java 100.00% <100.00%> (ø)
...a/org/opensearch/test/transport/MockTransport.java 86.79% <100.00%> (+0.51%) ⬆️
...n/indices/forcemerge/ForceMergeRequestBuilder.java 0.00% <0.00%> (-75.00%) ⬇️
...h/action/ingest/SimulateDocumentVerboseResult.java 60.71% <0.00%> (-39.29%) ⬇️
...cluster/coordination/PublishClusterStateStats.java 33.33% <0.00%> (-37.51%) ⬇️
.../indices/forcemerge/TransportForceMergeAction.java 25.00% <0.00%> (-33.34%) ⬇️
...regations/metrics/AbstractHyperLogLogPlusPlus.java 63.79% <0.00%> (-32.76%) ⬇️
...rc/main/java/org/opensearch/ingest/IngestInfo.java 51.72% <0.00%> (-27.59%) ⬇️
...ndex/seqno/RetentionLeaseBackgroundSyncAction.java 37.50% <0.00%> (-25.00%) ⬇️
... and 503 more

@dreamer-89 dreamer-89 merged commit 7b96c92 into opensearch-project:main Mar 2, 2023
@dreamer-89 dreamer-89 added the backport 2.x Backport to 2.x branch label Mar 2, 2023
@opensearch-trigger-bot
Contributor

The backport to 2.x failed:

The process '/usr/bin/git' failed with exit code 128

To backport manually, run these commands in your terminal:

# Fetch latest updates from GitHub
git fetch
# Create a new working tree
git worktree add ../.worktrees/backport-2.x 2.x
# Navigate to the new working tree
pushd ../.worktrees/backport-2.x
# Create a new branch
git switch --create backport/backport-6523-to-2.x
# Cherry-pick the merged commit of this pull request and resolve the conflicts
git cherry-pick -x --mainline 1 7b96c92522f6cb798e8f02b6a5ff796bb6078a70
# Push it to GitHub
git push --set-upstream origin backport/backport-6523-to-2.x
# Go back to the original working tree
popd
# Delete the working tree
git worktree remove ../.worktrees/backport-2.x

Then, create a pull request where the base branch is 2.x and the compare/head branch is backport/backport-6523-to-2.x.

dreamer-89 added a commit to dreamer-89/OpenSearch that referenced this pull request Mar 2, 2023
* [Segment Replication] Add segrep settings

Signed-off-by: Suraj Singh <surajrider@gmail.com>

* Spotless fix

Signed-off-by: Suraj Singh <surajrider@gmail.com>

* Address review comments

Signed-off-by: Suraj Singh <surajrider@gmail.com>

---------

Signed-off-by: Suraj Singh <surajrider@gmail.com>
dreamer-89 added a commit that referenced this pull request Mar 2, 2023
* [Segment Replication] Add segrep settings

* Spotless fix

* Address review comments

---------

Signed-off-by: Suraj Singh <surajrider@gmail.com>
mingshl pushed a commit to mingshl/OpenSearch-Mingshl that referenced this pull request Mar 24, 2023
* [Segment Replication] Add segrep settings

Signed-off-by: Suraj Singh <surajrider@gmail.com>

* Spotless fix

Signed-off-by: Suraj Singh <surajrider@gmail.com>

* Address review comments

Signed-off-by: Suraj Singh <surajrider@gmail.com>

---------

Signed-off-by: Suraj Singh <surajrider@gmail.com>
Signed-off-by: Mingshi Liu <mingshl@amazon.com>
Labels
backport 2.x (Backport to 2.x branch), skip-changelog

3 participants