Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Configurable merge policy and option to choose between LogByteSize and Tiered MergePolicy #9992

Merged
merged 9 commits into from
Oct 2, 2023

Conversation

rishabhmaurya
Copy link
Contributor

@rishabhmaurya rishabhmaurya commented Sep 12, 2023

Description

#9241
currently due to lack of appropriate benchmark, we don't have a promising numbers to show the benefits. Thus keeping tiered merge policy as default until we have enough confidence on the change to make it as a default for data streams.

Related Issues

Resolves #[Issue number to be closed when this PR is merged]

Check List

  • New functionality includes testing.
    • All tests pass
  • New functionality has been documented.
    • New functionality has javadoc added
  • Commits are signed per the DCO using --signoff
  • Commit changes are listed out in CHANGELOG.md file (See: Changelog)

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

@github-actions
Copy link
Contributor

github-actions bot commented Sep 12, 2023

Compatibility status:

Checks if related components are compatible with change 7caa621

Incompatible components

Skipped components

Compatible components

Compatible components: [https://github.com/opensearch-project/security.git, https://github.com/opensearch-project/alerting.git, https://github.com/opensearch-project/index-management.git, https://github.com/opensearch-project/anomaly-detection.git, https://github.com/opensearch-project/sql.git, https://github.com/opensearch-project/job-scheduler.git, https://github.com/opensearch-project/asynchronous-search.git, https://github.com/opensearch-project/observability.git, https://github.com/opensearch-project/common-utils.git, https://github.com/opensearch-project/k-nn.git, https://github.com/opensearch-project/reporting.git, https://github.com/opensearch-project/security-analytics.git, https://github.com/opensearch-project/custom-codecs.git, https://github.com/opensearch-project/cross-cluster-replication.git, https://github.com/opensearch-project/opensearch-oci-object-storage.git, https://github.com/opensearch-project/performance-analyzer.git, https://github.com/opensearch-project/performance-analyzer-rca.git, https://github.com/opensearch-project/ml-commons.git, https://github.com/opensearch-project/geospatial.git, https://github.com/opensearch-project/notifications.git, https://github.com/opensearch-project/neural-search.git]

@github-actions
Copy link
Contributor

Gradle Check (Jenkins) Run Completed with:

@codecov
Copy link

codecov bot commented Sep 12, 2023

Codecov Report

Merging #9992 (13fc7cc) into main (e156582) will decrease coverage by 0.07%.
Report is 1 commits behind head on main.
The diff coverage is 82.53%.

❗ Current head 13fc7cc differs from pull request most recent head 7caa621. Consider uploading reports for the commit 7caa621 to get more accurate results

@@             Coverage Diff              @@
##               main    #9992      +/-   ##
============================================
- Coverage     71.18%   71.12%   -0.07%     
+ Complexity    58241    58220      -21     
============================================
  Files          4830     4831       +1     
  Lines        274592   274598       +6     
  Branches      40020    40008      -12     
============================================
- Hits         195469   195296     -173     
- Misses        62701    62959     +258     
+ Partials      16422    16343      -79     
Files Coverage Δ
...rg/opensearch/common/settings/ClusterSettings.java 92.85% <ø> (ø)
...pensearch/common/settings/IndexScopedSettings.java 100.00% <ø> (ø)
...in/java/org/opensearch/index/shard/IndexShard.java 69.92% <100.00%> (+0.40%) ⬆️
.../main/java/org/opensearch/index/IndexSettings.java 86.78% <89.83%> (+0.36%) ⬆️
...rg/opensearch/index/TieredMergePolicyProvider.java 73.68% <70.83%> (ø)
...ensearch/index/LogByteSizeMergePolicyProvider.java 78.57% <78.57%> (ø)

... and 453 files with indirect coverage changes

@rishabhmaurya rishabhmaurya force-pushed the rishma-logmerge-temp branch 2 times, most recently from bd49090 to 3fa195a Compare September 13, 2023 17:37
@github-actions
Copy link
Contributor

Gradle Check (Jenkins) Run Completed with:

Signed-off-by: Rishabh Maurya <rishabhmaurya05@gmail.com>
Signed-off-by: Rishabh Maurya <rishabhmaurya05@gmail.com>
Signed-off-by: Rishabh Maurya <rishabhmaurya05@gmail.com>
Signed-off-by: Rishabh Maurya <rishabhmaurya05@gmail.com>
Signed-off-by: Rishabh Maurya <rishabhmaurya05@gmail.com>
Signed-off-by: Rishabh Maurya <rishabhmaurya05@gmail.com>
Signed-off-by: Rishabh Maurya <rishabhmaurya05@gmail.com>
@reta
Copy link
Collaborator

reta commented Oct 2, 2023

@msfroh good to go?

@github-actions
Copy link
Contributor

github-actions bot commented Oct 2, 2023

Gradle Check (Jenkins) Run Completed with:

@msfroh msfroh merged commit fa66beb into opensearch-project:main Oct 2, 2023
@reta reta added the backport 2.x Backport to 2.x branch label Oct 2, 2023
@opensearch-trigger-bot
Copy link
Contributor

The backport to 2.x failed:

The process '/usr/bin/git' failed with exit code 128

To backport manually, run these commands in your terminal:

# Navigate to the root of your repository
cd $(git rev-parse --show-toplevel)
# Fetch latest updates from GitHub
git fetch
# Create a new working tree
git worktree add ../.worktrees/OpenSearch/backport-2.x 2.x
# Navigate to the new working tree
pushd ../.worktrees/OpenSearch/backport-2.x
# Create a new branch
git switch --create backport/backport-9992-to-2.x
# Cherry-pick the merged commit of this pull request and resolve the conflicts
git cherry-pick -x --mainline 1 fa66bebf88d6fc69765d5ecb26930ece94477024
# Push it to GitHub
git push --set-upstream origin backport/backport-9992-to-2.x
# Go back to the original working tree
popd
# Delete the working tree
git worktree remove ../.worktrees/OpenSearch/backport-2.x

Then, create a pull request where the base branch is 2.x and the compare/head branch is backport/backport-9992-to-2.x.

@rishabhmaurya
Copy link
Contributor Author

thanks @reta and @msfroh. Working on backport manually

rishabhmaurya added a commit to rishabhmaurya/OpenSearch that referenced this pull request Oct 2, 2023
…roject#9992)

* Configurable merge policy for index

* additional setting to configure merge policy for timestamp based index
* introduction of logbytesize merge policy as an option

Signed-off-by: Rishabh Maurya <rishabhmaurya05@gmail.com>

* remove the trace log not required anymore

Signed-off-by: Rishabh Maurya <rishabhmaurya05@gmail.com>

* Refactor the merge policy extraction logic

Signed-off-by: Rishabh Maurya <rishabhmaurya05@gmail.com>

* Rename constant DEFAULT to DEFAULT_POLICY

Signed-off-by: Rishabh Maurya <rishabhmaurya05@gmail.com>

* Simplify merge policy extraction and selection logic

Signed-off-by: Rishabh Maurya <rishabhmaurya05@gmail.com>

* missing javadoc error

Signed-off-by: Rishabh Maurya <rishabhmaurya05@gmail.com>

* Renaming log byte size policy setting with mb

Signed-off-by: Rishabh Maurya <rishabhmaurya05@gmail.com>

* Move validation exception to enum from setting defn

Signed-off-by: Rishabh Maurya <rishabhmaurya05@gmail.com>

* rename time_index to time_series_index

Signed-off-by: Rishabh Maurya <rishabhmaurya05@gmail.com>

---------

Signed-off-by: Rishabh Maurya <rishabhmaurya05@gmail.com>
(cherry picked from commit fa66beb)
@rishabhmaurya
Copy link
Contributor Author

2.x backport PR - #10312

@rishabhmaurya rishabhmaurya changed the title Use of LogByteSizeMergePolicy for data stream use cases Configurable merge policy and option to choose between LogByteSize and Tiered MergePolicy Oct 2, 2023
@reta
Copy link
Collaborator

reta commented Oct 3, 2023

@rishabhmaurya please create a documentation issue / section for this new setting(s) here , github.com/Opensearch-project/documentation-website/, we'll need it for 2.11 release, thank you.

@rishabhmaurya
Copy link
Contributor Author

@reta I have the issue opened for documentation opensearch-project/documentation-website#5119 thank you.

rishabhmaurya added a commit to rishabhmaurya/OpenSearch that referenced this pull request Oct 3, 2023
…roject#9992)

* Configurable merge policy for index

* additional setting to configure merge policy for timestamp based index
* introduction of logbytesize merge policy as an option

Signed-off-by: Rishabh Maurya <rishabhmaurya05@gmail.com>

* remove the trace log not required anymore

Signed-off-by: Rishabh Maurya <rishabhmaurya05@gmail.com>

* Refactor the merge policy extraction logic

Signed-off-by: Rishabh Maurya <rishabhmaurya05@gmail.com>

* Rename constant DEFAULT to DEFAULT_POLICY

Signed-off-by: Rishabh Maurya <rishabhmaurya05@gmail.com>

* Simplify merge policy extraction and selection logic

Signed-off-by: Rishabh Maurya <rishabhmaurya05@gmail.com>

* missing javadoc error

Signed-off-by: Rishabh Maurya <rishabhmaurya05@gmail.com>

* Renaming log byte size policy setting with mb

Signed-off-by: Rishabh Maurya <rishabhmaurya05@gmail.com>

* Move validation exception to enum from setting defn

Signed-off-by: Rishabh Maurya <rishabhmaurya05@gmail.com>

* rename time_index to time_series_index

Signed-off-by: Rishabh Maurya <rishabhmaurya05@gmail.com>

---------

Signed-off-by: Rishabh Maurya <rishabhmaurya05@gmail.com>
(cherry picked from commit fa66beb)
rishabhmaurya added a commit to rishabhmaurya/OpenSearch that referenced this pull request Oct 3, 2023
…roject#9992)

* Configurable merge policy for index

* additional setting to configure merge policy for timestamp based index
* introduction of logbytesize merge policy as an option

Signed-off-by: Rishabh Maurya <rishabhmaurya05@gmail.com>

* remove the trace log not required anymore

Signed-off-by: Rishabh Maurya <rishabhmaurya05@gmail.com>

* Refactor the merge policy extraction logic

Signed-off-by: Rishabh Maurya <rishabhmaurya05@gmail.com>

* Rename constant DEFAULT to DEFAULT_POLICY

Signed-off-by: Rishabh Maurya <rishabhmaurya05@gmail.com>

* Simplify merge policy extraction and selection logic

Signed-off-by: Rishabh Maurya <rishabhmaurya05@gmail.com>

* missing javadoc error

Signed-off-by: Rishabh Maurya <rishabhmaurya05@gmail.com>

* Renaming log byte size policy setting with mb

Signed-off-by: Rishabh Maurya <rishabhmaurya05@gmail.com>

* Move validation exception to enum from setting defn

Signed-off-by: Rishabh Maurya <rishabhmaurya05@gmail.com>

* rename time_index to time_series_index

Signed-off-by: Rishabh Maurya <rishabhmaurya05@gmail.com>

---------

Signed-off-by: Rishabh Maurya <rishabhmaurya05@gmail.com>
(cherry picked from commit fa66beb)
rayshrey pushed a commit to rayshrey/OpenSearch that referenced this pull request Oct 3, 2023
…roject#9992)

* Configurable merge policy for index

* additional setting to configure merge policy for timestamp based index
* introduction of logbytesize merge policy as an option

Signed-off-by: Rishabh Maurya <rishabhmaurya05@gmail.com>

* remove the trace log not required anymore

Signed-off-by: Rishabh Maurya <rishabhmaurya05@gmail.com>

* Refactor the merge policy extraction logic

Signed-off-by: Rishabh Maurya <rishabhmaurya05@gmail.com>

* Rename constant DEFAULT to DEFAULT_POLICY

Signed-off-by: Rishabh Maurya <rishabhmaurya05@gmail.com>

* Simplify merge policy extraction and selection logic

Signed-off-by: Rishabh Maurya <rishabhmaurya05@gmail.com>

* missing javadoc error

Signed-off-by: Rishabh Maurya <rishabhmaurya05@gmail.com>

* Renaming log byte size policy setting with mb

Signed-off-by: Rishabh Maurya <rishabhmaurya05@gmail.com>

* Move validation exception to enum from setting defn

Signed-off-by: Rishabh Maurya <rishabhmaurya05@gmail.com>

* rename time_index to time_series_index

Signed-off-by: Rishabh Maurya <rishabhmaurya05@gmail.com>

---------

Signed-off-by: Rishabh Maurya <rishabhmaurya05@gmail.com>
reta pushed a commit that referenced this pull request Oct 3, 2023
* Configurable merge policy for index

* additional setting to configure merge policy for timestamp based index
* introduction of logbytesize merge policy as an option

Signed-off-by: Rishabh Maurya <rishabhmaurya05@gmail.com>

* remove the trace log not required anymore

Signed-off-by: Rishabh Maurya <rishabhmaurya05@gmail.com>

* Refactor the merge policy extraction logic

Signed-off-by: Rishabh Maurya <rishabhmaurya05@gmail.com>

* Rename constant DEFAULT to DEFAULT_POLICY

Signed-off-by: Rishabh Maurya <rishabhmaurya05@gmail.com>

* Simplify merge policy extraction and selection logic

Signed-off-by: Rishabh Maurya <rishabhmaurya05@gmail.com>

* missing javadoc error

Signed-off-by: Rishabh Maurya <rishabhmaurya05@gmail.com>

* Renaming log byte size policy setting with mb

Signed-off-by: Rishabh Maurya <rishabhmaurya05@gmail.com>

* Move validation exception to enum from setting defn

Signed-off-by: Rishabh Maurya <rishabhmaurya05@gmail.com>

* rename time_index to time_series_index

Signed-off-by: Rishabh Maurya <rishabhmaurya05@gmail.com>

---------

Signed-off-by: Rishabh Maurya <rishabhmaurya05@gmail.com>
(cherry picked from commit fa66beb)
deshsidd pushed a commit to deshsidd/OpenSearch that referenced this pull request Oct 9, 2023
…roject#9992)

* Configurable merge policy for index

* additional setting to configure merge policy for timestamp based index
* introduction of logbytesize merge policy as an option

Signed-off-by: Rishabh Maurya <rishabhmaurya05@gmail.com>

* remove the trace log not required anymore

Signed-off-by: Rishabh Maurya <rishabhmaurya05@gmail.com>

* Refactor the merge policy extraction logic

Signed-off-by: Rishabh Maurya <rishabhmaurya05@gmail.com>

* Rename constant DEFAULT to DEFAULT_POLICY

Signed-off-by: Rishabh Maurya <rishabhmaurya05@gmail.com>

* Simplify merge policy extraction and selection logic

Signed-off-by: Rishabh Maurya <rishabhmaurya05@gmail.com>

* missing javadoc error

Signed-off-by: Rishabh Maurya <rishabhmaurya05@gmail.com>

* Renaming log byte size policy setting with mb

Signed-off-by: Rishabh Maurya <rishabhmaurya05@gmail.com>

* Move validation exception to enum from setting defn

Signed-off-by: Rishabh Maurya <rishabhmaurya05@gmail.com>

* rename time_index to time_series_index

Signed-off-by: Rishabh Maurya <rishabhmaurya05@gmail.com>

---------

Signed-off-by: Rishabh Maurya <rishabhmaurya05@gmail.com>
vikasvb90 pushed a commit to vikasvb90/OpenSearch that referenced this pull request Oct 10, 2023
…roject#9992)

* Configurable merge policy for index

* additional setting to configure merge policy for timestamp based index
* introduction of logbytesize merge policy as an option

Signed-off-by: Rishabh Maurya <rishabhmaurya05@gmail.com>

* remove the trace log not required anymore

Signed-off-by: Rishabh Maurya <rishabhmaurya05@gmail.com>

* Refactor the merge policy extraction logic

Signed-off-by: Rishabh Maurya <rishabhmaurya05@gmail.com>

* Rename constant DEFAULT to DEFAULT_POLICY

Signed-off-by: Rishabh Maurya <rishabhmaurya05@gmail.com>

* Simplify merge policy extraction and selection logic

Signed-off-by: Rishabh Maurya <rishabhmaurya05@gmail.com>

* missing javadoc error

Signed-off-by: Rishabh Maurya <rishabhmaurya05@gmail.com>

* Renaming log byte size policy setting with mb

Signed-off-by: Rishabh Maurya <rishabhmaurya05@gmail.com>

* Move validation exception to enum from setting defn

Signed-off-by: Rishabh Maurya <rishabhmaurya05@gmail.com>

* rename time_index to time_series_index

Signed-off-by: Rishabh Maurya <rishabhmaurya05@gmail.com>

---------

Signed-off-by: Rishabh Maurya <rishabhmaurya05@gmail.com>
shiv0408 pushed a commit to Gaurav614/OpenSearch that referenced this pull request Apr 25, 2024
…roject#9992)

* Configurable merge policy for index

* additional setting to configure merge policy for timestamp based index
* introduction of logbytesize merge policy as an option

Signed-off-by: Rishabh Maurya <rishabhmaurya05@gmail.com>

* remove the trace log not required anymore

Signed-off-by: Rishabh Maurya <rishabhmaurya05@gmail.com>

* Refactor the merge policy extraction logic

Signed-off-by: Rishabh Maurya <rishabhmaurya05@gmail.com>

* Rename constant DEFAULT to DEFAULT_POLICY

Signed-off-by: Rishabh Maurya <rishabhmaurya05@gmail.com>

* Simplify merge policy extraction and selection logic

Signed-off-by: Rishabh Maurya <rishabhmaurya05@gmail.com>

* missing javadoc error

Signed-off-by: Rishabh Maurya <rishabhmaurya05@gmail.com>

* Renaming log byte size policy setting with mb

Signed-off-by: Rishabh Maurya <rishabhmaurya05@gmail.com>

* Move validation exception to enum from setting defn

Signed-off-by: Rishabh Maurya <rishabhmaurya05@gmail.com>

* rename time_index to time_series_index

Signed-off-by: Rishabh Maurya <rishabhmaurya05@gmail.com>

---------

Signed-off-by: Rishabh Maurya <rishabhmaurya05@gmail.com>
Signed-off-by: Shivansh Arora <hishiv@amazon.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backport 2.x Backport to 2.x branch backport-failed
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants