Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refresh OpenSearch nodes version in cluster state after upgrade #865

Conversation

shwetathareja
Copy link
Member

@shwetathareja shwetathareja commented Jun 21, 2021

Description

PR for fixing the issue #827

Fix:

  1. As soon as OpenSearch node becomes master, it refreshes the nodes version in cluster state by comparing against their channel version. Right now, it doesn't wait for all Elasticsearch nodes to leave the cluster before refreshing the version.
  2. There was bug in DiscoveryNode BWC serialization logic where any ES version < 7.10.2 was getting over-written with 7.10.2

Issues Resolved

#827

Check List

  • New functionality includes testing.
    • All tests pass
  • New functionality has been documented.
    • New functionality has javadoc added
  • Commits are signed per the DCO using --signoff

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

@opensearch-ci-bot
Copy link
Collaborator

❌   DCO Check Failed 1df4237eb82a3374ba42478347b9bcb78cff9b05
Run ./dev-tools/signoff-check.sh remotes/origin/main 1df4237eb82a3374ba42478347b9bcb78cff9b05 to check locally
Use git commit with -s to add 'Signed-of-by: {EMAIL}' on impacted commits

@opensearch-ci-bot
Copy link
Collaborator

✅   Gradle Wrapper Validation success 1df4237eb82a3374ba42478347b9bcb78cff9b05

@opensearch-ci-bot
Copy link
Collaborator

❌   Gradle Precommit failure 1df4237eb82a3374ba42478347b9bcb78cff9b05
Log 625

@dblock
Copy link
Member

dblock commented Jun 21, 2021

I like it.

@nknize nknize added bug Something isn't working v1.0.0 Version 1.0.0 rolling-upgrade Issues related to rolling upgrades labels Jun 24, 2021
Copy link
Collaborator

@nknize nknize left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 on the approach so far. Build fails due to simple compile issue. Will review further after integration and unit tests are added and we can merge in a 1.0.0 patch release since this will likely not make it in time for GA.

@nknize
Copy link
Collaborator

nknize commented Jun 28, 2021

@shwetathareja Any update on getting Integration and Unit tests done? Do you think you could get this in for GA? If not we will need to get a bugfix release out pretty quickly after GA.

@CEHENKLE
Copy link
Member

@shwetathareja Hey -- we're running out of runway for 1.0.0 and this seems like something we'd want to get in. Can you get your tests in?

@shwetathareja
Copy link
Member Author

@nknize @CEHENKLE sorry for the delay, working on it, will raise it in next couple of hours.

@opensearch-ci-bot
Copy link
Collaborator

❌   DCO Check Failed 54ef87927cffb5a70edd52d2fc4de97e6d928a9d
Run ./dev-tools/signoff-check.sh remotes/origin/main 54ef87927cffb5a70edd52d2fc4de97e6d928a9d to check locally
Use git commit with -s to add 'Signed-of-by: {EMAIL}' on impacted commits

@opensearch-ci-bot
Copy link
Collaborator

✅   Gradle Wrapper Validation success 54ef87927cffb5a70edd52d2fc4de97e6d928a9d

@opensearch-ci-bot
Copy link
Collaborator

❌   Gradle Precommit failure 54ef87927cffb5a70edd52d2fc4de97e6d928a9d
Log 691

@CEHENKLE
Copy link
Member

CEHENKLE commented Jul 1, 2021

Thanks, @shwetathareja ! I'll look for it in our AM :)

@opensearch-ci-bot
Copy link
Collaborator

✅   Gradle Wrapper Validation success 12717a5455f85e062276ae861270155a078a7ea6

@opensearch-ci-bot
Copy link
Collaborator

❌   DCO Check Failed 12717a5455f85e062276ae861270155a078a7ea6
Run ./dev-tools/signoff-check.sh remotes/origin/main 12717a5455f85e062276ae861270155a078a7ea6 to check locally
Use git commit with -s to add 'Signed-of-by: {EMAIL}' on impacted commits

@opensearch-ci-bot
Copy link
Collaborator

✅   Gradle Wrapper Validation success 191133e0770b7629b263687d96b0ced3a6d72a0a

@opensearch-ci-bot
Copy link
Collaborator

❌   DCO Check Failed 191133e0770b7629b263687d96b0ced3a6d72a0a
Run ./dev-tools/signoff-check.sh remotes/origin/main 191133e0770b7629b263687d96b0ced3a6d72a0a to check locally
Use git commit with -s to add 'Signed-of-by: {EMAIL}' on impacted commits

@shwetathareja shwetathareja marked this pull request as ready for review July 1, 2021 10:34
@opensearch-ci-bot
Copy link
Collaborator

❌   Gradle Precommit failure 12717a5455f85e062276ae861270155a078a7ea6
Log 692

@opensearch-ci-bot
Copy link
Collaborator

❌   Gradle Precommit failure 191133e0770b7629b263687d96b0ced3a6d72a0a
Log 693

@opensearch-ci-bot
Copy link
Collaborator

✅   Gradle Wrapper Validation success 34e53070801c94aab59ba7e93262e398167f0ee2

@opensearch-ci-bot
Copy link
Collaborator

❌   DCO Check Failed 34e53070801c94aab59ba7e93262e398167f0ee2
Run ./dev-tools/signoff-check.sh remotes/origin/main 34e53070801c94aab59ba7e93262e398167f0ee2 to check locally
Use git commit with -s to add 'Signed-of-by: {EMAIL}' on impacted commits

@opensearch-ci-bot
Copy link
Collaborator

✅   Gradle Precommit success 34e53070801c94aab59ba7e93262e398167f0ee2

@shwetathareja
Copy link
Member Author

@nknize @adnapibar @dblock please review.

@shwetathareja
Copy link
Member Author

start gradle check

1 similar comment
@adnapibar
Copy link
Contributor

start gradle check

@opensearch-ci-bot
Copy link
Collaborator

✅   Gradle Check success 34e53070801c94aab59ba7e93262e398167f0ee2
Log 287

Reports 287

@adnapibar
Copy link
Contributor

The changes look good but I see the a few bwc tests for mixed cluster are failing (these pass in main)

./gradlew :qa:mixed-cluster:v7.10.2#mixedClusterTest

Tests with failures:
 - org.opensearch.backwards.MixedClusterClientYamlTestSuiteIT.test {p0=cat.health/10_basic/Empty cluster}
 - org.opensearch.backwards.MixedClusterClientYamlTestSuiteIT.test {p0=cat.health/10_basic/Help}
 - org.opensearch.backwards.MixedClusterClientYamlTestSuiteIT.test {p0=cat.health/10_basic/With ts parameter}

@adnapibar
Copy link
Contributor

adnapibar commented Jul 1, 2021

The changes look good but I see the a few bwc tests for mixed cluster are failing (these pass in main)

./gradlew :qa:mixed-cluster:v7.10.2#mixedClusterTest

Tests with failures:
 - org.opensearch.backwards.MixedClusterClientYamlTestSuiteIT.test {p0=cat.health/10_basic/Empty cluster}
 - org.opensearch.backwards.MixedClusterClientYamlTestSuiteIT.test {p0=cat.health/10_basic/Help}
 - org.opensearch.backwards.MixedClusterClientYamlTestSuiteIT.test {p0=cat.health/10_basic/With ts parameter}

Looks like the BWC tests have become flaky and I can also see these in the main branch after a rerun. I have created an issue to investigate and resolve these separately #922

Copy link
Contributor

@adnapibar adnapibar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM - couple of minor questions.
Also please fix the DCO check.

@@ -367,7 +367,7 @@ public void writeTo(StreamOutput out) throws IOException {
}
}
}
if (out.getVersion().before(Version.V_1_0_0)) {
if (out.getVersion().before(Version.V_1_0_0) && version.onOrAfter(Version.V_1_0_0)) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you please help me understand why we're adding this?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so, DiscoveryNode writeTo serialization method can be triggered from multiple places e.g. when cluster state is being sent from one node to another and BWC version should be set for only OpenSearch nodes. Without this check, it will mess with the version for Elasticsearch nodes in the cluster as well which ideally should be untouched.

// there is an edge case if doesn't send JoinRequest and connection is established,
// then it can continue to report version as 7.10.2 instead of actual OpenSearch version. So,
// removing the node from cluster state to prevent stale version reporting and let it reconnect.
if (node.getVersion().equals(LegacyESVersion.V_7_10_2)) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can it not be any other legacy version e.g. 6.8.x or 7.10.x which we are upgrading from?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We're only concerned about the OpenSearch nodes in BWC mode, any other ES node that is not yet upgraded should run in mixed cluster - is that correct understanding?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For other ES versions, the version in cluster state and channel version would always be same. So, there is no need for special handling. Its only for OpenSearch nodes which spoof to be 7.10.2 and we don't know if it is real 7.10.2 or OpenSearch node spoofing its version.

@CEHENKLE
Copy link
Member

CEHENKLE commented Jul 2, 2021

@shwetathareja Heya Shweta -- we're holding the build for this fix, so if you could fix your DCO, we're going slam the door on 1.0.0 behind you. So if you can fix that and merge when you get in, we should be good to go in our AM :)

Thanks,
/C

Signed-off-by: Shweta Thareja <tharejas@amazon.com>
@shwetathareja shwetathareja force-pushed the version-fix-cat-apis-after-upgrade branch from 34e5307 to c75f5e3 Compare July 2, 2021 03:48
@opensearch-ci-bot
Copy link
Collaborator

✅   Gradle Wrapper Validation success c75f5e3

@opensearch-ci-bot
Copy link
Collaborator

✅   DCO Check Passed c75f5e3

@opensearch-ci-bot
Copy link
Collaborator

✅   Gradle Precommit success c75f5e3

@adnapibar
Copy link
Contributor

start gradle check

@opensearch-ci-bot
Copy link
Collaborator

✅   Gradle Check success c75f5e3
Log 298

Reports 298

@CEHENKLE CEHENKLE merged commit 8082604 into opensearch-project:main Jul 2, 2021
tlfeng pushed a commit to tlfeng/OpenSearch that referenced this pull request Jul 2, 2021
…search-project#865)

Signed-off-by: Shweta Thareja <tharejas@amazon.com>

Co-authored-by: Shweta Thareja <tharejas@amazon.com>
adnapibar pushed a commit that referenced this pull request Jul 2, 2021
#926)

Signed-off-by: Shweta Thareja <tharejas@amazon.com>
Co-authored-by: shwetathareja <shwetathareja@live.com>
adnapibar pushed a commit that referenced this pull request Jul 2, 2021
#926)

Signed-off-by: Shweta Thareja <tharejas@amazon.com>
Co-authored-by: shwetathareja <shwetathareja@live.com>
@shwetathareja shwetathareja deleted the version-fix-cat-apis-after-upgrade branch July 11, 2021 09:54
dblock pushed a commit that referenced this pull request Nov 14, 2022
This change removes code that was written to support rolling upgrade from Elasticsearch 7.10.2 to OpenSearch 1.x, as part of PR #865

Signed-off-by: Rabi Panda <adnapibar@gmail.com>

Signed-off-by: Rabi Panda <adnapibar@gmail.com>
ritty27 pushed a commit to ritty27/OpenSearch that referenced this pull request May 12, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working rolling-upgrade Issues related to rolling upgrades v1.0.0 Version 1.0.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants