[Backport 2.x] Support dynamic node role #3585

Merged 1 commit on Jun 15, 2022

opensearch-trigger-bot[bot]
Contributor

Backport e9c5ce3 from #3436

* Support unknown node role

Currently OpenSearch supports only a fixed set of built-in node roles,
such as the data role. If an unknown node role is specified, the
OpenSearch node fails to start. This limits how OpenSearch can be
extended with new functionality. For example, a user may prefer to run
ML tasks on dedicated nodes that do not serve any built-in role, so that
the ML tasks do not impact core OpenSearch functions. This PR removes
the limitation: a user can specify any node role, and OpenSearch will
start the node correctly with that unknown role. This opens the door for
plugin developers to run specific tasks on dedicated nodes.

Issue: #2877

Signed-off-by: Yaliang Wu <ylwu@amazon.com>

* fix cat nodes rest API spec

Signed-off-by: Yaliang Wu <ylwu@amazon.com>

* fix mixed cluster IT failure

Signed-off-by: Yaliang Wu <ylwu@amazon.com>

* add DynamicRole

Signed-off-by: Yaliang Wu <ylwu@amazon.com>

* change generator method name

Signed-off-by: Yaliang Wu <ylwu@amazon.com>

* fix failed docker test

Signed-off-by: Yaliang Wu <ylwu@amazon.com>

* transform role name to lower case to avoid confusion

Signed-off-by: Yaliang Wu <ylwu@amazon.com>

* transform the node role abbreviation to lower case

Signed-off-by: Yaliang Wu <ylwu@amazon.com>

* fix checkstyle

Signed-off-by: Yaliang Wu <ylwu@amazon.com>

* add test for case-insensitive role name change

Signed-off-by: Yaliang Wu <ylwu@amazon.com>
(cherry picked from commit e9c5ce3)
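
The behavior described in the commit message (accept an unknown role name, lower-case it, and fall back to a dynamic role instead of failing startup) could be sketched roughly as below. This is only an illustration under stated assumptions, not the code from this PR: the class `NodeRoles`, the nested `Role` type, the `resolve` method, the two-entry built-in table, and the choice to reuse the lower-cased name as the abbreviation of a dynamic role are all hypothetical.

```java
import java.util.Locale;
import java.util.Map;

/** Illustrative sketch only; names and structure are hypothetical, not OpenSearch code. */
final class NodeRoles {

    /** A resolved role: its name, its abbreviation, and whether it is built in. */
    static final class Role {
        final String name;
        final String abbreviation;
        final boolean builtIn;

        Role(String name, String abbreviation, boolean builtIn) {
            this.name = name;
            this.abbreviation = abbreviation;
            this.builtIn = builtIn;
        }

        @Override
        public String toString() {
            return name + " (" + abbreviation + ", builtIn=" + builtIn + ")";
        }
    }

    // Hypothetical subset of built-in roles, keyed by lower-case role name.
    private static final Map<String, Role> BUILT_IN = Map.of(
        "data", new Role("data", "d", true),
        "ingest", new Role("ingest", "i", true)
    );

    private NodeRoles() {}

    /**
     * Resolves a configured role name. Known names map to a built-in role;
     * any other non-empty name yields a dynamic role rather than an error.
     */
    static Role resolve(String roleName) {
        // Lower-case with a fixed locale so "ML" and "ml" resolve to the same role.
        String normalized = roleName.trim().toLowerCase(Locale.ROOT);
        if (normalized.isEmpty()) {
            throw new IllegalArgumentException("role name must not be empty");
        }
        Role builtIn = BUILT_IN.get(normalized);
        if (builtIn != null) {
            return builtIn;
        }
        // Assumption: a dynamic role reuses its lower-cased name as abbreviation.
        return new Role(normalized, normalized, false);
    }

    public static void main(String[] args) {
        System.out.println(resolve("ML"));
        System.out.println(resolve("data"));
    }
}
```

With this shape, a plugin could check whether the local node carries its dedicated role (for example `ml`) before scheduling work there, while nodes configured with only built-in roles behave as before.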
opensearch-trigger-bot bot requested review from a team and reta as code owners on June 14, 2022 22:49
@opensearch-ci-bot
Collaborator

❌   Gradle Check failure 614cd62
Log 6012

Reports 6012

@tlfeng
Collaborator

tlfeng commented Jun 14, 2022

In log 6012:

> Task :plugins:repository-s3:yamlRestTest

REPRODUCE WITH: ./gradlew ':plugins:repository-s3:yamlRestTest' --tests "org.opensearch.repositories.s3.RepositoryS3ClientYamlTestSuiteIT.test {yaml=repository_s3/20_repository_permanent_credentials/Snapshot and Restore with repository-s3 using permanent credentials}" -Dtests.seed=654DF7AE6B584FC7 -Dtests.security.manager=true -Dtests.jvm.argline="-XX:TieredStopAtLevel=1 -XX:ReservedCodeCacheSize=64m" -Dtests.locale=ga -Dtests.timezone=Canada/Newfoundland -Druntime.java=17 -Dtests.rest.denylist=repository_s3/30_repository_temporary_credentials/*,repository_s3/40_repository_ec2_credentials/*,repository_s3/50_repository_ecs_credentials/*,repository_s3/60_repository_eks_credentials/*

org.opensearch.repositories.s3.RepositoryS3ClientYamlTestSuiteIT > test {yaml=repository_s3/20_repository_permanent_credentials/Snapshot and Restore with repository-s3 using permanent credentials} FAILED
    java.lang.AssertionError: Failure at [repository_s3/20_repository_permanent_credentials:250]: expected [2xx] status code but api [count] returned [503 Service Unavailable] [{"error":{"root_cause":[],"type":"search_phase_execution_exception","reason":"all shards failed","phase":"query","grouped":true,"failed_shards":[],"stack_trace":"Failed to execute phase [query], all shards failed\n\tat org.opensearch.action.search.AbstractSearchAsyncAction.onPhaseFailure(AbstractSearchAsyncAction.java:644)\n\tat org.opensearch.action.search.AbstractSearchAsyncAction.executeNextPhase(AbstractSearchAsyncAction.java:362)\n\tat org.opensearch.action.search.AbstractSearchAsyncAction.onPhaseDone(AbstractSearchAsyncAction.java:679)\n\tat org.opensearch.action.search.AbstractSearchAsyncAction.onShardFailure(AbstractSearchAsyncAction.java:459)\n\tat org.opensearch.action.search.AbstractSearchAsyncAction.lambda$performPhaseOnShard$0(AbstractSearchAsyncAction.java:272)\n\tat org.opensearch.action.search.AbstractSearchAsyncAction$2.doRun(AbstractSearchAsyncAction.java:340)\n\tat org.opensearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:52)\n\tat org.opensearch.common.util.concurrent.TimedRunnable.doRun(TimedRunnable.java:59)\n\tat org.opensearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:798)\n\tat org.opensearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:52)\n\tat java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1130)\n\tat java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:630)\n\tat java.base/java.lang.Thread.run(Thread.java:832)\n"},"status":503}]

tlfeng added the backport and v2.1.0 labels on Jun 14, 2022
@reta
Collaborator

reta commented Jun 15, 2022

start gradle check

@reta
Collaborator

reta commented Jun 15, 2022

Thanks @tlfeng, I will check it. The 60_repository_eks_credentials test suite was added recently.

@opensearch-ci-bot
Copy link
Collaborator

✅   Gradle Check success 614cd62
Log 6017

Reports 6017

kotwanikunal merged commit 00fb2b7 into 2.x on Jun 15, 2022
github-actions bot deleted the backport/backport-3436-to-2.x branch on June 15, 2022 05:41
@reta
Collaborator

reta commented Jun 15, 2022

Regarding "the 60_repository_eks_credentials test suite was added recently": this indeed seems to be just a random failure. The failing test suite was not a new one but an existing one.
