Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Awareness health api rest and transport layer changes #5694

Merged
merged 21 commits into from
Jan 10, 2023

Conversation

nishchay21
Copy link
Contributor

Description

Adding rest and transport level changes for below approved PR #5231

Issues Resolved

#5230.

Check List

  • New functionality includes testing.
    • All tests pass
  • New functionality has been documented.
    • New functionality has javadoc added
  • Commits are signed per the DCO using --signoff
  • Commit changes are listed out in CHANGELOG.md file (See: Changelog)

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

@github-actions
Copy link
Contributor

github-actions bot commented Jan 4, 2023

Gradle Check (Jenkins) Run Completed with:

@github-actions
Copy link
Contributor

github-actions bot commented Jan 4, 2023

Gradle Check (Jenkins) Run Completed with:

@codecov-commenter
Copy link

codecov-commenter commented Jan 4, 2023

Codecov Report

Merging #5694 (eaede6b) into main (28e9b11) will increase coverage by 0.05%.
The diff coverage is 44.76%.

@@             Coverage Diff              @@
##               main    #5694      +/-   ##
============================================
+ Coverage     70.98%   71.04%   +0.05%     
- Complexity    58669    58769     +100     
============================================
  Files          4763     4771       +8     
  Lines        279945   280114     +169     
  Branches      40418    40425       +7     
============================================
+ Hits         198731   199014     +283     
+ Misses        64988    64842     -146     
- Partials      16226    16258      +32     
Impacted Files Coverage Δ
...awarenesshealth/ClusterAwarenessHealthRequest.java 0.00% <0.00%> (ø)
...sshealth/ClusterAwarenessHealthRequestBuilder.java 0.00% <0.00%> (ø)
.../src/main/java/org/opensearch/client/Requests.java 46.42% <0.00%> (-0.85%) ⬇️
.../org/opensearch/client/support/AbstractClient.java 32.62% <0.00%> (+0.08%) ⬆️
...earch/cluster/block/IndexCreateBlockException.java 0.00% <0.00%> (ø)
...rg/opensearch/common/settings/ClusterSettings.java 91.89% <ø> (ø)
...a/org/opensearch/test/OpenSearchIntegTestCase.java 56.46% <0.00%> (-0.21%) ⬇️
...action/admin/cluster/settings/SettingsUpdater.java 91.17% <20.00%> (-5.65%) ⬇️
...shealth/TransportClusterAwarenessHealthAction.java 26.31% <26.31%> (ø)
...dmin/cluster/RestClusterAwarenessHealthAction.java 27.27% <27.27%> (ø)
... and 513 more

Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.

@gbbafna gbbafna requested a review from imRishN January 4, 2023 09:39
{
"cluster.awareness_attribute_health": {
"documentation": {
"url": "https://opensearch.org/docs/latest/opensearch/rest-api/awareness_attribute_health/",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need to open a documentation issue for this

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"url": "https://opensearch.org/docs/latest/opensearch/rest-api/awareness_attribute_health/",
"description": "Returns basic information about the health of the cluster aggregated per awareness attribute."
},
"stability": "experimental",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is it experimental? I don't see API is behind feature flag. Is there any more work pending on this action?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is discussed with gaurav as well . We will have this experimental and then GA later

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lets have a issue open to mark it stable later or the meta issues we are tracking?

Comment on lines 236 to 239
@Override
public RestStatus status() {
return RestStatus.OK;
}
Copy link
Member

@imRishN imRishN Jan 4, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the default behaviour anyways, right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That is correct

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Then lets remove this override?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is interface that is getting implemented we need to override the method

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ClusterHealthResponse also has functionality for timeouts -

@Override
    public RestStatus status() {
        return isTimedOut() ? RestStatus.REQUEST_TIMEOUT : RestStatus.OK;
    }

Do we need something similar?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can add that later if required. For now not a necessary parameter for API usage

ClusterAwarenessHealthRequest::new,
indexNameExpressionResolver
);

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: extra line

ClusterState state,
ActionListener<ClusterAwarenessHealthResponse> listener
) {
ClusterState currentState = clusterService.state();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

state is already present in the method param

ActionListener<ClusterAwarenessHealthResponse> listener
) {
ClusterState currentState = clusterService.state();
ClusterSettings currentSettings = clusterService.getClusterSettings();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we inject it?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cluster settings are required from cluster service which is already injected

Comment on lines 53 to 55
if (awarenessAttributeName == null || awarenessAttributeName.isBlank()) {
return addValidationError("awareness attribute value seems to be missing", null);
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we also check if the requested attribute is actually set in the cluster settings for awareness attribute? I believe it might be already handled in the service layer, but we can fail fast here

Copy link
Contributor Author

@nishchay21 nishchay21 Jan 4, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes that is correct that is handled in the service layer fail fast wont be necessary as other output which is cluster level health will be seen.

Copy link
Collaborator

@gbbafna gbbafna Jan 5, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should throw an exception from here. This should be a mandatory param . Ignore, since we are already doing that.

Comment on lines 203 to 223
public XContentBuilder toXContent(XContentBuilder builder, Params params) throws IOException {
builder.startObject();
builder.field(CLUSTER_NAME, getClusterName());
builder.field(STATUS, getStatus().name().toLowerCase(Locale.ROOT));
builder.field(NUMBER_OF_NODES, getNumberOfNodes());
builder.field(NUMBER_OF_DATA_NODES, getNumberOfDataNodes());
builder.field(DISCOVERED_CLUSTER_MANAGER, hasDiscoveredClusterManager());
builder.field(ACTIVE_PRIMARY_SHARDS, getActivePrimaryShards());
builder.field(ACTIVE_SHARDS, getActiveShards());
builder.field(RELOCATING_SHARDS, getRelocatingShards());
builder.field(INITIALIZING_SHARDS, getInitializingShards());
builder.field(UNASSIGNED_SHARDS, getUnassignedShards());
builder.field(DELAYED_UNASSIGNED_SHARDS, getDelayedUnassignedShards());
builder.field(NUMBER_OF_PENDING_TASKS, getNumberOfPendingTasks());
builder.field(NUMBER_OF_IN_FLIGHT_FETCH, getNumberOfInflightFetches());
builder.humanReadableField(TASK_MAX_WAIT_TIME_IN_QUEUE_IN_MILLIS, TASK_MAX_WAIT_TIME_IN_QUEUE, getMaxWaitTime());
builder.percentageField(ACTIVE_SHARDS_PERCENT_AS_NUMBER, ACTIVE_SHARDS_PERCENT, getActiveShardsPercent());
getAwarenessAttributeHealth().toXContent(builder, params);
builder.endObject();
return builder;
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Corresponding fromXContent?

Copy link
Contributor Author

@nishchay21 nishchay21 Jan 4, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Majority is tested on service layer itself.

Copy link
Member

@imRishN imRishN left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the changes. Some minor comments, rest lgtm.

Comment on lines 53 to 55
if (awarenessAttributeName == null || awarenessAttributeName.isBlank()) {
return addValidationError("awareness attribute value seems to be missing", null);
}
Copy link
Collaborator

@gbbafna gbbafna Jan 5, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should throw an exception from here. This should be a mandatory param . Ignore, since we are already doing that.

*
* @opensearch.internal
*/
public class ClusterAwarenessHealthRequestBuilder extends ClusterManagerNodeReadOperationRequestBuilder<
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is not used anywhere ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For now not but have kept for test cases . Do you want this to be removed completely?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

#5694 (comment) Also on this it will throw a validation error

ClusterSettings currentSettings = clusterService.getClusterSettings();
String awarenessAttributeName = request.getAttributeName();
int numberOfPendingTasks = clusterService.getClusterManagerService().numberOfPendingTasks();
TimeValue maxWeightTime = clusterService.getClusterManagerService().getMaxTaskWaitTime();
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

typo : maxWaitTime instead of maxWeightTime .

Signed-off-by: Nishchay Malhotra <nishcha@amazon.com>
@nishchay21 nishchay21 force-pushed the awareness-health-rest branch from ab55875 to 7439a0c Compare January 5, 2023 11:59
Comment on lines 97 to 99
if (in.getVersion().onOrAfter(Version.V_3_0_0)) {
awarenessAttribute = in.readOptionalString();
level = in.readEnum(Level.class);
Copy link
Collaborator

@Bukhtawar Bukhtawar Jan 10, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

3.0 will require modifying this to 2.5 when you backport and then again modifying 2.5 to main. You can start with 2.5 and then directly backport, will be faster

case "shards":
this.level = ClusterHealthRequest.Level.SHARDS;
break;
case "awareness_attribute":
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

awareness_attribute(s)

if (level.equals(Level.AWARENESS_ATTRIBUTE) && indices.length > 0) {
return addValidationError("awareness_attribute is not a supported parameter with index health", null);
} else if (!level.equals(Level.AWARENESS_ATTRIBUTE) && awarenessAttribute != null) {
return addValidationError("level=awareness_attribute is required with awareness_attribute parameter", null);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

awareness_attribute(s)

@@ -290,6 +332,7 @@ public ActionRequestValidationException validate() {
public enum Level {
CLUSTER,
INDICES,
SHARDS
SHARDS,
AWARENESS_ATTRIBUTE
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AWARENESS_ATTRIBUTE(S)

Comment on lines 248 to 255
this.clusterName = clusterName;
this.numberOfPendingTasks = numberOfPendingTasks;
this.numberOfInFlightFetch = numberOfInFlightFetch;
this.delayedUnassignedShards = delayedUnassignedShards;
this.taskMaxWaitingTime = taskMaxWaitingTime;
this.clusterStateHealth = new ClusterStateHealth(clusterState);
this.clusterAwarenessHealth = new ClusterAwarenessHealth(clusterState, clusterSettings, awarenessAttributeName);
this.clusterHealthStatus = clusterStateHealth.getStatus();
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

change to

this(.....);
this.clusterAwarenessHealth = new ClusterAwarenessHealth(clusterState, clusterSettings, awarenessAttributeName);

@github-actions
Copy link
Contributor

Gradle Check (Jenkins) Run Completed with:

Signed-off-by: Nishchay Malhotra <nishcha@amazon.com>
Signed-off-by: Nishchay Malhotra <nishcha@amazon.com>
Signed-off-by: Nishchay Malhotra <nishcha@amazon.com>
@github-actions
Copy link
Contributor

Gradle Check (Jenkins) Run Completed with:

@github-actions
Copy link
Contributor

Gradle Check (Jenkins) Run Completed with:

@github-actions
Copy link
Contributor

Gradle Check (Jenkins) Run Completed with:

@@ -45,7 +45,8 @@
"options":[
"cluster",
"indices",
"shards"
"shards",
"awareness_attribute"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

awareness_attribute(s) here and at all other places

@@ -106,6 +107,10 @@
"red"
],
"description":"Wait until cluster is in a specific state"
},
"awareness_attribute":{
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same

Copy link
Collaborator

@Bukhtawar Bukhtawar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add an integ tests in ClusterHealthIT class and BWC tests

Signed-off-by: Nishchay Malhotra <nishcha@amazon.com>
@github-actions
Copy link
Contributor

Gradle Check (Jenkins) Run Completed with:

@gbbafna gbbafna merged commit 386ce9b into opensearch-project:main Jan 10, 2023
@gbbafna gbbafna added backport 2.x Backport to 2.x branch backport 2.5 labels Jan 10, 2023
@opensearch-trigger-bot
Copy link
Contributor

The backport to 2.5 failed:

The process '/usr/bin/git' failed with exit code 128

To backport manually, run these commands in your terminal:

# Fetch latest updates from GitHub
git fetch
# Create a new working tree
git worktree add ../.worktrees/backport-2.5 2.5
# Navigate to the new working tree
pushd ../.worktrees/backport-2.5
# Create a new branch
git switch --create backport/backport-5694-to-2.5
# Cherry-pick the merged commit of this pull request and resolve the conflicts
git cherry-pick -x --mainline 1 386ce9b2fa64a959860cabaff478024dc8879ee4
# Push it to GitHub
git push --set-upstream origin backport/backport-5694-to-2.5
# Go back to the original working tree
popd
# Delete the working tree
git worktree remove ../.worktrees/backport-2.5

Then, create a pull request where the base branch is 2.5 and the compare/head branch is backport/backport-5694-to-2.5.

@opensearch-trigger-bot
Copy link
Contributor

The backport to 2.x failed:

The process '/usr/bin/git' failed with exit code 128

To backport manually, run these commands in your terminal:

# Fetch latest updates from GitHub
git fetch
# Create a new working tree
git worktree add ../.worktrees/backport-2.x 2.x
# Navigate to the new working tree
pushd ../.worktrees/backport-2.x
# Create a new branch
git switch --create backport/backport-5694-to-2.x
# Cherry-pick the merged commit of this pull request and resolve the conflicts
git cherry-pick -x --mainline 1 386ce9b2fa64a959860cabaff478024dc8879ee4
# Push it to GitHub
git push --set-upstream origin backport/backport-5694-to-2.x
# Go back to the original working tree
popd
# Delete the working tree
git worktree remove ../.worktrees/backport-2.x

Then, create a pull request where the base branch is 2.x and the compare/head branch is backport/backport-5694-to-2.x.

@github-actions
Copy link
Contributor

Gradle Check (Jenkins) Run Completed with:

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backport 2.x Backport to 2.x branch backport 2.5
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants