-
Notifications
You must be signed in to change notification settings - Fork 1.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Awareness health api rest and transport layer changes #5694
Awareness health api rest and transport layer changes #5694
Conversation
Gradle Check (Jenkins) Run Completed with:
|
Gradle Check (Jenkins) Run Completed with:
|
Codecov Report
@@ Coverage Diff @@
## main #5694 +/- ##
============================================
+ Coverage 70.98% 71.04% +0.05%
- Complexity 58669 58769 +100
============================================
Files 4763 4771 +8
Lines 279945 280114 +169
Branches 40418 40425 +7
============================================
+ Hits 198731 199014 +283
+ Misses 64988 64842 -146
- Partials 16226 16258 +32
Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here. |
{ | ||
"cluster.awareness_attribute_health": { | ||
"documentation": { | ||
"url": "https://opensearch.org/docs/latest/opensearch/rest-api/awareness_attribute_health/", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We need to open a documentation issue for this
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Already done opensearch-project/documentation-website#2281
"url": "https://opensearch.org/docs/latest/opensearch/rest-api/awareness_attribute_health/", | ||
"description": "Returns basic information about the health of the cluster aggregated per awareness attribute." | ||
}, | ||
"stability": "experimental", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is it experimental? I don't see API is behind feature flag. Is there any more work pending on this action?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is discussed with gaurav as well . We will have this experimental and then GA later
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Lets have a issue open to mark it stable later or the meta issues we are tracking?
@Override | ||
public RestStatus status() { | ||
return RestStatus.OK; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the default behaviour anyways, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That is correct
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Then lets remove this override?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is interface that is getting implemented we need to override the method
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ClusterHealthResponse also has functionality for timeouts -
@Override
public RestStatus status() {
return isTimedOut() ? RestStatus.REQUEST_TIMEOUT : RestStatus.OK;
}
Do we need something similar?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can add that later if required. For now not a necessary parameter for API usage
ClusterAwarenessHealthRequest::new, | ||
indexNameExpressionResolver | ||
); | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: extra line
ClusterState state, | ||
ActionListener<ClusterAwarenessHealthResponse> listener | ||
) { | ||
ClusterState currentState = clusterService.state(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
state is already present in the method param
ActionListener<ClusterAwarenessHealthResponse> listener | ||
) { | ||
ClusterState currentState = clusterService.state(); | ||
ClusterSettings currentSettings = clusterService.getClusterSettings(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we inject it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
cluster settings are required from cluster service which is already injected
if (awarenessAttributeName == null || awarenessAttributeName.isBlank()) { | ||
return addValidationError("awareness attribute value seems to be missing", null); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we also check if the requested attribute is actually set in the cluster settings for awareness attribute? I believe it might be already handled in the service layer, but we can fail fast here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes that is correct that is handled in the service layer fail fast wont be necessary as other output which is cluster level health will be seen.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should throw an exception from here. This should be a mandatory param . Ignore, since we are already doing that.
public XContentBuilder toXContent(XContentBuilder builder, Params params) throws IOException { | ||
builder.startObject(); | ||
builder.field(CLUSTER_NAME, getClusterName()); | ||
builder.field(STATUS, getStatus().name().toLowerCase(Locale.ROOT)); | ||
builder.field(NUMBER_OF_NODES, getNumberOfNodes()); | ||
builder.field(NUMBER_OF_DATA_NODES, getNumberOfDataNodes()); | ||
builder.field(DISCOVERED_CLUSTER_MANAGER, hasDiscoveredClusterManager()); | ||
builder.field(ACTIVE_PRIMARY_SHARDS, getActivePrimaryShards()); | ||
builder.field(ACTIVE_SHARDS, getActiveShards()); | ||
builder.field(RELOCATING_SHARDS, getRelocatingShards()); | ||
builder.field(INITIALIZING_SHARDS, getInitializingShards()); | ||
builder.field(UNASSIGNED_SHARDS, getUnassignedShards()); | ||
builder.field(DELAYED_UNASSIGNED_SHARDS, getDelayedUnassignedShards()); | ||
builder.field(NUMBER_OF_PENDING_TASKS, getNumberOfPendingTasks()); | ||
builder.field(NUMBER_OF_IN_FLIGHT_FETCH, getNumberOfInflightFetches()); | ||
builder.humanReadableField(TASK_MAX_WAIT_TIME_IN_QUEUE_IN_MILLIS, TASK_MAX_WAIT_TIME_IN_QUEUE, getMaxWaitTime()); | ||
builder.percentageField(ACTIVE_SHARDS_PERCENT_AS_NUMBER, ACTIVE_SHARDS_PERCENT, getActiveShardsPercent()); | ||
getAwarenessAttributeHealth().toXContent(builder, params); | ||
builder.endObject(); | ||
return builder; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Corresponding fromXContent
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Majority is tested on service layer itself.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the changes. Some minor comments, rest lgtm.
if (awarenessAttributeName == null || awarenessAttributeName.isBlank()) { | ||
return addValidationError("awareness attribute value seems to be missing", null); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should throw an exception from here. This should be a mandatory param . Ignore, since we are already doing that.
* | ||
* @opensearch.internal | ||
*/ | ||
public class ClusterAwarenessHealthRequestBuilder extends ClusterManagerNodeReadOperationRequestBuilder< |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is not used anywhere ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For now not but have kept for test cases . Do you want this to be removed completely?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
#5694 (comment) Also on this it will throw a validation error
ClusterSettings currentSettings = clusterService.getClusterSettings(); | ||
String awarenessAttributeName = request.getAttributeName(); | ||
int numberOfPendingTasks = clusterService.getClusterManagerService().numberOfPendingTasks(); | ||
TimeValue maxWeightTime = clusterService.getClusterManagerService().getMaxTaskWaitTime(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
typo : maxWaitTime
instead of maxWeightTime
.
Signed-off-by: Nishchay Malhotra <nishcha@amazon.com>
ab55875
to
7439a0c
Compare
if (in.getVersion().onOrAfter(Version.V_3_0_0)) { | ||
awarenessAttribute = in.readOptionalString(); | ||
level = in.readEnum(Level.class); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
3.0 will require modifying this to 2.5 when you backport and then again modifying 2.5 to main. You can start with 2.5 and then directly backport, will be faster
case "shards": | ||
this.level = ClusterHealthRequest.Level.SHARDS; | ||
break; | ||
case "awareness_attribute": |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
awareness_attribute(s)
if (level.equals(Level.AWARENESS_ATTRIBUTE) && indices.length > 0) { | ||
return addValidationError("awareness_attribute is not a supported parameter with index health", null); | ||
} else if (!level.equals(Level.AWARENESS_ATTRIBUTE) && awarenessAttribute != null) { | ||
return addValidationError("level=awareness_attribute is required with awareness_attribute parameter", null); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
awareness_attribute(s)
@@ -290,6 +332,7 @@ public ActionRequestValidationException validate() { | |||
public enum Level { | |||
CLUSTER, | |||
INDICES, | |||
SHARDS | |||
SHARDS, | |||
AWARENESS_ATTRIBUTE |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
AWARENESS_ATTRIBUTE(S)
this.clusterName = clusterName; | ||
this.numberOfPendingTasks = numberOfPendingTasks; | ||
this.numberOfInFlightFetch = numberOfInFlightFetch; | ||
this.delayedUnassignedShards = delayedUnassignedShards; | ||
this.taskMaxWaitingTime = taskMaxWaitingTime; | ||
this.clusterStateHealth = new ClusterStateHealth(clusterState); | ||
this.clusterAwarenessHealth = new ClusterAwarenessHealth(clusterState, clusterSettings, awarenessAttributeName); | ||
this.clusterHealthStatus = clusterStateHealth.getStatus(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
change to
this(.....);
this.clusterAwarenessHealth = new ClusterAwarenessHealth(clusterState, clusterSettings, awarenessAttributeName);
Gradle Check (Jenkins) Run Completed with:
|
Signed-off-by: Nishchay Malhotra <nishcha@amazon.com>
Signed-off-by: Nishchay Malhotra <nishcha@amazon.com>
Signed-off-by: Nishchay Malhotra <nishcha@amazon.com>
Gradle Check (Jenkins) Run Completed with:
|
Gradle Check (Jenkins) Run Completed with:
|
Gradle Check (Jenkins) Run Completed with:
|
@@ -45,7 +45,8 @@ | |||
"options":[ | |||
"cluster", | |||
"indices", | |||
"shards" | |||
"shards", | |||
"awareness_attribute" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
awareness_attribute(s)
here and at all other places
@@ -106,6 +107,10 @@ | |||
"red" | |||
], | |||
"description":"Wait until cluster is in a specific state" | |||
}, | |||
"awareness_attribute":{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please add an integ tests in ClusterHealthIT
class and BWC tests
Signed-off-by: Nishchay Malhotra <nishcha@amazon.com>
Gradle Check (Jenkins) Run Completed with:
|
The backport to
To backport manually, run these commands in your terminal: # Fetch latest updates from GitHub
git fetch
# Create a new working tree
git worktree add ../.worktrees/backport-2.5 2.5
# Navigate to the new working tree
pushd ../.worktrees/backport-2.5
# Create a new branch
git switch --create backport/backport-5694-to-2.5
# Cherry-pick the merged commit of this pull request and resolve the conflicts
git cherry-pick -x --mainline 1 386ce9b2fa64a959860cabaff478024dc8879ee4
# Push it to GitHub
git push --set-upstream origin backport/backport-5694-to-2.5
# Go back to the original working tree
popd
# Delete the working tree
git worktree remove ../.worktrees/backport-2.5 Then, create a pull request where the |
The backport to
To backport manually, run these commands in your terminal: # Fetch latest updates from GitHub
git fetch
# Create a new working tree
git worktree add ../.worktrees/backport-2.x 2.x
# Navigate to the new working tree
pushd ../.worktrees/backport-2.x
# Create a new branch
git switch --create backport/backport-5694-to-2.x
# Cherry-pick the merged commit of this pull request and resolve the conflicts
git cherry-pick -x --mainline 1 386ce9b2fa64a959860cabaff478024dc8879ee4
# Push it to GitHub
git push --set-upstream origin backport/backport-5694-to-2.x
# Go back to the original working tree
popd
# Delete the working tree
git worktree remove ../.worktrees/backport-2.x Then, create a pull request where the |
Gradle Check (Jenkins) Run Completed with:
|
Description
Adding rest and transport level changes for below approved PR #5231
Issues Resolved
#5230.
Check List
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.