Skip to content

Add Query Parameters to Cluster Allocation Explain API #129342

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 21 commits into
base: main
Choose a base branch
from

Conversation

joshua-adams-1
Copy link
Contributor

@joshua-adams-1 joshua-adams-1 commented Jun 12, 2025

The cluster allocation explain API now accepts parameters via the request body, via query parameters passed in the URL, but not via both.

#127028

The cluster allocation explain API now accepts parameters via the
request body, via path parameters passed in the URL, but not via both.

Issue: elastic#127028
@joshua-adams-1 joshua-adams-1 self-assigned this Jun 12, 2025
@joshua-adams-1 joshua-adams-1 added >non-issue :Distributed Coordination/Allocation All issues relating to the decision making around placing a shard (both master logic & on the nodes) Team:Distributed Coordination Meta label for Distributed Coordination team v9.1.0 labels Jun 12, 2025
Copy link
Contributor

@DaveCTurner DaveCTurner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry got called away mid-review. I left a couple of comments, will come back to this.

@@ -44,6 +44,7 @@
import static org.mockito.Mockito.when;

public class RestRequestTests extends ESTestCase {
private static final String PARAMETER_KEY = "PARAMETER_KEY";
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This indicates to me that this literal parameter name is important in these tests, but I think that's not the case. Instead, we should use randomIdentifier() (within each test, it won't work in static context) to indicate to readers that the parameter name doesn't matter as long as it's equal in all the places it needs to be.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated in 77080cb

Comment on lines 61 to 63
if (hasAnyParameterBeenPassed(request)) {
throw new IllegalArgumentException("Parameters cannot be passed in both the URL and the request body");
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this needed? I would expect that if the request contains a parameter which isn't consumed in this method then that'll automatically throw an IAE - see org.elasticsearch.rest.BaseRestHandler#handleRequest, particularly unconsumedParams.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are correct, but this terminates the flow early before the body is parsed. In the case where parameters are passed alongside a request body and the request body contains an incorrect type, then the user would get the parsing exception thrown first. If they fixed that, then they would get the error about having parameters in the URL. I felt it more logical to be told: 1) you can't have parameters in both the body and the URL so pick one, and then 2) iterate through the errors found in the request body / parameters, whichever route selected. If this goes against conventions found in other APIs then happy to maintain consistency and remove this

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think there's a convention for this, at least not a very widely-implemented one. I doubt a user would ever really make these multiple mistakes in practice, and really doubt that they would care about having to solve the problems one-by-one; in contrast we're definitely going to be re-reading this code in future, so this is somewhere that we should prioritise simplicity of the implementation.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed in 96e219b

Comment on lines 27 to 31
"description": "Specifies the index within which we would like to describe the shard"
},
"shard": {
"type": "number",
"description": "Specifies the ID of the shard that you would like an explanation for"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Grammar nit: "we" or "you"? Tho other similar docs write this without needing these pronouns, maybe we can rephrase this.

Also (and this is really nitpicking to the max) avoid the dangling preposition: "the ID of the shard that you would like an explanation for" -> "the ID of the shard for which you would like an explanation"

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I coped the description for shard verbatim from the existing API docs, here. I made up the description for index so will update it

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated in 84817db

@@ -5,7 +5,105 @@
cluster.allocation_explain: {}

---
"cluster shard allocation explanation test":
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd suggest keeping this test - it's not doing any harm IMO and without it the diff is awful.

Copy link
Contributor Author

@joshua-adams-1 joshua-adams-1 Jun 13, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I renamed this test to cluster shard allocation explanation test with 3 valid body parameters but I added tests above and below it, copying parts of this test as a template, and git has made a horrible diff of it.

To summarise my changes in this file:

  1. I kept all existing tests, but I may have renamed them to specify exactly what they were testing.
  2. I added extensive testing to the old functionality including:
    2.1 Insufficient parameters passed in the body request since index, shard and primary are required despite not being explicitly stated in the API docs.
    2.2 Parameters of invalid types passed and throwing errors
  3. I added tests for path parameter support, including:
    3.1 Insufficient parameters passed
    3.2 Parameters of invalid types passed

For 2.1 and 2.2, if this is tested elsewhere then please point me to it! I felt this an appropriate place to add these tests, and increased test coverage is always good, even if I've added hundreds of line of very similar, repetitive tests. If this can be parameterized somehow, then again I'm happy to do so.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I see the new tests, they seem very thorough, I just recommend keeping this test as-is (including its name). Clean diffs are valuable when digging back through history looking for bugs. If you feel strongly about the naming then that can happen in a separate PR (maybe to go in first).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added some pre-requisite tests, #130381, and now the diff is much cleaner

@@ -478,6 +478,18 @@ public int paramAsInt(String key, int defaultValue) {
}
}

public Integer paramAsInt(String key, Integer defaultValue) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Naming nit: paramAsInteger?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated in 84817db

);
}

// This checks for optionally provided query parameters
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is correct, but it is the common pattern across all REST handlers (without needing comments elsewhere) so this sort of comment may actually cause more confusion: why is it called out here when this is just the common pattern? Is something different going on?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated in 84817db

@pxsalehi

This comment was marked as off-topic.

@pxsalehi
Copy link
Member

@elasticmachine update branch

@pxsalehi pxsalehi added v9.0.3 auto-backport Automatically create backport pull requests when merged and removed auto-backport Automatically create backport pull requests when merged v9.0.3 labels Jun 13, 2025
@pxsalehi
Copy link
Member

I think your yaml tests are failing on older ES versions where the parameters are not supported. You'd need to mark your yaml tests so they're skipped on older versions. See this doc.

@pxsalehi
Copy link
Member

last time I did this, it was all version-based. It seems now we don't do that anymore and skipping yaml tests need to use the other options in the doc I linked above. From the looks of it, the capabilities/parameters one is related to what you do. I found one example where this is used that you can use as an example.

@joshua-adams-1 joshua-adams-1 force-pushed the feature/allocation-explain-query-string-parameter-127028 branch from 47e5479 to a8042af Compare June 13, 2025 13:37
- is_true: rebalance_explanation

---
# Is this incorrect behaviour? Should an InvalidArgumentException be returned here?
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll call attention to this since this is current, production behaviour but I wasn't sure if it was correct

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ShardNotFoundException extends ElasticsearchException which according to the ExceptionHelper here, does not return a RestStatus.BAD_REQUEST, which may mean it's not converted to IAE which I would expect. Happy to be corrected

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ShardNotFoundException is a ResourceNotFoundException and ResourceNotFoundException#status returns RestStatus#NOT_FOUND which maps onto a 404 Not Found response code. That sounds correct to me.

- match: { acknowledged: true }

- do:
# Numerical values passed here are implicitly supported by RestRequest parsing all parameters as strings
Copy link
Contributor Author

@joshua-adams-1 joshua-adams-1 Jun 13, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Calling this out since the API claims to only support strings for this field, but the way we parse the parameters means we can support integer values as strings, such as "1". Worth adding to the API documentation?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a convention across the whole API surface, documented here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To clarify, in this case it was the other way around.
The field is declared as string, but I am able to pass an integer. This is because RestRequest parses all parameters as strings. It seems counter-intuitive to be able to pass integers or floats or doubles when the API is expecting a string. In any of these cases, we don't get an IAE, we get an index_not_found_exception since the API has parsed the numerical value as a string and then searched for an index named accordingly. Either way, this is a separate issue to this PR

- is_true: rebalance_explanation

---
# When a float is passed in the body, a ShardNotFoundException is thrown. This behaviour is corrected when passed via a path parameter
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Related to a comment above about potential incorrect behaviour when a float is passed. I have changed the behaviour when one is passed as a path parameter since I assumed this to be logical to throw an IAE

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I'd rather this were consistent. You're right, IllegalArgumentException seems preferable if shard isn't an integer. If that's a change in behaviour for the body-params case, let's fix that up first (in a separate PR) and then there's no concerns here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PR to fix this is #130389

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When I wrote this comment I was under the impression that a float would always result in a ShardNotFoundException. It seems that was wrong - instead AIUI a float argument in the body gets truncated to an integer, and if this integer is a valid shard number then you get a result for that shard rather than a ShardNotFoundException. That means we can't introduce an IllegalArgumentException here as I'd previously suggested.

In contrast, the parsing of query parameters is stricter and will reject a float-formatted argument with an IllegalArgumentException anyway.

Consistency is important, but there is already an inconsistency across the whole ES API that seems irreconcilable (without a breaking change, or significantly more effort than this corner case deserves). I think we have to live with it for now.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I opened a PR #130738 to mark the AbstractObjectParser with a V10 annotation to throw an IllegalArgumentException when, in this case, non-integer values are passed. This change will update the entirety of the ES API. In the meantime, for API requests with a request body, we will continue to support the existing behaviour.

I am also going to keep the stricter query parameter validation since this is consistent with other APIs.

In summary:

  • Floats / doubles passed in request body -> truncated to integers
  • Floats / doubles passed in query parameters -> IllegalArgumentException

Copy link
Contributor

@DaveCTurner DaveCTurner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You've used the phrase "path parameters" in several places (including the PR title) but in fact these are query parameters. See e.g. RFC 3986 §3:

  URI         = scheme ":" hier-part [ "?" query ] [ "#" fragment ]

Could you fix that? Other replies inline.

@@ -5,7 +5,105 @@
cluster.allocation_explain: {}

---
"cluster shard allocation explanation test":
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I see the new tests, they seem very thorough, I just recommend keeping this test as-is (including its name). Clean diffs are valuable when digging back through history looking for bugs. If you feel strongly about the naming then that can happen in a separate PR (maybe to go in first).

- is_true: rebalance_explanation

---
# Is this incorrect behaviour? Should an InvalidArgumentException be returned here?
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ShardNotFoundException is a ResourceNotFoundException and ResourceNotFoundException#status returns RestStatus#NOT_FOUND which maps onto a 404 Not Found response code. That sounds correct to me.

- match: { acknowledged: true }

- do:
# Numerical values passed here are implicitly supported by RestRequest parsing all parameters as strings
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a convention across the whole API surface, documented here.

- is_true: rebalance_explanation

---
# When a float is passed in the body, a ShardNotFoundException is thrown. This behaviour is corrected when passed via a path parameter
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I'd rather this were consistent. You're right, IllegalArgumentException seems preferable if shard isn't an integer. If that's a change in behaviour for the body-params case, let's fix that up first (in a separate PR) and then there's no concerns here.

Comment on lines 61 to 63
if (hasAnyParameterBeenPassed(request)) {
throw new IllegalArgumentException("Parameters cannot be passed in both the URL and the request body");
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think there's a convention for this, at least not a very widely-implemented one. I doubt a user would ever really make these multiple mistakes in practice, and really doubt that they would care about having to solve the problems one-by-one; in contrast we're definitely going to be re-reading this code in future, so this is somewhere that we should prioritise simplicity of the implementation.

@joshua-adams-1 joshua-adams-1 changed the title Add Path Parameters to Cluster Allocation Explain API Add Query Parameters to Cluster Allocation Explain API Jul 8, 2025
@joshua-adams-1 joshua-adams-1 force-pushed the feature/allocation-explain-query-string-parameter-127028 branch from a8572cb to bc11bdd Compare July 8, 2025 14:54
@@ -22,6 +22,22 @@
]
},
"params":{
"index": {
"type": "string",
"description": "Specifies the name of the index that you would like an explanation for"
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@elasticsearchmachine
Copy link
Collaborator

Pinging @elastic/es-distributed-coordination (Team:Distributed Coordination)

Copy link
Contributor

@DaveCTurner DaveCTurner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, just a couple of nits in the tests.

primary: "truee"

---
"cluster shard allocation explanation test with numerical current node parameter passed in URL":
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test is kinda misleading - there's nothing invalid about current_node: 1 as the title suggests, the problem is the primary: 0, and we're already testing that elsewhere.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in e87058d

body: { "shard": 0, "primary": true }

---
"cluster shard allocation explanation test with numerical index parameter passed in URL":
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This one's a bit odd too - it's no different from the index: "test2" in the next test case, there's nothing special about the index name 0.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in e87058d

assertEquals(123, value);
}

public void testParamAsIntWithoutIntegerParameter() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

naming nit: "without" suggests the parameter is absent, but in fact it's present, it's just that its value is not numeric (likewise below)

Suggested change
public void testParamAsIntWithoutIntegerParameter() {
public void testParamAsIntWithNonIntegerParameter() {

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in e87058d

RestRequest restRequest = contentRestRequest("", singletonMap(parameterKey, "123T"));
int defaultValue = randomInt();

assertThrows(IllegalArgumentException.class, () -> { restRequest.paramAsInt(parameterKey, defaultValue); });
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: no need for the {}, we can use an expression lambda here (also in other cases below)

Suggested change
assertThrows(IllegalArgumentException.class, () -> { restRequest.paramAsInt(parameterKey, defaultValue); });
assertThrows(IllegalArgumentException.class, () -> restRequest.paramAsInt(parameterKey, defaultValue));

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in e87058d

@joshua-adams-1 joshua-adams-1 force-pushed the feature/allocation-explain-query-string-parameter-127028 branch from 0ce9e43 to cd00b33 Compare July 11, 2025 10:55
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
:Distributed Coordination/Allocation All issues relating to the decision making around placing a shard (both master logic & on the nodes) >non-issue Team:Distributed Coordination Meta label for Distributed Coordination team v9.2.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants