Skip to content

Add granular error list to alias action response #106514

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 33 commits into from
Apr 9, 2024

Conversation

parkertimmins
Copy link
Contributor

@parkertimmins parkertimmins commented Mar 19, 2024

This commit changes the behavior when calling POST _aliases to remove an alias that does not exist. If the remove action has must_exist==false (or is not present), and a list of actions succeeds partially, the response will now contain a list of results for each input action.

For example on the request:

POST _aliases
{
  "actions": [
    {
      "remove": {
        "index": "new_index",
        "alias": "non_existing_alias",
        "must_exist": false
      }
    },
    {
      "add": {
        "index": "new_index",
        "alias": "other_alias"
      }
    }
  ]
}

if the first action fails and the second action succeeds, the response will be:

{
   "acknowledged": true,
   "errors": true,
   "action_results": [
      {
        "action”: "remove",
        "success:" false,
        "error": "alias_not_found_exception"
      },
      {
        "action”: "add",
        "success:" true,
      }
   ]
}

Additionally, this commit changes the error message returned when removing an non-existent alias with must_exist==true. It now throws a alias_not_found_exception so the behavior is the same as when must_exist==false and all actions fail.

This closes #94478 , though is different than the behavior requested in that ticket.

This commit does the following:
* When an alias action list succeeds partially a list of results for each
  action are returned. This will only occur is must_exist==false
* Use same exception for removal of missing aliases whether must_exist
  is true or false.

Closes elastic#94478
@parkertimmins parkertimmins self-assigned this Mar 19, 2024
Copy link
Contributor

Documentation preview:

"aliases_not_found_exception"
);

private final AliasActions.Type actionType;
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It might make sense to return the full action from the request rather than just the type. Thoughts?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, ++, nobody's going to remember what order they submitted what in, so even though you're returning things in the same order, you need to include enough information in the response for them to understand what failed.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does type (add/remove), indices, and aliases seem like enough information? After some digging, I've found returning the original request is problematic because it gets modified. I can add other fields that are important, but it seems better to return fewer fields, as adding fields is non-breaking, but removing/update fields is a breaking change.

@parkertimmins parkertimmins added :Data Management/Indices APIs APIs to create and manage indices and templates >feature labels Mar 20, 2024
@elasticsearchmachine
Copy link
Collaborator

Hi @parkertimmins, I've created a changelog YAML for you.

The current logic in this main loop is really hard to track. It is not
obvious that the action results are guaranteed to not be null. It would
also be easy to introduce a change that does make them null. Hopefully,
this changes makes to more clear that the alias action results lists is
built safely.
@parkertimmins parkertimmins marked this pull request as ready for review March 22, 2024 18:00
@elasticsearchmachine elasticsearchmachine added the Team:Data Management Meta label for data/management team label Mar 22, 2024
@elasticsearchmachine
Copy link
Collaborator

Pinging @elastic/es-data-management (Team:Data Management)

@@ -161,18 +165,21 @@ protected void masterOperation(
finalActions.add(new AddDataStreamAlias(alias, dataStreamName, action.writeIndex(), action.filter()));
}
}
actionResults.add(AliasActionResult.build(action, numAliasesRemoved));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This doesn't hurt anything, but it threw me off for a second that we're passing numAliasesRemoved when building an add action result. AliasActionResult.build correctly ignores it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, that does look weird. I think it's worth using separate build and buildRemoveAlias builder functions, so it's a bit clearer.

Ideally we would return the AliasActions request directly to the users.
But the AliasActions object contains logic which we probably don't want
to expose in an api. Also,  index patterns are
resolved in the security layer (if its on), meaning the indices field of the
request does not contain the pattern provided by the user. To solve this
we store the original indices and do not update this field when the
security update the indices field
@@ -301,6 +317,9 @@ public AliasActions index(String index) {
throw new IllegalArgumentException("[index] can't be empty string");
}
this.indices = new String[] { index };
if (this.originalIndices == null) {
this.originalIndices = this.indices;
}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is an originalAliases field which is created for the same purpose: to keep the aliases from the request before they are resolved by the security layer. Unfortunately, despite being analogous, orginalIndices and orginalAliases are handled slightly differently. This is because the indices is modified in the security layer by calling the indices() function a second time which updates the field; thus we must check if originalIndices is null to avoid setting it a second time. This differs from how aliases are set. They are set the first time (along with originalAliases) in the aliases() function, then only the aliases field (and not originalAliases) is updated in the replaceAliases() function.

Anyway, I'd prefer that this code behaved the same as the similar code for aliases, but I think making such a change would be a too far reaching change.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't like how we have the potential for mismatching here. For example, if someone from an integration test were to do something like:

request.indices("index1", "index2");
…
request.index("index3");

(where they set it, then set it again later). Then originalIndices will be mismatched and will end up being ["index1","index2"].

Unfortunately, if it's the security layer that's making the change, I'm not sure what we can do. How bad would it be to use the security-substituted index names instead of the original names?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yikes, yeah if the request is reused this will not work correctly. I can't think of any way to store the unresolved value which will work with reuse of the request. (Well we could change the security layer's call from indices() to a new function replaceIndices() as is done for aliases, but that seems well out of scope).

Using security-substituted index names isn't horrible. Unfortunately, the index and alias names are not resolved in the request when it enters TransportIndicesAliasesAction if security is disabled. This class calls out to the resolver. To keep security and non-security behavior identical, some tracking of index/alias resolved names would need to be added to the already complicated main loop.

Do you think it would be crazy to just not include any information about the action in the response? And require the user correlate the result list with the request list? Joe correctly pointed out the that the user cannot be expected to still have the request around ... but no information might be better than inconsistent information.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think using the originalIndices behavior as it is in this PR is more confusing and possibly incorrect. For now I think it'd be better to use the security-substituted index name, and perhaps track the originals differently the same work as #107017, what do you think?

@parkertimmins
Copy link
Contributor Author

Discussed with @dakrone and decided this api should behave more like the bulk api.

When error_trace is used with the bulk api, the stack trace is not returned for each error result within a 200 response. error_trace seems to only return a full stack trace when the response is a 400. The alias action api behaves the same, so is fine wrt error_trace.

On the other hand, a couple things should be changed to match the bulk api. The error string field, should be replaced with an actual exception object. This will be the same exception type as is thrown when must_exist==true or all actions fail. Also, the boolean success field should be a http status field.

For example this would the response to an alias action list with a failure and a success:

{
  "acknowledged": true,
  "errors": true,
  "action_results": [
    {
      "action": {
        "type": "remove",
        "indices": [
          "index1"
        ],
        "aliases": [
          "logs-non-existing"
        ]
      },
      "status": 404,
      "error": {
        "type": "aliases_not_found_exception",
        "reason": "aliases [logs-non-existing] missing",
        "resource.type": "aliases",
        "resource.id": "logs-non-existing"
      }
    },
    {
      "action": {
        "type": "add",
        "indices": [
          "index2"
        ],
        "aliases": [
          "logs-non-existing"
        ]
      },
      "status": 200
    }
  ]
}

@parkertimmins parkertimmins requested a review from masseyke April 1, 2024 18:00
@joegallo joegallo requested review from dakrone and removed request for joegallo April 2, 2024 15:02
Copy link
Member

@dakrone dakrone left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I left just a few comments, but this is very close, thanks for iterating on the output format with me!

@@ -301,6 +317,9 @@ public AliasActions index(String index) {
throw new IllegalArgumentException("[index] can't be empty string");
}
this.indices = new String[] { index };
if (this.originalIndices == null) {
this.originalIndices = this.indices;
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't like how we have the potential for mismatching here. For example, if someone from an integration test were to do something like:

request.indices("index1", "index2");
…
request.index("index3");

(where they set it, then set it again later). Then originalIndices will be mismatched and will end up being ["index1","index2"].

Unfortunately, if it's the security layer that's making the change, I'm not sure what we can do. How bad would it be to use the security-substituted index names instead of the original names?

import java.io.IOException;
import java.util.List;

public class IndicesAliasesResponse extends AcknowledgedResponse {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we're missing an IndicesAliasesResponseTests class that extends AbstractWireSerializingTestCase to ensure we test the StreamInput/Output serialization.

If a request is reused, the originalIndices field will
be incorrect. Instead, use indices field. It will contain
the indices resolved by the security layer.
Resolved index names are already provided when security layer
is used. This adds resolved indices to alias action result
in all cases.
@@ -233,14 +245,18 @@ protected void masterOperation(
throw new IllegalArgumentException("Unsupported action [" + action.actionType() + "]");
}
}

Arrays.stream(concreteIndices).map(Index::getName).forEach(resolvedIndices::add);
actionResults.add(AliasActionResult.build(resolvedIndices, action, numAliasesRemoved));
Copy link
Contributor Author

@parkertimmins parkertimmins Apr 3, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@dakrone
I found that rest yaml tests failed due to the differences in resolved index names when using security and unresolved names without security.

So I added this logic with resolvedIndices to use the actual indices when security is not enabled. This commit specifically.

If you think this is too messy, I'd suggest we remove action information entirely from the action result, leaving only status code and optional error info. The error will still contain the missing alias name which will probably be enough info to diagnose the issue.The action information could be added back in the follow-up once this class is a bit clearer.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is reasonable, thanks

IndicesAliasesResponse could be instantiated with an empty error list
and set the errors field to true. Add some validation to make sure is
not instantiated this way.
Rather than checking exception equality exactly, just
make sure exception class and message are equal
There should be one AliasActionResult for every AliasActions, not
for every AliasAction. The former is the requested action on the
 unresolved alias pattern, which may map to multiple of the latter
 each of which is for a single specific alias
Copy link
Member

@dakrone dakrone left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks for iterating on this @parkertimmins!

@parkertimmins parkertimmins merged commit 75228df into elastic:main Apr 9, 2024
@parkertimmins parkertimmins deleted the alias-remove-must-exist branch April 9, 2024 17:11
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
:Data Management/Indices APIs APIs to create and manage indices and templates >feature Team:Data Management Meta label for data/management team v8.14.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

/_aliases remove action ignores must_exist when alias is missing
5 participants