-
Notifications
You must be signed in to change notification settings - Fork 25.3k
Introduce forget follower API #39718
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
This commit introduces the forget follower API. This API is needed in cases that unfollowing a following index fails to remove the shard history retention leases on the leader index. This can happen explicitly through user action, or implicitly through an index managed by ILM. When this occurs, history will be retained longer than necessary. While the retention lease will eventually expire, it can be expensive to allow history to persist for that long, and also prevent ILM from performing actions like shrink on the leader index. As such, we introduce an API to allow for manual removal of the shard history retention leases in this case.
Pinging @elastic/es-distributed |
* master: Remove beta label from CCR (elastic#39722) Rename retention lease setting (elastic#39719)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This change looks good. I left some comments and questions.
|
||
public class ForgetFollowerAction extends Action<BroadcastResponse> { | ||
|
||
public static final String ACTION_NAME = "indices:admin/xpack/ccr/forget_follower"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should we not include /xpack/
in new actions?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would rather to be consistent with all of CCR than introduce inconsistency there. We can address wholesale later, if needed.
.../plugin/core/src/main/java/org/elasticsearch/xpack/core/ccr/action/ForgetFollowerAction.java
Outdated
Show resolved
Hide resolved
public class BroadcastResponseTests extends ESTestCase { | ||
|
||
public void testFromXContent() throws IOException { | ||
final String index = randomAlphaOfLength(8); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe use AbstractXContentTestCase#xContentTester()
static method to test xcontent serialization?
I think it is a bit shorter and we have more test iterations by default.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that AbstractXContentTestCase
here is a lie. This is because what we really want to test is that if the REST layer sends back an XContent
serialized org.elasticsearch.action.support.broadcast.BroadcastResponse
then org.elasticsearch.client.core.BroadcastResponse
is able to de-serialize this XContent
. AbstractXContentTestCase
does not allow to test such a situation, instead only allowing you to go from concrete instance (of org.elasticsearch.client.core.BroadcastResponse
) to XContent
and then back to a concrete instance of (org.elasticsearch.client.core.BroadcastResponse
), and then checks if the result is equal. That is not what actually happens in production though, what happens in production is what I outlined above: a concrete instance of org.elasticsearch.action.support.broadcast.BroadcastResponse
is serialized on the REST layer which then has to be de-serialized on the HL client layer by org.elasticsearch.client.core.BroadcastResponse
. Hence, this test reflects that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point. I think we should open an issue, so that goes over the existing hlrc response serialization tests and change those tests to test way is tested here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I opened #39745.
@@ -395,6 +401,98 @@ public void onFailure(Exception e) { | |||
assertTrue(latch.await(30L, TimeUnit.SECONDS)); | |||
} | |||
|
|||
public void testForgetFollower() throws InterruptedException, IOException { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the hlrc docs itself are missing? Test looks good though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I pushed 63721a4.
|
||
==== Authorization | ||
|
||
If the {es} {security-features} are enabled, you must have `manage_follow_index` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The IndexPrivilege#MANAGE_FOLLOW_INDEX_AUTOMATON
constant isn't updated to include the forget follower api. Also this text should say: index privileges for the leader index.
instead of index privileges for the follower index.
.
I also I wonder whether manage_follow_index is the right privilege here. Because this is about a leases for the follower index in the leader index. I don't know a good alternative here. I just wanted to mention this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I pushed e68b0c0.
@@ -104,7 +104,7 @@ public class RestTestsFromSnippetsTask extends SnippetsTask { | |||
* format of the response is incompatible i.e. it is not a JSON object. | |||
*/ | |||
static shouldAddShardFailureCheck(String path) { | |||
return path.startsWith('_cat') == false && path.startsWith('_ml/datafeeds/') == false | |||
return path.startsWith('_cat') == false && path.startsWith('_ml/datafeeds/') == false |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why are the changes to the groovy files needed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For the docs tests where we fire forget follower requests, it is entirely too hard to set up a request that is not going to fail. For example, to set up a request that is not going to fail, we have to know the index UUID of the follower. Obtaining that in a docs test is not possible to my knowledge, we do not have a way to set a value to use in a later docs test? Maybe I am missing something?
However, the docs infrastructure automatically adds is_false: _shards.failures
to every docs test. Therefore, I had to add a hook that enables us to say skip that check for these tests where we expect failures.
Again, if there's a way to avoid the failures that I am not seeing, please let me know!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Got it, thanks for the explanation!
@SuppressWarnings("unchecked") final Map<String, Object> shardStatsAsMap = (Map<String, Object>)shardsStats.get(0); | ||
@SuppressWarnings("unchecked") final Map<String, Object> retentionLeasesStats = | ||
(Map<String, Object>) shardStatsAsMap.get("retention_leases"); | ||
@SuppressWarnings("unchecked") final ArrayList<Object> leases = (ArrayList<Object>)retentionLeasesStats.get("leases"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
since no unfollow has happened, the shard follow task may add a new lease, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not since it is paused.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
right, I missed that pause follow index api call.
I made the above change, but the test is still failing, but now it fails because a lease is still returned, which isn't expected, as the leases should have been removed.
I also noticed that the specified --- a/x-pack/plugin/ccr/qa/security/src/test/java/org/elasticsearch/xpack/ccr/FollowIndexSecurityIT.java
+++ b/x-pack/plugin/ccr/qa/security/src/test/java/org/elasticsearch/xpack/ccr/FollowIndexSecurityIT.java
@@ -208,9 +208,10 @@ public class FollowIndexSecurityIT extends ESCCRRestTestCase {
"\"follower_cluster\":\"follow-cluster\"," +
"\"follower_index\":\"" + forgetFollower + "\"," +
"\"follower_index_uuid\":\"" + followerIndexUUID + "\"," +
- "\"leader_remote_cluster\":\"leader_cluster\"" +
+ "\"leader_remote_cluster\":\"leader-cluster\"" +
"}";
request.setJsonEntity(requestBody); But even with this change there is still a lease returned. I don't yet understand why that happens. |
@martijnvg |
@martijnvg The fix is 2e96e66. |
@elasticmachine run elasticsearch-ci/bwc |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM - I've merged master into your branch, because that fixes the bwc test that was failing.
@martijnvg I pushed a commit that would have made the failure more apparent: 238c29b This gives:
|
Thanks for the really helpful review @martijnvg. |
This commit introduces the forget follower API. This API is needed in cases that unfollowing a following index fails to remove the shard history retention leases on the leader index. This can happen explicitly through user action, or implicitly through an index managed by ILM. When this occurs, history will be retained longer than necessary. While the retention lease will eventually expire, it can be expensive to allow history to persist for that long, and also prevent ILM from performing actions like shrink on the leader index. As such, we introduce an API to allow for manual removal of the shard history retention leases in this case.
This commit introduces the forget follower API. This API is needed in cases that unfollowing a following index fails to remove the shard history retention leases on the leader index. This can happen explicitly through user action, or implicitly through an index managed by ILM. When this occurs, history will be retained longer than necessary. While the retention lease will eventually expire, it can be expensive to allow history to persist for that long, and also prevent ILM from performing actions like shrink on the leader index. As such, we introduce an API to allow for manual removal of the shard history retention leases in this case.
This commit introduces the forget follower API. This API is needed in cases that unfollowing a following index fails to remove the shard history retention leases on the leader index. This can happen explicitly through user action, or implicitly through an index managed by ILM. When this occurs, history will be retained longer than necessary. While the retention lease will eventually expire, it can be expensive to allow history to persist for that long, and also prevent ILM from performing actions like shrink on the leader index. As such, we introduce an API to allow for manual removal of the shard history retention leases in this case.
* 6.7: Fix CCR HLRC docs Introduce forget follower API (elastic#39718) 6.6.2 release notes. Update release notes for 6.7.0 Add documentation for min_hash filter (elastic#39671) Unmute testIndividualActionsTimeout Unmute testFollowIndexAndCloseNode Use unwrapped cause to determine if node is closing (elastic#39723) Don’t ack if unable to remove failing replica (elastic#39584) Wipe Snapshots Before Indices in RestTests (elastic#39662) (elastic#39765) Bug fix for AnnotatedTextHighlighter (elastic#39525) Fix Snapshot BwC with Version 5.6.x (elastic#39737) Fix occasional SearchServiceTests failure (elastic#39697) Correct date in daterange-aggregation.asciidoc (elastic#39727) Add a note to purge the ingest-geoip plugin (elastic#39553)
This commit introduces the forget follower API. This API is needed in cases that unfollowing a following index fails to remove the shard history retention leases on the leader index. This can happen explicitly through user action, or implicitly through an index managed by ILM. When this occurs, history will be retained longer than necessary. While the retention lease will eventually expire, it can be expensive to allow history to persist for that long, and also prevent ILM from performing actions like shrink on the leader index. As such, we introduce an API to allow for manual removal of the shard history retention leases in this case.
Relates #37165