Skip to content

Auto-allocate searchable snapshots #52527

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Conversation

DaveCTurner
Copy link
Contributor

@DaveCTurner DaveCTurner commented Feb 19, 2020

This commit allows plugins to supply their own allocator for existing shards,
generalizing the default GatewayAllocator. It uses this to implement a very
simple auto-allocation process for searchable snapshots which respects the
allocation deciders but which can otherwise be assigned to any node.

Relates #50999

This commit allows plugins to supply their own allocator for existing shards,
generalizing the default `GatewayAllocator`. It uses this to implement a very
simple auto-allocation process for searchable snapshots which respects the
allocation deciders but which can otherwise be assigned to any node.
@DaveCTurner DaveCTurner added >enhancement WIP :Distributed Coordination/Snapshot/Restore Anything directly related to the `_snapshot/*` APIs :Distributed Coordination/Allocation All issues relating to the decision making around placing a shard (both master logic & on the nodes) labels Feb 19, 2020
@DaveCTurner DaveCTurner requested a review from ywelsch February 19, 2020 17:20
@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-distributed (:Distributed/Snapshot/Restore)

@DaveCTurner DaveCTurner requested a review from tlrx February 19, 2020 17:20
Copy link
Contributor Author

@DaveCTurner DaveCTurner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm opening this for initial comments before I get too deep into adding/modifying tests to match the changes here; it should pass the existing test suite at least.

@DaveCTurner
Copy link
Contributor Author

@elasticmachine please run elasticsearch-ci/2

@DaveCTurner
Copy link
Contributor Author

I forgot to mention that of course we expect to have a more sophisticated mechanism in future with async fetching of information from the individual nodes before making a decision. This PR is only focussed on the plumbing needed to expose this kind of allocation to plugins.

@DaveCTurner
Copy link
Contributor Author

@elasticmachine please run elasticsearch-ci/2

Copy link
Contributor

@henningandersen henningandersen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks Dave, I read through it mainly out of interest. I added a few comments anyway, feel free to dismiss me quickly if it makes no sense 😃

Copy link
Contributor

@ywelsch ywelsch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've left some thoughts similar to what Henning said

@DaveCTurner
Copy link
Contributor Author

I've taken this in the direction suggested by both @ywelsch and @henningandersen, added some tests, and removed the WIP label. This is ready for a proper review now.

@DaveCTurner DaveCTurner mentioned this pull request Feb 21, 2020
19 tasks
Copy link
Contributor

@ywelsch ywelsch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've left some smaller comments, overall looking good. Should we also have some basic tests for SearchableSnapshotAllocator or do you want to leave that to a follow-up?

private ExistingShardsAllocator getAllocatorForShard(ShardRouting shardRouting, RoutingAllocation routingAllocation) {
assert assertInitialized();
final String allocatorName = ExistingShardsAllocator.EXISTING_SHARDS_ALLOCATOR_SETTING.get(
routingAllocation.metaData().getIndexSafe(shardRouting.index()).getSettings());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

as these lookups can be expensive in case there are many unassigned shards, I wonder if we should pass the retrieved IndexMetaData to the ExistingShardsAllocator.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It involves passing the IndexMetaData around to a lot of places. I mean we could do that if you're sure this is sufficiently expensive, but it seems surprising to me that this would be a concern.

Copy link
Contributor

@henningandersen henningandersen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I left a number of smaller comments and a question on how the balancer and searchable snapshot allocation will interact.

@DaveCTurner
Copy link
Contributor Author

Hey folks, any more comments on this PR?

Copy link
Contributor

@ywelsch ywelsch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Copy link
Contributor

@henningandersen henningandersen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

: clusterPlugin.getExistingShardsAllocators().entrySet()) {
final String allocatorName = existingShardsAllocatorEntry.getKey();
if (existingShardsAllocators.put(allocatorName, existingShardsAllocatorEntry.getValue()) != null) {
throw new IllegalArgumentException("ExistingShardsAllocator [" + allocatorName + "] already defined");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we include clusterPlugin.getClass().getName() into the message?

* Called before starting a round of allocation, allowing the allocator to invalidate some caches if appropriate.
*/
void beforeAllocation(RoutingAllocation allocation);

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: remove extra empty line.

while (unassignedIterator.hasNext()) {
final ShardRouting shard = unassignedIterator.next();
final AllocateUnassignedDecision allocateUnassignedDecision = makeAllocationDecision(shard, allocation, logger);
public void allocateUnassigned(RoutingAllocation allocation, ShardRouting shardRouting,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: swap argument order to make it the same as in ExistingShardsAllocator:

Suggested change
public void allocateUnassigned(RoutingAllocation allocation, ShardRouting shardRouting,
public void allocateUnassigned(ShardRouting shardRouting, RoutingAllocation allocation,

@DaveCTurner DaveCTurner merged commit 215e94d into elastic:feature/searchable-snapshots Mar 11, 2020
@DaveCTurner DaveCTurner deleted the 2020-02-19-pluggable-allocator-for-existing-shards branch March 11, 2020 15:04
DaveCTurner added a commit to DaveCTurner/elasticsearch that referenced this pull request Mar 16, 2020
In elastic#50239 we introduced a temporary workaround in Store#tryOpenIndex to allow
searchable snapshots shards to be allocated anywhere. elastic#52527 reverts this
workaround but did not revert the associated test changes. This commit reverts
the associated test changes.
DaveCTurner added a commit that referenced this pull request Mar 16, 2020
In #50239 we introduced a temporary workaround in Store#tryOpenIndex to allow
searchable snapshots shards to be allocated anywhere. #52527 reverts this
workaround but did not revert the associated test changes. This commit reverts
the associated test changes.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
:Distributed Coordination/Allocation All issues relating to the decision making around placing a shard (both master logic & on the nodes) :Distributed Coordination/Snapshot/Restore Anything directly related to the `_snapshot/*` APIs >enhancement
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants