Add allocate_all_primaries to cluster reroute #4285
Conversation
I've confirmed this works using the local gateway:
curl -XDELETE "http://localhost:9200/test?pretty" -s
curl -XPOST "http://localhost:9200/test?pretty" -s -d '{
  "settings": {
    "index": {
      "number_of_shards": 5,
      "number_of_replicas": 0
    }
  }
}'
for i in {1..100}; do
  curl -XPOST "http://localhost:9200/test/test?pretty" -d '{"foo": "1"}' -s
done
for i in {1..100}; do
  curl -XPOST "http://localhost:9200/test/test?pretty" -d '{"foo": "1"}' -s
done
curl -XPOST 'localhost:9200/_cluster/reroute' -d '{
  "commands" : [
    {
      "allocate_all_primaries" : {}
    }
  ]
}'
for i in {1..100}; do
  curl -XPOST "http://localhost:9200/test/test?pretty" -d '{"foo": "1"}' -s
done
The data is lost but at least you don't have timeouts. Github's markup is making a mess of this....
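A quick way to confirm the reroute actually assigned the primaries (a minimal check, assuming a single local node on port 9200) is cluster health: `unassigned_shards` should drop to 0 and `status` should no longer be red.
curl "http://localhost:9200/_cluster/health?pretty" -s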
It would be interesting to check somehow if the primary allocation is just being throttled from being allocated to a node, and in that case, not force the allocation.... This will require sharing knowledge somehow with ...
@kimchy, I understand what you are saying but I'm not sure how I'd go about it. It does make me think of something else: will this force allocation and ignore throttling? Is that OK if we're allocating thousands of shards? I can have a look at implementing what you mention sometime in the next few days.
@nik9000 this forced allocation will not end up ignoring throttling; it will just come back to being allocated and respect throttling.
That, at least, is great news. I can imagine folks in a disaster repeatedly trying this over and over again which won't help. I'll make sure that it refuses to do anything if all the unallocated primaries are throttled. I'll see about spitting out a different error message in that case so people know that all shards are in the process of being allocated.
So I had a look at this and I'm not really sure how to do this because the decision about which node to assign the shard comes after allocation commands are run. I wonder if it'd be simpler to store the list of throttled shards in the cluster state and dig it back out again during the allocation command....
To be honest, I don't have a good idea about how to do it yet either :), I will try and spend some time thinking about it and provide feedback soonish (sorry!).
I thought I could get this from the AllocationExplanation on ClusterState but that always seems to be empty. I actually can't find any code that sets it.
From the docs: `allocate_all_primaries`:: Allocate all unallocated primaries to any node that can take them. Accepts no parameters. Each allocation is similar to running `allocate` with `allow_primary` so this can cause data loss. This is useful in the same cases as `allocate` with `allow_primary` but doesn't require looking up the `index` or `shard` or guessing an appropriate `node`. Closes elastic#4206
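For comparison, the per-shard form that already exists (and that this command wraps) looks something like this; the index name, shard number, and node name below are just placeholders:
curl -XPOST 'localhost:9200/_cluster/reroute' -d '{
  "commands" : [
    {
      "allocate" : {
        "index" : "test",
        "shard" : 0,
        "node" : "node1",
        "allow_primary" : true
      }
    }
  ]
}'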
Pushed revised version - doesn't do what @kimchy wanted yet but is a bit nicer anyway.
I haven't looked at this in a long while. I imagine this would still be useful but don't have much time to think about it recently. Any interest in me resurrecting this?
I have exactly this problem: I have just one shard, and sometimes when I restart and look at the health of my cluster, I get this for one of my indexes: I know that if I delete the index, the problem will go away, but that's not the optimal solution. Is there a solution for this problem? Are these changes here a solution for my problem? Thanks
Any plans of merging this? We run into issues with "unassigned shards" occasionally and it would be great to have a feature like this.
@nik9000 Is this still on your radar? I think this new allocation command is useful. Just thinking out loud here about how to detect if a node is throttling the primary shard allocation:
boolean found = false;
for (MutableShardRouting routing : allocation.routingNodes().unassigned()) {
    DiscoveryNode nodeHoldingHighestShardVersion = newHelper.findNodeWithHighestShardVersion();
    Decision decision = Decision.YES;
    if (nodeHoldingHighestShardVersion != null) {
        RoutingNode routingNode = allocation.routingNodes().node(nodeHoldingHighestShardVersion.id());
        decision = allocation.deciders().canAllocate(routing, routingNode, allocation);
    }
    if (decision.type() != Decision.Type.THROTTLE && routing.primary()) {
        found = true;
        // Just clear the post allocation flag on the shard so it'll assign itself.
        allocation.routingNodes().addClearPostAllocationFlag(routing.shardId());
    }
}
if (!found) {
    throw new ElasticsearchIllegalArgumentException("[allocate_all_primaries] no unassigned primaries");
}
This way throttled primary allocation will not be affected by the new command.
This has sunk pretty low on my radar. So low I haven't actually been checking the status and the ping must have slipped by me. I can pick it up at some point but if you want it quickly maybe you can grab it? If my code is a good starting point you can have it. Or start over - I won't be offended - the pull request is really stale.
@nik9000 I labeled it accordingly such that it won't get forgotten and will be picked up at some point. Thanks for pinging again.
The pain in allocating many primary shards is finding a place to put them, so a suggestion:
+1 This plan looks good.
Kind regards, Martijn van Groningen
@nik9000 please allow me to buy you a 🍺 or ☕ next time you're in Portland, OR. Great little improvement to ES right here. 👍
+1 this would still be great ;)
+1 really needed.
@soundofjw @damm what version of Elasticsearch are you using? I asked our support team just a few days ago if they still think that this functionality would be useful. Their response was that, with recent versions, the need for this has pretty much disappeared.
@clintongormley I'm using 1.7.0; I still have issues where I break out the bash scripts in this pull request. Single node recently; but a month ago on a cluster actually. Not common but it happens enough that I don't forget it.
@clintongormley Pretty much same - 1.7.0 as well. There are a few times that we need to do this, but it usually happens when setting up a cluster for the first time, or making big changes.
+1 to making big changes; I had to break this out when I had a cluster that was not allocating based on available space and it was making one node run out of space. Had to re-route a bunch of data quickly while waiting for Elasticsearch to balance itself out once there was enough free space.
@soundofjw why would you need this when setting up a cluster for the first time, or making big changes? The only time you should need this is when you lose ALL copies of many shards (primaries and replicas) - and you want to force allocation of new empty shard copies.
@clintongormley It's like a bricked phone with no factory reset button (no new index creation, no inserts, no fix button...). For example, when installing the nodes with a deployment system (e.g. Chef, Puppet, Ansible...), you might deploy to all nodes at once since you don't care yet about downtime, and somehow the cluster reaches such a state. I had that multiple times doing a new cluster setup (redeploying nodes again and again), plus once after a few hours of work, not sure why. It should just be a loop of existing commands...
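For reference, the kind of loop of existing commands being described could look roughly like this. This is only a sketch: it assumes a release where `_cat/shards` and the `allocate` reroute command with `allow_primary` are both available, and `node1` is a placeholder for whichever node should receive the empty primaries.
# List unassigned primaries as "<index> <shard>", then force-allocate each one.
# WARNING: allow_primary creates empty primaries, so any data in the lost copies stays lost.
curl -s 'localhost:9200/_cat/shards' | awk '$3 == "p" && $4 == "UNASSIGNED" {print $1, $2}' |
while read index shard; do
  curl -s -XPOST 'localhost:9200/_cluster/reroute' -d '{
    "commands" : [
      {
        "allocate" : {
          "index" : "'"$index"'",
          "shard" : '"$shard"',
          "node" : "node1",
          "allow_primary" : true
        }
      }
    ]
  }'
done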
@clintongormley +1 to what @ofir-petrushka and @damm are saying. One issue I've seen more than once is when the cluster resets state due to all masters resetting - and then data nodes come and recover shards which are no longer recognized. You'll see a lot of "# of documents mismatch" in this case.
This issue should be fixed in 2.0 with #9952
@clintongormley Awesome! That's great news 👍
@clintongormley just hit this with 2.1 :/
@damm do you want to be more specific?
@clintongormley had to reroute all my primary shards after a failed 2.1 upgrade from 2.0. Had to modify the scripts to make it happy.
@damm I'm much more interested in why the 2.1 upgrade failed for you. Was it something wrong with 2.1 or something that you did? If the former, please open a separate issue explaining the problem.
I'm going to close this PR as it is way out of date, and I think that the use for it is now infrequent.