-
Notifications
You must be signed in to change notification settings - Fork 1.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[BUG] AwarenessAllocationIT fail due to inconsistent shard count in response #7401
Comments
I'll pick this up and resolve it by EOW next week or unassign myself. |
Investigated this issue. OpenSearch in a way follows greedy approach while allocating shards and doesn't compute the optimal allocation for all the shards that needs to be allocated. This approach based on certain filters and rules tries to control nodes where shards are assigned. The unassigned shards causing test failure is due to the same above reason where a node where the shard was supposed to be assigned created a conflict with the awareness allocation decider. Hence, it is stuck in a state, waiting for space to allocate the unassigned shard because it cannot assign it to the only node with space. This also seemed more likely to happen in this particular test case because it is creating a 15 nodes cluster and over 120 shards which increases the probability of landing up in such a case. A smaller cluster with lesser shards would be less likely to land up in such a case. This also seem to be a known issue after scrolling open issues in ElasticSearch/OpenSearch |
Closing the issue, test has been muted and will be fixed and enabled back after the fix in allocator |
This issue isn't fixed - the test is disabled. If we were to delete the test I'd be happy to close this issue; however, I suspect that we want to fix the underlying issue. If there is another issue tracking the scroll issue please link it here. |
@peternied, the merged PR which is muting the test links it to actual underlying issue. This is the merged PR muting the test - #11767. And the PR links the test to actual issue - #5908. Feel free to close the issue if this suffices, else we can keep this open |
Resolving as test has been muted and points the #5908 as the cause that needs to be fixed |
Describe the bug
org.opensearch.cluster.allocation.AwarenessAllocationIT.testThreeZoneOneReplicaWithForceZoneValueAndLoadAwareness
The text was updated successfully, but these errors were encountered: