
[BUG][Segment Replication] ReplicationFailedException and ALLOCATION_FAILED #9966

Closed
@kksaha

Description

Describe the bug

Shard failure, reason [replication failure], failure [ReplicationFailedException]
Failed to create shard, failure IOException[failed to obtain in-memory shard lock]; nested: ShardLockObtainFailedException

Logs:

[2023-09-08T09:28:10,201][WARN ][o.o.c.r.a.AllocationService] [master-2] failing shard [failed shard, shard [ppe-000298][0], node[ylgTCi8VSk-iytumnaGxlg], [R], s[STARTED], a[id=1OR-X-XDTrCMLvXWr8k0sw], message [shard failure, reason [replication failure]], failure [ReplicationFailedException[[ppe-000298][0]: Replication failed on (failed to clean after replication)]; nested: CorruptIndexException[Problem reading index. (resource=/usr/share/opensearch/data/nodes/0/indices/rIJ86tpXTIG4h-Cn_MoPRg/0/index/_7tvv.cfe)]; nested: NoSuchFileException[/usr/share/opensearch/data/nodes/0/indices/rIJ86tpXTIG4h-Cn_MoPRg/0/index/_7tvv.cfe]; ], markAsStale [true]]

[2023-09-08T09:28:20,522][WARN ][o.o.c.r.a.AllocationService] [master-2] failing shard [failed shard, shard [ppe-000298][0], node[EDhutdeXT5W5luFLpIF7sw], [R], recovery_source[peer recovery], s[INITIALIZING], a[id=5cUw3rGZSbuWLOSrQygkvA], unassigned_info[[reason=ALLOCATION_FAILED], at[2023-09-08T09:28:10.201Z], failed_attempts[1], delayed=false, details[failed shard on node [ylgTCi8VSk-iytumnaGxlg]: shard failure, reason [replication failure], failure ReplicationFailedException[[ppe-000298][0]: Replication failed on (failed to clean after replication)]; nested: CorruptIndexException[Problem reading index. (resource=/usr/share/opensearch/data/nodes/0/indices/rIJ86tpXTIG4h-Cn_MoPRg/0/index/_7tvv.cfe)]; nested: NoSuchFileException[/usr/share/opensearch/data/nodes/0/indices/rIJ86tpXTIG4h-Cn_MoPRg/0/index/_7tvv.cfe]; ], allocation_status[no_attempt]], expected_shard_size[13863289464], message [failed to create shard], failure [IOException[failed to obtain in-memory shard lock]; nested: ShardLockObtainFailedException[[ppe-000298][0]: obtaining shard lock for [starting shard] timed out after [5000ms], lock already held for [closing shard] with age [19241322ms]]; ], markAsStale [true]]

[2023-09-08T09:28:30,607][WARN ][o.o.c.r.a.AllocationService] [master-2] failing shard [failed shard, shard [ppe-000298][0], node[ylgTCi8VSk-iytumnaGxlg], [R], recovery_source[peer recovery], s[INITIALIZING], a[id=A7J2Jm_7SAOY2RLIXH-qZA], unassigned_info[[reason=ALLOCATION_FAILED], at[2023-09-08T09:28:20.522Z], failed_attempts[2], failed_nodes[[EDhutdeXT5W5luFLpIF7sw]], delayed=false, details[failed shard on node [EDhutdeXT5W5luFLpIF7sw]: failed to create shard, failure IOException[failed to obtain in-memory shard lock]; nested: ShardLockObtainFailedException[[ppe-000298][0]: obtaining shard lock for [starting shard] timed out after [5000ms], lock already held for [closing shard] with age [19241322ms]]; ], allocation_status[no_attempt]], message [failed to create shard], failure [IOException[failed to obtain in-memory shard lock]; nested: ShardLockObtainFailedException[[ppe-000298][0]: obtaining shard lock for [starting shard] timed out after [5000ms], lock already held for [closing shard] with age [20397ms]]; ], markAsStale [true]]

It seems the segment replication event failed with a CorruptIndexException because a segment file is missing
(NoSuchFileException: "/usr/share/opensearch/data/nodes/0/indices/rIJ86tpXTIG4h-Cn_MoPRg/0/index/_7tvv.cfe" does not exist).
Every later attempt to allocate the replica of shard 0 then failed with ShardLockObtainFailedException.
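
To make the lock ages in these messages easier to read, here is a minimal Python sketch that pulls the "lock already held for [...] with age [...ms]" part out of a log line and converts it to a duration. The function name `lock_holders` and the regular expression are illustrative only and assume the message format shown in the logs above. For the first failure on node EDhutdeXT5W5luFLpIF7sw the age works out to a little over five hours, so the lock on that node was apparently taken by a "closing shard" operation long before the replication failure at 09:28:10.

```python
import re
from datetime import timedelta

# Assumed message shape, taken from the log lines in this report:
#   lock already held for [<reason>] with age [<N>ms]
LOCK_RE = re.compile(
    r"lock already held for \[(?P<reason>[^\]]+)\] with age \[(?P<age_ms>\d+)ms\]"
)

def lock_holders(log_text):
    """Return (reason, age) pairs for every shard-lock timeout found in log_text."""
    return [
        (m.group("reason"), timedelta(milliseconds=int(m.group("age_ms"))))
        for m in LOCK_RE.finditer(log_text)
    ]

# Excerpt copied from the 09:28:20,522 log entry.
sample = (
    "ShardLockObtainFailedException[[ppe-000298][0]: obtaining shard lock for "
    "[starting shard] timed out after [5000ms], lock already held for "
    "[closing shard] with age [19241322ms]]"
)

for reason, age in lock_holders(sample):
    print(f"{reason}: lock held for {age}")
```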

{
  "index": "ppe-000298",
  "shard": 0,
  "primary": false,
  "current_state": "unassigned",
  "unassigned_info": {
    "reason": "ALLOCATION_FAILED",
    "at": "2023-09-08T14:12:35.637Z",
    "failed_allocation_attempts": 5,
    "details": "failed shard on node [ylgTCi8VSk-iytumnaGxlg]: failed to create shard, failure IOException[failed to obtain in-memory shard lock]; nested: ShardLockObtainFailedException[[ppe-000298][0]: obtaining shard lock for [starting shard] timed out after [5000ms], lock already held for [closing shard] with age [17065425ms]]; ",
    "last_allocation_status": "no_attempt"
  },
  "can_allocate": "no",
  "allocate_explanation": "cannot allocate because allocation is not permitted to any of the nodes",
  "node_allocation_decisions": [
    {
      "node_id": "17zuTMYtQ9KvKKNa7gm0Ig",
      "node_name": "data-az1-1",
      "transport_address": "*.*.*.*:9300",
      "node_attributes": {
        "zone": "az1",
        "shard_indexing_pressure_enabled": "true"
      },
      "node_decision": "no",
      "deciders": [
        {
          "decider": "max_retry",
          "decision": "NO",
          "explanation": "shard has exceeded the maximum number of retries [5] on failed allocation attempts - manually call [/_cluster/reroute?retry_failed=true] to retry, [unassigned_info[[reason=ALLOCATION_FAILED], at[2023-09-08T14:12:35.637Z], failed_attempts[5], failed_nodes[[EDhutdeXT5W5luFLpIF7sw, ylgTCi8VSk-iytumnaGxlg]], delayed=false, details[failed shard on node [ylgTCi8VSk-iytumnaGxlg]: failed to create shard, failure IOException[failed to obtain in-memory shard lock]; nested: ShardLockObtainFailedException[[ppe-000298][0]: obtaining shard lock for [starting shard] timed out after [5000ms], lock already held for [closing shard] with age [17065425ms]]; ], allocation_status[no_attempt]]]"
        }
      ]
    }
  ]
}
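
When the explain output covers many nodes, it helps to list only the deciders that answered NO. The following Python sketch does that for a response shaped like the one above. The function name `blocking_deciders` is hypothetical, and the sample dictionary is a trimmed copy of the output in this report, with the explanation string shortened.

```python
def blocking_deciders(explain):
    """List (node_name, decider, explanation) for every decider that answered NO."""
    rows = []
    for node in explain.get("node_allocation_decisions", []):
        for decider in node.get("deciders", []):
            if decider.get("decision") == "NO":
                rows.append(
                    (
                        node.get("node_name"),
                        decider.get("decider"),
                        decider.get("explanation"),
                    )
                )
    return rows

# Trimmed copy of the allocation explain output from this report.
explain = {
    "index": "ppe-000298",
    "shard": 0,
    "primary": False,
    "can_allocate": "no",
    "node_allocation_decisions": [
        {
            "node_name": "data-az1-1",
            "node_decision": "no",
            "deciders": [
                {
                    "decider": "max_retry",
                    "decision": "NO",
                    # Shortened here; the full text is in the report above.
                    "explanation": "shard has exceeded the maximum number of "
                    "retries [5] on failed allocation attempts",
                }
            ],
        }
    ],
}

for node_name, decider, explanation in blocking_deciders(explain):
    print(f"{node_name}: {decider}: {explanation}")
```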

Eventually, the replica shards fall far behind their primaries and stay that way for a long time.

Screenshots
index      shard prirep state      docs     store  ip     node
ppe-000298 1     p      STARTED    33768999 31.4gb 10...* data-az2-3
ppe-000298 1     r      STARTED    1412441  1.3gb  10...* data-az1-4
ppe-000298 2     p      STARTED    33763101 35.3gb 10...* data-az1-6
ppe-000298 2     r      STARTED    5928658  5.1gb  10...* data-az2-2
ppe-000298 0     p      STARTED    33758088 30.1gb 10...* data-az2-1
ppe-000298 0     r      UNASSIGNED
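
The size of the lag can be read from this listing by comparing doc counts. A rough Python sketch follows, assuming the column order shown above (index, shard, prirep, state, docs, store, ip, node). The function name `replica_doc_lag` is illustrative. Rows without a doc count, such as the unassigned replica of shard 0, are skipped.

```python
def replica_doc_lag(cat_shards_text):
    """Return (index, shard, replica_node, docs_behind) for each STARTED replica."""
    primaries = {}
    replicas = []
    for line in cat_shards_text.strip().splitlines():
        parts = line.split()
        # Skip the header row and copies that are not STARTED (they have no doc count).
        if len(parts) < 8 or parts[3] != "STARTED":
            continue
        index, shard, prirep, docs, node = (
            parts[0], int(parts[1]), parts[2], int(parts[4]), parts[7]
        )
        if prirep == "p":
            primaries[(index, shard)] = docs
        else:
            replicas.append((index, shard, node, docs))
    return [
        (index, shard, node, primaries[(index, shard)] - docs)
        for index, shard, node, docs in replicas
        if (index, shard) in primaries
    ]

# Listing copied from this report (the ip column is redacted in the original).
listing = """
index shard prirep state docs store ip node
ppe-000298 1 p STARTED 33768999 31.4gb 10...* data-az2-3
ppe-000298 1 r STARTED 1412441 1.3gb 10...* data-az1-4
ppe-000298 2 p STARTED 33763101 35.3gb 10...* data-az1-6
ppe-000298 2 r STARTED 5928658 5.1gb 10...* data-az2-2
ppe-000298 0 p STARTED 33758088 30.1gb 10...* data-az2-1
ppe-000298 0 r UNASSIGNED
"""

for index, shard, node, behind in replica_doc_lag(listing):
    print(f"{index}[{shard}] replica on {node} is {behind} docs behind its primary")
```

For the listing above, the replicas of shards 1 and 2 are about 32.4 million and 27.8 million documents behind their primaries.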

Host/Environment (please complete the following information):

  • OS: Linux
  • Version: 2.8.0

We have tried to manually reroute the shard allocation but that didn't help.
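
For reference, the retry named in the `max_retry` explanation above (`/_cluster/reroute?retry_failed=true`) can be issued as a POST request. The Python sketch below only builds the request. The base URL `http://localhost:9200` and the function name `retry_failed_request` are assumptions, and any authentication or TLS settings the cluster needs are left out. The report does not say which reroute call was actually used.

```python
import urllib.request

def retry_failed_request(base_url):
    """Build (but do not send) the POST request that retries failed allocations."""
    url = base_url.rstrip("/") + "/_cluster/reroute?retry_failed=true"
    return urllib.request.Request(url, method="POST")

# Assumed address of a cluster node; replace with a real endpoint.
req = retry_failed_request("http://localhost:9200")
print(req.get_method(), req.full_url)

# To send it against a reachable cluster:
#   with urllib.request.urlopen(req) as resp:
#       print(resp.status)
```

A retry of this kind resets the failed-attempt counter check, but it would be expected to fail again for as long as the in-memory shard lock is still held on the target node.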

Metadata

    Labels

    Indexing, Indexing:Replication, bug, v2.11.0
