Closed
Description
Elasticsearch Version
8.3.3
Installed Plugins
No response
Java Version
bundled
OS Version
Deployment in ESS
Problem Description
The .fleet-actions-results data stream cannot be restored via the fleet feature state.
Consider the following scenario (observed in the field in ESS):
- Due to an unforeseen situation, the cluster becomes red with the following red indices:
health status index uuid pri rep docs.count docs.deleted store.size pri.store.size sth
green open .ds-.fleet-actions-results-2022.05.04-000002 eZO3mXu3RYOZpygHvC2dgQ 1 1 0 0 450b 225b false
red open .ds-.fleet-actions-results-2022.06.03-000003 iBbSWmHaQbqJFn_aBVqaYg 1 1 false
red open .ds-.fleet-actions-results-2022.07.03-000004 sF3-S4uoQkybpm7ujaZBVg 1 1 false
red open .ds-.fleet-actions-results-2022.08.02-000006 t-U-Wrd_RpqZUqSS2a3TqA 1 1 false
red open .fleet-actions-7 8zgOKVzdQIeS_YGq_JX--w 1 1 false
red open .fleet-agents-7 p7sWhvhPRaWQ_unOHIJQTQ 1 1 false
red open .fleet-artifacts-7 iingfeghRJ2bfqLAGFt0Aw 1 1 false
red open .fleet-enrollment-api-keys-7 8J1tyEuJSfyhMxf5HsfU2A 1 1 false
red open .fleet-policies-7 HufDBhgBQraUYlNosY1ysg 1 1 false
red open .fleet-policies-leader-7 jpqhCaF9SL-S0AjlWqa6xg 1 1 false
red open .fleet-servers-7 5xdgNy-kSXSdsWZbM8mRHw 1 1 false
- The user attempts to restore the fleet feature state using the following restore snapshot API:
POST _snapshot/found-snapshots/cloud-snapshot-2022.08.08-lywsv4teqe-zj3ygvjkria/_restore?wait_for_completion=false
{
"indices": "-*",
"ignore_unavailable": "true",
"include_global_state": "false",
"include_aliases": "false",
"feature_states": [
"fleet"
]
}
- The above API call fails with the following error:
{
"error": {
"root_cause": [
{
"type": "snapshot_restore_exception",
"reason": "[found-snapshots:cloud-snapshot-2022.08.08-lywsv4teqe-zj3ygvjkria/H3i28HlrSiKyrLaiDCE6uA] cannot restore index [.ds-.fleet-actions-results-2022.06.03-000003] because an open index with same name already exists in the cluster. Either close or delete the existing index or restore the index under a different name by providing a rename pattern and replacement name"
}
],
"type": "snapshot_restore_exception",
"reason": "[found-snapshots:cloud-snapshot-2022.08.08-lywsv4teqe-zj3ygvjkria/H3i28HlrSiKyrLaiDCE6uA] cannot restore index [.ds-.fleet-actions-results-2022.06.03-000003] because an open index with same name already exists in the cluster. Either close or delete the existing index or restore the index under a different name by providing a rename pattern and replacement name"
},
"status": 500
}
- Checking the fleet feature state, it seems that the SystemIndexDescriptor (cf. code) does contain the .fleet-actions-results-* pattern. A couple of guesses about the reported problem:
- The implementation only considers regular indices and not data streams?
- The implementation considers the data stream but fails to close the backing indices before restoring them?
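To confirm that the index named in the error is a data stream backing index rather than a regular index, the error reason can be parsed. The following is a minimal sketch, not part of Elasticsearch: the helper names conflicting_index and data_stream_of are hypothetical, and it assumes the default backing index naming scheme .ds-<data-stream>-<yyyy.MM.dd>-<generation>.

```python
import re
from typing import Optional

# Assumed default backing index naming: .ds-<data-stream>-<yyyy.MM.dd>-<generation>
BACKING_INDEX = re.compile(r"^\.ds-(?P<stream>.+)-\d{4}\.\d{2}\.\d{2}-\d{6}$")
# Captures the index named in the "cannot restore index [...]" part of the reason.
CONFLICT = re.compile(r"cannot restore index \[([^\]]+)\]")


def conflicting_index(reason: str) -> Optional[str]:
    """Return the index the restore complained about, or None if not found."""
    match = CONFLICT.search(reason)
    return match.group(1) if match else None


def data_stream_of(index: str) -> Optional[str]:
    """Return the data stream name if index looks like a backing index."""
    match = BACKING_INDEX.match(index)
    return match.group("stream") if match else None


# Reason string taken from the error response above (shortened at the end).
reason = (
    "[found-snapshots:cloud-snapshot-2022.08.08-lywsv4teqe-zj3ygvjkria/H3i28HlrSiKyrLaiDCE6uA] "
    "cannot restore index [.ds-.fleet-actions-results-2022.06.03-000003] "
    "because an open index with same name already exists in the cluster."
)
index = conflicting_index(reason)
stream = data_stream_of(index)
print(index, stream)
```

For the error above, this identifies the conflicting index as a backing index of the .fleet-actions-results data stream, whereas a regular system index such as .fleet-actions-7 yields no data stream name.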
Steps to Reproduce
- Create a cluster version 8.3.3 and deploy an Elastic Agent with the Osquery Manager integration.
- Run a new live Osquery.
- Observe that the .fleet-actions-results data stream is created with the respective backing indices.
- Restore the fleet feature state using the restore snapshot API and observe the same error as above.
Workaround
- Create the fleet_superuser role:
POST _security/role/fleet_superuser
{
"indices": [
{
"names": [
".fleet*"
],
"privileges": [
"all"
],
"allow_restricted_indices": true
}
]
}
- Create the temp_user user with the superuser and fleet_superuser roles:
POST _security/user/temp_user
{
"password": "temp_password",
"roles": [
"superuser",
"fleet_superuser"
]
}
- Close the .fleet-actions-results backing indices using the below cURL command:
curl -k -XPOST --user temp_user:temp_password -H 'x-elastic-product-origin:fleet' https://$CLUSTER_ADDRESS/.ds-.fleet-actions-results-2022.05.04-000002,.ds-.fleet-actions-results-2022.06.03-000003,.ds-.fleet-actions-results-2022.07.03-000004,.ds-.fleet-actions-results-2022.08.02-000006/_close
Note: for users running the cURL command on Windows, make sure to use double quotes instead for the header: "x-elastic-product-origin:fleet"
- Restore fleet feature state:
POST _snapshot/found-snapshots/cloud-snapshot-2022.08.08-lywsv4teqe-zj3ygvjkria/_restore?wait_for_completion=false
{
"indices": "-*",
"ignore_unavailable": "true",
"include_global_state": "false",
"include_aliases": "false",
"feature_states": [
"fleet"
]
}
- Delete the temp_user user:
DELETE _security/user/temp_user
- Delete the fleet_superuser role:
DELETE _security/role/fleet_superuser
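The close step in the workaround lists every backing index by hand. As a rough sketch, the request path could instead be derived from the cat indices output shown in the problem description. The function name close_path is hypothetical, and the code assumes the column order health, status, index, ... from that listing; it only builds the path and does not contact the cluster.

```python
def close_path(cat_indices_output: str,
               prefix: str = ".ds-.fleet-actions-results-") -> str:
    """Build the /<index,...>/_close path for all indices matching prefix."""
    names = []
    for line in cat_indices_output.splitlines():
        columns = line.split()
        # Assumed column order: health status index uuid ...
        if len(columns) >= 3 and columns[2].startswith(prefix):
            names.append(columns[2])
    if not names:
        raise ValueError("no index matching prefix %r found" % prefix)
    return "/" + ",".join(names) + "/_close"


# A few rows copied from the listing in the problem description.
sample = """\
health status index uuid pri rep docs.count docs.deleted store.size pri.store.size sth
green open .ds-.fleet-actions-results-2022.05.04-000002 eZO3mXu3RYOZpygHvC2dgQ 1 1 0 0 450b 225b false
red open .ds-.fleet-actions-results-2022.06.03-000003 iBbSWmHaQbqJFn_aBVqaYg 1 1 false
red open .fleet-actions-7 8zgOKVzdQIeS_YGq_JX--w 1 1 false
"""
path = close_path(sample)
print(path)
```

The resulting path can be appended to https://$CLUSTER_ADDRESS in the cURL command above, keeping the same credentials and x-elastic-product-origin header.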
Logs (if relevant)
No response