Description
Affected version: Elasticsearch 7.5.0
If a cluster is upgraded to 7.5.0 using a rolling upgrade while a transform is in the STARTED state, audit logging may appear to stop after the upgrade. Audit logs, called "job messages" in the Transform UI, are not shown because the audit index was created with a broken mapping. The broken mapping is created on the backend, so the UI itself is not at fault. See Details below.
Note: This bug does not affect transform functionality.
Quick Fix
Solution 1
Upgrade to 7.5.1 when available. 7.5.1 permanently fixes the problem and creates .transform-notifications-000002 with proper mappings to repair broken 7.5.0 installations.
Solution 2
Delete the audit index so that it gets recreated automatically:
curl -XDELETE "http://localhost:9200/.transform-notifications-000001"
or, in the Kibana Dev Tools console:
DELETE .transform-notifications-000001
Note: Audit logs written between the upgrade and the deletion of the audit index are lost. You might want to inspect the index and/or reindex it before deleting it.
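To keep the audit messages that would otherwise be lost, the index can be copied before it is deleted. The following is a minimal sketch, not an official procedure: it uses the standard _reindex API and assumes an unsecured cluster on localhost:9200, as in the curl example. The backup index name and the helper function names are hypothetical.

```python
import json
import urllib.request

ES_URL = "http://localhost:9200"  # same host as the curl example
AUDIT_INDEX = ".transform-notifications-000001"
BACKUP_INDEX = "transform-notifications-backup"  # hypothetical name, choose your own


def build_reindex_request(source, dest, base_url=ES_URL):
    """Build (without sending) a POST _reindex request that copies source into dest."""
    body = json.dumps({"source": {"index": source}, "dest": {"index": dest}})
    return urllib.request.Request(
        f"{base_url}/_reindex",
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def build_delete_request(index, base_url=ES_URL):
    """Build (without sending) a DELETE request for the given index."""
    return urllib.request.Request(f"{base_url}/{index}", method="DELETE")


def backup_and_delete(send=urllib.request.urlopen):
    """Copy the audit index to the backup index, then delete the audit index.

    urlopen raises HTTPError on a non-2xx response, so the delete is skipped
    when the copy request fails.
    """
    with send(build_reindex_request(AUDIT_INDEX, BACKUP_INDEX)) as resp:
        result = json.load(resp)
    if result.get("failures"):
        raise RuntimeError(f"reindex reported failures: {result['failures']}")
    with send(build_delete_request(AUDIT_INDEX)) as resp:
        return json.load(resp)
```

Calling backup_and_delete() performs both steps against the cluster. The backup index is created with default mappings, so it is only meant for inspection and is not read by the Transform UI.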
Details
The transform audit index uses an index template, so the audit index gets created at the first write. As part of the rename to transform, the audit index was renamed from .data-frame-notifications-1 to .transform-notifications-000001. The index template is installed by a TemplateUpgradeService used in the plugin. However, the upgrade service only upgrades templates when running on the master node. In a rolling upgrade the master node might be upgraded in the middle or, as recommended, last. If a transform running on a node that is already upgraded to 7.5 writes an audit message while the cluster is still mixed, the template might not be available yet, in which case the audit index gets created with default mappings. In addition, the read alias used by the UI does not get installed. Because the mapping is incompatible, adding the alias alone is not sufficient.
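One way to check whether an installation is affected is to inspect the response of GET .transform-notifications-000001/_mapping. The sketch below is only a heuristic and rests on an assumption not stated in this issue: that the audit template maps the transform_id field as keyword, whereas default dynamic mapping turns string fields into text with a keyword sub-field. The function name is hypothetical.

```python
AUDIT_INDEX = ".transform-notifications-000001"


def audit_index_looks_broken(mapping_response, index=AUDIT_INDEX):
    """Return True when the index mapping looks dynamically created.

    mapping_response is the parsed JSON body of GET <index>/_mapping.
    ASSUMPTION: the template maps "transform_id" as keyword; an index
    created with defaults maps it as text instead.
    """
    properties = mapping_response[index]["mappings"].get("properties", {})
    return properties.get("transform_id", {}).get("type") != "keyword"


# Shape of a mapping produced by default dynamic mapping for a string field.
dynamic_mapping = {
    AUDIT_INDEX: {
        "mappings": {
            "properties": {
                "transform_id": {
                    "type": "text",
                    "fields": {"keyword": {"type": "keyword", "ignore_above": 256}},
                }
            }
        }
    }
}

# Shape expected when the template was applied (assumed, see above).
template_mapping = {
    AUDIT_INDEX: {"mappings": {"properties": {"transform_id": {"type": "keyword"}}}}
}

print(audit_index_looks_broken(dynamic_mapping))
print(audit_index_looks_broken(template_mapping))
```

A result of True for a real cluster response suggests the index was created without the template and that one of the fixes above applies.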
Code fix
For 7.5.1 we will ensure that the index template gets installed before a transform task can write to the audit index. When a job gets re-assigned to an upgraded node > 7.5.1, we already perform such a check for the transform internal index. The same approach needs to be applied to the audit index, too.
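The difference between the current behavior and the planned fix can be illustrated with a toy model. This is not Elasticsearch code: ToyCluster, the template name, and both write functions are hypothetical stand-ins that only model one fact from this issue, namely that an index keeps whatever mappings were in effect when it was first created.

```python
AUDIT_INDEX = ".transform-notifications-000001"
AUDIT_TEMPLATE = "audit-template"  # hypothetical template name


class ToyCluster:
    """In-memory stand-in for a cluster; NOT Elasticsearch code."""

    def __init__(self):
        self.templates = set()
        self.indices = {}  # index name -> "template" or "defaults"

    def install_template(self, name):
        self.templates.add(name)

    def index_doc(self, index, template):
        # Auto-creation on first write: the mappings are decided once, here.
        if index not in self.indices:
            self.indices[index] = "template" if template in self.templates else "defaults"
        return self.indices[index]


def write_audit_unguarded(cluster):
    """Models 7.5.0: write without checking that the template exists."""
    return cluster.index_doc(AUDIT_INDEX, AUDIT_TEMPLATE)


def write_audit_guarded(cluster):
    """Models the planned fix: install the template before the first write."""
    if AUDIT_TEMPLATE not in cluster.templates:
        cluster.install_template(AUDIT_TEMPLATE)
    return cluster.index_doc(AUDIT_INDEX, AUDIT_TEMPLATE)


mixed = ToyCluster()               # master not upgraded yet, no template
print(write_audit_unguarded(mixed))  # index is created with defaults
mixed.install_template(AUDIT_TEMPLATE)
print(write_audit_unguarded(mixed))  # still defaults: a late template does not repair it

fixed = ToyCluster()
print(write_audit_guarded(fixed))    # index is created from the template
```

The second call shows why installing the template after the fact does not help: the existing index has to be replaced, which is what both quick fixes do.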