-
Notifications
You must be signed in to change notification settings - Fork 8.5k
Description
This code in the alerting migration looks a bit suspect to me:
kibana/x-pack/plugins/alerting/server/saved_objects/migrations.ts
Lines 84 to 103 in 15ab9d7
| function executeMigrationWithErrorHandling( | |
| migrationFunc: SavedObjectMigrationFn<RawAlert, RawAlert>, | |
| version: string | |
| ) { | |
| return (doc: SavedObjectUnsanitizedDoc<RawAlert>, context: SavedObjectMigrationContext) => { | |
| try { | |
| return migrationFunc(doc, context); | |
| } catch (ex) { | |
| context.log.error<AlertLogMeta>( | |
| `encryptedSavedObject ${version} migration failed for alert ${doc.id} with error: ${ex.message}`, | |
| { | |
| migrations: { | |
| alertDocument: doc, | |
| }, | |
| } | |
| ); | |
| } | |
| return doc; | |
| }; | |
| } |
It appears that if an ESO decrypt error occurs during the execution of the migration of rules, the code falls into the catch logic, and so the error is logged, but the document in returned unchanged - not actually migrated. We've seen that logged message now in a user's deployment.
I believe the proper course of action in this case is to "fix" the decryption error by removing the API key, essentially disabling the rule, which should allow it to be migrated. This has the unfortunate effect of disabling a rule that was by definition not disabled before - but then presumably it wasn't running either, since it has a decrypt error. The other option is to not migrate it during normal migration, but instead migrate it later when it's actually needed - but that seems a lot harder.
I guess the other case where this can happen is if the encryption key is changed at the same time the migration takes place, in which case every rule and connector would be broken because of decryption errors. That almost seems worthy of failing the migration completely, but it's also not clear how we recognize that state compared to just a handful of "broken" ESOs.
Note that we can't simply replace the API key in the rule, since that requires a user context to create the API key, and during migration, we won't have one. We'd have to remove it, by "disabling" the rule.
The actions migration is very similar:
kibana/x-pack/plugins/actions/server/saved_objects/migrations.ts
Lines 56 to 75 in b740640
| function executeMigrationWithErrorHandling( | |
| migrationFunc: SavedObjectMigrationFn<RawAction, RawAction>, | |
| version: string | |
| ) { | |
| return (doc: SavedObjectUnsanitizedDoc<RawAction>, context: SavedObjectMigrationContext) => { | |
| try { | |
| return migrationFunc(doc, context); | |
| } catch (ex) { | |
| context.log.error<ActionsLogMeta>( | |
| `encryptedSavedObject ${version} migration failed for action ${doc.id} with error: ${ex.message}`, | |
| { | |
| migrations: { | |
| actionDocument: doc, | |
| }, | |
| } | |
| ); | |
| } | |
| return doc; | |
| }; | |
| } |
Remediation is worse though, since we'll lose the secrets. But the good news is we do have a story for "actions that are missing secrets via export/import" - which is presumably how we'd set these up after the migration completes.
One thing I'm not sure about, is how to get this to work with the current ESO migration. It's a framework call, we just pass in functions to morph the raw saved object shapes, we don't have any way of handling decrypt errors differently than anything else (and what do we do when "anything else" errors occur!?). Feels like we'll need to beef up the ESO createMigration() function with some kind of error handling function, that would let us do some shape changes on the raw objects (to disable the rule, or mark the connector as "needs secrets"), and continue with the migration instead of failing.