[Alerting] Handle when an Alerting Task fails due to its Alert object being deleted mid flight #63093
Pinging @elastic/kibana-alerting-services (Team:Alerting Services)
YulNaumenko
left a comment
LGTM
mikecote
left a comment
LGTM! Just one comment
pmuellr
left a comment
LGTM
@elasticmachine merge upstream
* master:
  document code splitting for client code (elastic#62593)
  Escape single quotes surrounded by double quotes (elastic#63229)
  [Endpoint] Update cli mapping to match endpoint package (elastic#63372)
  update in-app links to metricbeat configuration docs (elastic#63295)
  investigation notes field (documentation / metadata) (elastic#63386)
  [Maps] fix bug where toggling Scaling type does not re-fetch data (elastic#63326)
  [Alerting] set correct parameter for unauthented email action (elastic#63086)
  [Telemetry] force staging urls in tests (elastic#63356)
  Migrate legacy maps service to NP & update refs (elastic#60942)
  Fix task manager query to return tasks to retry (elastic#63360)
  [Endpoint] Policy list support for URL pagination state (elastic#63291)
  [Canvas] Migrate saved object mappings and migrations to Kibana Platform (elastic#58891)
  [DOCS] Add ILM tutorial (elastic#59502)
  [Maps] Add SOURCE_TYPES enumeration (elastic#62975)
  [Maps] update geospatial filters to use geo_shape query for geo_point fields (elastic#62966)
  Move away from npStart for embeddables in canvas (elastic#62680)
* master: [Event Log] Adds namespace into save objects (elastic#62974)
mikecote
left a comment
Code changes LGTM
pmuellr
left a comment
still LGTM - made a note about trying to get that SO helper function enhanced so we don't have to grep through the error message
import { SavedObjectsErrorHelpers } from '../../../../../src/core/server';

export function isAlertSavedObjectNotFoundError(err: Error, alertId: string) {
  return SavedObjectsErrorHelpers.isNotFoundError(err) && `${err}`.includes(alertId);
}
This feels yucky enough that I feel like we should ask for a better helper - like being able to pass alertId into that isNotFoundError() function, rather than text searching the error message ourselves. The alertId values are UUIDs today, which seems safe enough to test for, but who knows what tomorrow holds - action id's can now be user-specified via pre-configured actions ...
Yeah, that's fair, but as there seems to be pressure to get this backported to 7.7, I think we'll have to defer that.
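To illustrate the concern raised in this thread, here is a minimal sketch comparing a message-based check with a check on structured fields. `NotFoundError`, `isNotFoundByMessage`, and `isNotFoundFor` are hypothetical names invented for this example; they are not part of `SavedObjectsErrorHelpers` or any Kibana API, and the real error shape may differ.

```typescript
// Hypothetical stand-in for a saved-object "not found" error (not Kibana's real error type).
class NotFoundError extends Error {
  constructor(public readonly type: string, public readonly id: string) {
    super(`Saved object [${type}/${id}] not found`);
    this.name = 'NotFoundError';
    // Keeps `instanceof` working when compiled to older JavaScript targets.
    Object.setPrototypeOf(this, NotFoundError.prototype);
  }
}

// Message-based check, similar in spirit to the code under review:
// confirm the error kind, then text-search the stringified error for the id.
function isNotFoundByMessage(err: Error, id: string): boolean {
  return err instanceof NotFoundError && `${err}`.includes(id);
}

// Field-based check, in the spirit of the reviewer's suggestion:
// compare the object type and id directly instead of searching the message.
function isNotFoundFor(err: Error, type: string, id: string): boolean {
  return err instanceof NotFoundError && err.type === type && err.id === id;
}

// A "not found" error for a different object whose user-specified id contains 'alert-1'.
const otherObjectError = new NotFoundError('action', 'alert-1-backup');

// The message search matches because 'alert-1' is a substring of the message.
const byMessage = isNotFoundByMessage(otherObjectError, 'alert-1');

// The field comparison does not match: both type and id differ.
const byFields = isNotFoundFor(otherObjectError, 'alert', 'alert-1');
```

With UUID ids a substring collision like this is very unlikely, which is why the message search was judged acceptable for the backport; the field-based form only matters once ids can be user-specified.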
* master: (29 commits)
  Add test:jest_integration npm script (elastic#62938)
  [data.search.aggs] Remove service getters from agg types (AggConfig part) (elastic#62548)
  [Discover] Fix broken setting of bucketInterval (elastic#62939)
  Disable adding conditions when in alert management context. (elastic#63514)
  [Alerting] fixes to allow pre-configured actions to be executed (elastic#63432)
  adding useMemo (elastic#63504)
  [Maps] fix double fetch when filter pill is added (elastic#63024)
  [Lens] Fix missing formatting bug in "break down by" (elastic#63288)
  [SIEM] [Cases] Removed double pasted line (elastic#63507)
  [Reporting] Improve functional test steps (elastic#63259)
  [SIEM][CASE] Tests for server's configuration API (elastic#63099)
  [SIEM] [Cases] Case container unit tests (elastic#63376)
  [ML] Improving parsing of large uploaded files (elastic#62970)
  [ML] Listing global calendars on the job management page (elastic#63124)
  [Ingest][Endpoint] Add Ingest rest api response types for use in Endpoint (elastic#63373)
  Add help text to form fields (elastic#63165)
  [ML] Converts utils Mocha tests to Jest (elastic#63132)
  [Metrics UI] Refactor With* containers to hooks (elastic#59503)
  [NP] Migrate logstash server side code to NP (elastic#63135)
  Clicking cancel in saved query save modal doesn't close it (elastic#62774)
  ...
💚 Build Succeeded
… being deleted mid flight (elastic#63093) Detects if a task run failed due to the task SO being deleted mid flight and if so writes debug logs instead of warnings. Detects if an Alerting task run failed due to the alert SO being deleted mid flight of the task and if so ensures the task doesn't reschedule itself (as it usually would with other types of tasks). Ensures that the operation of deleting or disabling an Alert won't fail if it fails to delete an already deleted task (a task might preemptively self delete if its underlying alert object was deleted, even if the overall delete operation wasn't deleted).
We didn't make it to 7.7, so this will be backported for the 7.7.1 patch.
… being deleted mid flight (#63093) (#63564) Detects if a task run failed due to the task SO being deleted mid flight and if so writes debug logs instead of warnings. Detects if an Alerting task run failed due to the alert SO being deleted mid flight of the task and if so ensures the task doesn't reschedule itself (as it usually would with other types of tasks). Ensures that the operation of deleting or disabling an Alert won't fail if it fails to delete an already deleted task (a task might preemptively self delete if its underlying alert object was deleted, even if the overall delete operation wasn't deleted). Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>
… being deleted mid flight (#63093) Detects if a task run failed due to the task SO being deleted mid flight and if so writes debug logs instead of warnings. Detects if an Alerting task run failed due to the alert SO being deleted mid flight of the task and if so ensures the task doesn't reschedule itself (as it usually would with other types of tasks). Ensures that the operation of deleting or disabling an Alert won't fail if it fails to delete an already deleted task (a task might preemptively self delete if its underlying alert object was deleted, even if the overall delete operation wasn't deleted).
Looks like this PR has backport PRs but they still haven't been merged. Please merge them ASAP to keep the branches relatively in sync.
… being deleted mid flight (#63093) (#63569) Detects if a task run failed due to the task SO being deleted mid flight and if so writes debug logs instead of warnings. Detects if an Alerting task run failed due to the alert SO being deleted mid flight of the task and if so ensures the task doesn't reschedule itself (as it usually would with other types of tasks). Ensures that the operation of deleting or disabling an Alert won't fail if it fails to delete an already deleted task (a task might preemptively self delete if its underlying alert object was deleted, even if the overall delete operation wasn't deleted). Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>
… being deleted mid flight (elastic#63093) Detects if a task run failed due to the task SO being deleted mid flight and if so writes debug logs instead of warnings. Detects if an Alerting task run failed due to the alert SO being deleted mid flight of the task and if so ensures the task doesn't reschedule itself (as it usually would with other types of tasks). Ensures that the operation of deleting or disabling an Alert won't fail if it fails to delete an already deleted task (a task might preemptively self delete if its underlying alert object was deleted, even if the overall delete operation wasn't deleted). # Conflicts: # x-pack/plugins/alerting/server/task_runner/task_runner.test.ts # x-pack/plugins/alerting/server/task_runner/task_runner.ts
… being deleted mid flight (#63093) (#66570) Detects if a task run failed due to the task SO being deleted mid flight and if so writes debug logs instead of warnings. Detects if an Alerting task run failed due to the alert SO being deleted mid flight of the task and if so ensures the task doesn't reschedule itself (as it usually would with other types of tasks). Ensures that the operation of deleting or disabling an Alert won't fail if it fails to delete an already deleted task (a task might preemptively self delete if its underlying alert object was deleted, even if the overall delete operation wasn't deleted). # Conflicts: # x-pack/plugins/alerting/server/task_runner/task_runner.test.ts # x-pack/plugins/alerting/server/task_runner/task_runner.ts
Summary
Detects when a task run fails because the task saved object (SO) was deleted mid-flight and, if so, writes debug logs instead of warnings.
Detects when an Alerting task run fails because the alert SO was deleted mid-flight and, if so, ensures the task doesn't reschedule itself (as it usually would with other types of tasks).
Ensures that deleting or disabling an Alert won't fail when its task has already been deleted (a task might preemptively self-delete if its underlying alert object was deleted, even if the overall delete operation has not completed).
Closes #42477
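The behaviour described in the summary can be sketched roughly as follows. This is an illustrative model only, not the code from this PR: `NotFoundError`, `runAlertTask`, `deleteTaskIfItExists`, `loadAlert`, `removeTask`, and `retryDelayMs` are all hypothetical names, and the functions are synchronous for brevity, whereas a real task runner would be asynchronous.

```typescript
// Hypothetical error type standing in for a saved-object "not found" failure.
class NotFoundError extends Error {
  constructor(public readonly id: string) {
    super(`Saved object [${id}] not found`);
    // Keeps `instanceof` working when compiled to older JavaScript targets.
    Object.setPrototypeOf(this, NotFoundError.prototype);
  }
}

type LogLevel = 'debug' | 'warn';

interface RunResult {
  // When undefined, the task is not rescheduled.
  runAt?: Date;
  logs: Array<{ level: LogLevel; message: string }>;
}

// Runs one alert execution. `loadAlert` and `retryDelayMs` are illustrative parameters.
function runAlertTask(
  alertId: string,
  loadAlert: (id: string) => unknown,
  retryDelayMs = 60000
): RunResult {
  const logs: RunResult['logs'] = [];
  const nextRun = () => new Date(Date.now() + retryDelayMs);
  try {
    loadAlert(alertId);
    return { runAt: nextRun(), logs };
  } catch (err) {
    if (err instanceof NotFoundError && err.id === alertId) {
      // The alert was deleted mid-flight: log quietly and do not reschedule.
      logs.push({ level: 'debug', message: `alert ${alertId} no longer exists` });
      return { logs };
    }
    // Any other failure: warn and reschedule as usual.
    logs.push({ level: 'warn', message: String(err) });
    return { runAt: nextRun(), logs };
  }
}

// Deletes a task, treating "already deleted" as success and rethrowing anything else.
function deleteTaskIfItExists(taskId: string, removeTask: (id: string) => void): void {
  try {
    removeTask(taskId);
  } catch (err) {
    if (!(err instanceof NotFoundError)) {
      throw err;
    }
  }
}
```

Returning no `runAt` is just one way to model "do not reschedule" in this sketch; the point is that the not-found case is handled separately from ordinary failures, both when running a task and when deleting one.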
Checklist
Delete any items that are not applicable to this PR.
Any text added follows EUI's writing guidelines, uses sentence case text and includes i18n support
Documentation was added for features that require explanation or tutorials
This was checked for keyboard-only and screenreader accessibility
This renders correctly on smaller devices using a responsive layout. (You can test this in your browser
This was checked for cross-browser compatibility, including a check against IE11
For maintainers