@gmmorris gmmorris commented Apr 9, 2020

Summary

Detects when a task run fails because the task's saved object (SO) was deleted mid-flight and, if so, writes debug logs instead of warnings.

Detects when an Alerting task run fails because the alert SO was deleted while the task was in flight and, if so, ensures the task doesn't reschedule itself (as it usually would with other types of tasks).

Ensures that deleting or disabling an Alert won't fail just because its task has already been deleted (a task might preemptively delete itself if its underlying alert object was deleted, even if the overall delete operation hasn't completed).
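
As an illustration of the last point, here is a minimal sketch of a delete that tolerates an already-deleted task. It is not the code in this PR: `TaskStore`, `StatusError`, `InMemoryTaskStore`, and `deleteTaskIfItExists` are hypothetical names used only for the example, and the store is kept synchronous for brevity where real store calls would be asynchronous.

```typescript
// Hypothetical error shape: a failed store call carries an HTTP-like status code.
interface StatusError extends Error {
  statusCode: number;
}

// Hypothetical stand-in for the task store; real store calls would be asynchronous.
interface TaskStore {
  remove(taskId: string): void;
}

function isNotFound(err: unknown): boolean {
  return err instanceof Error && (err as StatusError).statusCode === 404;
}

// Removes the task and reports whether anything was actually deleted.
// A "not found" failure is tolerated, because the task may already have
// deleted itself; every other failure is rethrown to the caller.
function deleteTaskIfItExists(store: TaskStore, taskId: string): boolean {
  try {
    store.remove(taskId);
    return true;
  } catch (err) {
    if (isNotFound(err)) {
      return false;
    }
    throw err;
  }
}

// In-memory store used only to exercise the sketch.
class InMemoryTaskStore implements TaskStore {
  private readonly tasks: Set<string>;

  constructor(taskIds: string[]) {
    this.tasks = new Set(taskIds);
  }

  remove(taskId: string): void {
    if (!this.tasks.delete(taskId)) {
      const err: StatusError = Object.assign(new Error(`task ${taskId} not found`), {
        statusCode: 404,
      });
      throw err;
    }
  }
}
```

With this shape, the caller that deletes or disables an Alert can proceed whether or not the task still exists, while unrelated failures (for example, a lost connection) still surface.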

Closes #42477

@gmmorris gmmorris added Feature:Alerting v8.0.0 v7.7.0 Team:ResponseOps Platform ResponseOps team (formerly the Cases and Alerting teams) t// v7.8.0 labels Apr 9, 2020
@gmmorris gmmorris requested a review from a team as a code owner April 9, 2020 09:43
@elasticmachine
Pinging @elastic/kibana-alerting-services (Team:Alerting Services)

@gmmorris gmmorris added the release_note:skip Skip the PR/issue when compiling release notes label Apr 9, 2020
@YulNaumenko YulNaumenko left a comment

LGTM

@mikecote mikecote left a comment

LGTM! Just one comment

@pmuellr pmuellr left a comment

LGTM

@mikecote
@elasticmachine merge upstream

elasticmachine and others added 6 commits April 13, 2020 15:05
* master:
  document code splitting for client code (elastic#62593)
  Escape single quotes surrounded by double quotes (elastic#63229)
  [Endpoint] Update cli mapping to match endpoint package (elastic#63372)
  update in-app links to metricbeat configuration docs (elastic#63295)
  investigation notes field (documentation / metadata) (elastic#63386)
  [Maps] fix bug where toggling Scaling type does not re-fetch data (elastic#63326)
  [Alerting] set correct parameter for unauthented email action (elastic#63086)
  [Telemetry] force staging urls in tests (elastic#63356)
  Migrate legacy maps service to NP & update refs (elastic#60942)
  Fix task manager query to return tasks to retry (elastic#63360)
  [Endpoint] Policy list support for URL pagination state (elastic#63291)
  [Canvas] Migrate saved object mappings and migrations to Kibana Platform (elastic#58891)
  [DOCS] Add ILM tutorial (elastic#59502)
  [Maps] Add SOURCE_TYPES enumeration (elastic#62975)
  [Maps] update geospatial filters to use geo_shape query for geo_point fields (elastic#62966)
  Move away from npStart for embeddables in canvas (elastic#62680)
* master:
  [Event Log] Adds namespace into save objects (elastic#62974)
@gmmorris gmmorris changed the title [Alerting] Avoid logging a warning when an in flight Alert task is deleted [Alerting] Handle when an Alerting Task fails due to its Alert object being deleted mid flight Apr 14, 2020
@mikecote mikecote left a comment

Code changes LGTM

@pmuellr pmuellr left a comment

still LGTM - made a note about trying to get that SO helper function enhanced so we don't have to grep through the error message

import { SavedObjectsErrorHelpers } from '../../../../../src/core/server';

// True when `err` is a saved-object "not found" error whose message mentions `alertId`.
export function isAlertSavedObjectNotFoundError(err: Error, alertId: string) {
  return SavedObjectsErrorHelpers.isNotFoundError(err) && `${err}`.includes(alertId);
}

@pmuellr

This feels yucky enough that I feel like we should ask for a better helper - like being able to pass alertId into that isNotFoundError() function, rather than text searching the error message ourselves. The alertId values are UUIDs today, which seems safe enough to test for, but who knows what tomorrow holds - action IDs can now be user-specified via pre-configured actions ...

@gmmorris

Yeah, that's fair, but as there seems to be pressure to get this backported to 7.7, I think we'll have to defer that.
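
To make the concern in this thread concrete, here is a minimal sketch of what a structured check could look like if the error carried the object's type and id as fields. This is an assumption for illustration only and not the `SavedObjectsErrorHelpers` API: `SavedObjectNotFound`, `createNotFoundError`, `isNotFoundErrorFor`, and `messageMentionsId` are all hypothetical names.

```typescript
// Hypothetical structured error: NOT the real SavedObjectsErrorHelpers API.
interface SavedObjectNotFound extends Error {
  statusCode: 404;
  objectType: string;
  objectId: string;
}

// Builds a "not found" error that records which object was missing.
function createNotFoundError(objectType: string, objectId: string): SavedObjectNotFound {
  const base = new Error(`Saved object [${objectType}/${objectId}] not found`);
  return Object.assign(base, { statusCode: 404 as const, objectType, objectId });
}

// Structured check: compares fields instead of searching the message text.
function isNotFoundErrorFor(err: unknown, objectType: string, objectId: string): boolean {
  if (!(err instanceof Error)) {
    return false;
  }
  const candidate = err as Partial<SavedObjectNotFound>;
  return (
    candidate.statusCode === 404 &&
    candidate.objectType === objectType &&
    candidate.objectId === objectId
  );
}

// Text-search check, in the style discussed in the review.
function messageMentionsId(err: Error, id: string): boolean {
  return `${err}`.includes(id);
}
```

The difference shows up when one id is a prefix of another, which becomes possible once ids can be user-specified: for an error about `alert/abc-10`, `messageMentionsId(err, 'abc-1')` matches by substring, whereas `isNotFoundErrorFor(err, 'alert', 'abc-1')` does not.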

* master: (29 commits)
  Add test:jest_integration npm script (elastic#62938)
  [data.search.aggs] Remove service getters from agg types (AggConfig part) (elastic#62548)
  [Discover] Fix broken setting of bucketInterval (elastic#62939)
  Disable adding conditions when in alert management context. (elastic#63514)
  [Alerting] fixes to allow pre-configured actions to be executed (elastic#63432)
  adding useMemo (elastic#63504)
  [Maps] fix double fetch when filter pill is added (elastic#63024)
  [Lens] Fix missing formatting bug in "break down by" (elastic#63288)
  [SIEM] [Cases] Removed double pasted line (elastic#63507)
  [Reporting] Improve functional test steps (elastic#63259)
  [SIEM][CASE] Tests for server's configuration API (elastic#63099)
  [SIEM] [Cases] Case container unit tests (elastic#63376)
  [ML] Improving parsing of large uploaded files (elastic#62970)
  [ML] Listing global calendars on the job management page (elastic#63124)
  [Ingest][Endpoint] Add Ingest rest api response types for use in Endpoint (elastic#63373)
  Add help text to form fields (elastic#63165)
  [ML] Converts utils Mocha tests to Jest (elastic#63132)
  [Metrics UI] Refactor With* containers to hooks (elastic#59503)
  [NP] Migrate logstash server side code to NP (elastic#63135)
  Clicking cancel in saved query save modal doesn't close it (elastic#62774)
  ...
@kibanamachine
💚 Build Succeeded

@gmmorris gmmorris merged commit d1134c5 into elastic:master Apr 15, 2020
gmmorris added a commit to gmmorris/kibana that referenced this pull request Apr 15, 2020
… being deleted mid flight (elastic#63093)

gmmorris added a commit to gmmorris/kibana that referenced this pull request Apr 15, 2020
… being deleted mid flight (elastic#63093)

@gmmorris gmmorris added v7.7.1 and removed v7.7.0 labels Apr 15, 2020
@gmmorris
We didn't make it to 7.7, so this will be backported for the 7.7.1 patch.
The workaround for 7.7 would be to manually delete these tasks, so hopefully that'll be good enough for the duration.

gmmorris added a commit that referenced this pull request Apr 15, 2020
… being deleted mid flight (#63093) (#63564)

Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>
wayneseymour pushed a commit that referenced this pull request Apr 15, 2020
… being deleted mid flight (#63093)

@kibanamachine kibanamachine added the backport missing Added to PRs automatically when they are determined to be missing a backport. label Apr 16, 2020
@kibanamachine
Looks like this PR has backport PRs but they still haven't been merged. Please merge them ASAP to keep the branches relatively in sync.

@mikecote mikecote added release_note:fix and removed release_note:skip Skip the PR/issue when compiling release notes labels Apr 16, 2020
gmmorris added a commit that referenced this pull request Apr 17, 2020
… being deleted mid flight (#63093) (#63569)

Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>
@kibanamachine kibanamachine removed the backport missing Added to PRs automatically when they are determined to be missing a backport. label Apr 17, 2020
gmmorris added a commit that referenced this pull request Apr 17, 2020
…t object being deleted mid flight (#63093) (#63569)"

This reverts commit 3cdf259.
gmmorris added a commit that referenced this pull request Apr 17, 2020
…t object being deleted mid flight (#63093) (#63569)" (#63820)

This reverts commit 3cdf259.
gmmorris added a commit to gmmorris/kibana that referenced this pull request May 14, 2020
… being deleted mid flight (elastic#63093)

# Conflicts:
#	x-pack/plugins/alerting/server/task_runner/task_runner.test.ts
#	x-pack/plugins/alerting/server/task_runner/task_runner.ts
gmmorris added a commit that referenced this pull request May 15, 2020
… being deleted mid flight (#63093) (#66570)

# Conflicts:
#	x-pack/plugins/alerting/server/task_runner/task_runner.test.ts
#	x-pack/plugins/alerting/server/task_runner/task_runner.ts
Labels

Feature:Alerting release_note:fix Team:ResponseOps Platform ResponseOps team (formerly the Cases and Alerting teams) t// v7.7.1 v7.8.0 v8.0.0

Development

Successfully merging this pull request may close these issues.

extraneous error/warning messages when deleting alerts

6 participants