Skip to content

Conversation

@ymao1
Copy link
Contributor

@ymao1 ymao1 commented Oct 15, 2020

Resolves #79165

Summary

Uses updateByQuery to mark non-recurring tasks that have exceeded its maxAttempts as failed instead of leaving them in a running state.

Previously, query portion of the updateByQuery included a clause that searched for tasks with a schedule or tasks where attempts < maxAttempts. Moved this check from the query into the painless script. Tasks that don't meet this condition (non-recurring tasks that have exceeded its maxAttempts) are updated to failed while other tasks are claimed as normal.

Checklist

Delete any items that are not applicable to this PR.

@ymao1 ymao1 requested a review from gmmorris October 15, 2020 15:16
@ymao1 ymao1 changed the title Task manager/mark task as failed [Task Manager] Mark task as failed if maxAttempts has been met. Oct 21, 2020
@ymao1 ymao1 added Feature:Task Manager release_note:skip Skip the PR/issue when compiling release notes Team:ResponseOps Platform ResponseOps team (formerly the Cases and Alerting teams) t// v7.11.0 v8.0.0 labels Oct 21, 2020
@ymao1 ymao1 marked this pull request as ready for review October 21, 2020 11:58
@ymao1 ymao1 requested a review from a team as a code owner October 21, 2020 11:58
@elasticmachine
Copy link
Contributor

Pinging @elastic/kibana-alerting-services (Team:Alerting Services)

@ymao1 ymao1 self-assigned this Oct 21, 2020
@ymao1
Copy link
Contributor Author

ymao1 commented Oct 26, 2020

@elasticmachine merge upstream

@mikecote mikecote self-requested a review October 26, 2020 12:57
Copy link
Contributor

@mikecote mikecote left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changes LGTM! Tried it locally and I can see the tasks marked as failed whenever retryAt is passed and attempts exceeds the configuration 👍

Comment on lines -270 to -273
shouldBeOneOf<ExistsFilter | TermFilter | RangeFilter>(
TaskWithSchedule,
...tasksWithRemainingAttempts
)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

❤️ this is going to make the query soo much smaller

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:elasticheart:

Copy link
Contributor

@YulNaumenko YulNaumenko left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

createTaskRunner: () => ({ run: () => Promise.resolve() }),
},
});
const claimTasksById = undefined;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: I think it could be just array of strings like:
const claimTasksById: string[] = [];

@kibanamachine
Copy link
Contributor

💚 Build Succeeded

Metrics [docs]

✅ unchanged

History

To update your PR or re-run it, just comment with:
@elasticmachine merge upstream

Copy link
Contributor

@gmmorris gmmorris left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Whoop whoop, works as expected, good job 👍

@ymao1 ymao1 merged commit e6ab812 into elastic:master Oct 27, 2020
ymao1 added a commit to ymao1/kibana that referenced this pull request Oct 27, 2020
…tic#80681)

* wip

* Adding updateFieldsAndMarkAsFailed function

* Updating UBQ

* Only updating retryAt if marking as claiming

* Updating query

* Updating query to only fail one time tasks that have exceeded max attempts

* Fixing tests

* Fixing tests

* Handling claiming tasks by id

* Removing unused function

Co-authored-by: Kibana Machine <42973632+kibanamachine@users.noreply.github.com>
ymao1 added a commit that referenced this pull request Oct 27, 2020
…) (#81750)

* wip

* Adding updateFieldsAndMarkAsFailed function

* Updating UBQ

* Only updating retryAt if marking as claiming

* Updating query

* Updating query to only fail one time tasks that have exceeded max attempts

* Fixing tests

* Fixing tests

* Handling claiming tasks by id

* Removing unused function

Co-authored-by: Kibana Machine <42973632+kibanamachine@users.noreply.github.com>

Co-authored-by: Kibana Machine <42973632+kibanamachine@users.noreply.github.com>
gmmorris added a commit to gmmorris/kibana that referenced this pull request Oct 27, 2020
* master:
  [Security Solution][Endpoint][Admin] Malware Protections Notify User Version (elastic#81415)
  Revert "[Actions] Adding `hasAuth` to Webhook Configuration to avoid confusing UX (elastic#81390)"
  [Maps] Use default format when proxying EMS-files (elastic#79760)
  [Discover] Deangularize context.html (elastic#81442)
  Use the displayName property in default editor (elastic#73311)
  adds bug label to Bug report for Security Solution template (elastic#81643)
  [ML] Transforms: Remove index field limitation for custom query. (elastic#81467)
  [Actions] Adding `hasAuth` to Webhook Configuration to avoid confusing UX (elastic#81390)
  [Task Manager] Mark task as failed if maxAttempts has been met. (elastic#80681)
  [Lens] Fix URL query loss on redirect (elastic#81475)
  Log reason for 404 in field existence check (elastic#81315)
gmmorris added a commit to gmmorris/kibana that referenced this pull request Oct 27, 2020
* master: (87 commits)
  [Actions] Adding `hasAuth` to Webhook Configuration to avoid confusing UX (elastic#81778)
  [i18n] add get_kibana_translation_paths tests (elastic#81624)
  [UX] Fix search term reset from url (elastic#81654)
  [Lens] Improved range formatter (elastic#80132)
  [Resolver] `SideEffectContext` changes, remove `@testing-library` uses (elastic#81077)
  [Time to Visualize] Update Library Text with Call to Action (elastic#81667)
  [docs] Resolving failed Kibana upgrade migrations (elastic#80999)
  [ftr/menuToggle] provide helper for enhanced menu toggle handling (elastic#81709)
  [Task Manager] adds basic observability into Task Manager's runtime operations (elastic#77868)
  Doc changes for stack management and grouped feature privileges (elastic#80486)
  Added functional test for alerts list filters by status, alert type and action type. Did a code refactoring and splitting for alerts tests. (elastic#81422)
  [Security Solution][Endpoint][Admin] Malware Protections Notify User Version (elastic#81415)
  Revert "[Actions] Adding `hasAuth` to Webhook Configuration to avoid confusing UX (elastic#81390)"
  [Maps] Use default format when proxying EMS-files (elastic#79760)
  [Discover] Deangularize context.html (elastic#81442)
  Use the displayName property in default editor (elastic#73311)
  adds bug label to Bug report for Security Solution template (elastic#81643)
  [ML] Transforms: Remove index field limitation for custom query. (elastic#81467)
  [Actions] Adding `hasAuth` to Webhook Configuration to avoid confusing UX (elastic#81390)
  [Task Manager] Mark task as failed if maxAttempts has been met. (elastic#80681)
  ...
@mikecote mikecote added release_note:fix and removed release_note:skip Skip the PR/issue when compiling release notes labels Dec 16, 2020
@ymao1 ymao1 deleted the task-manager/mark-task-as-failed branch February 4, 2021 15:25
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Feature:Task Manager release_note:fix Team:ResponseOps Platform ResponseOps team (formerly the Cases and Alerting teams) t// v7.11.0 v8.0.0

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Task remains in "running" state when last attempt timed out

6 participants