Fine performance metrics: Meter task re-execution after losing a worker

- Part of #7665
- Complements #7678 

When a worker dies, the tasks it held in `memory` state are lost; they transition back to `released` on the scheduler and are recomputed on another worker.

We would like to know how much time is spent recomputing tasks after a worker dies. This could prompt the user, for example, to call `replicate()` on important data.

Add a boolean flag to the Compute message, stating that the task was in memory at some point in the past and is now being recomputed.
When such a task finishes successfully on the worker, instead of logging its granular metrics we will log a lump sum under the `("execute", <prefix>, "recompute", "seconds")` label. This mirrors the treatment of failed tasks, for which we log a lump sum under the `("execute", <prefix>, "failed", "seconds")` label, as introduced in #7586.
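
A minimal sketch of the proposed labelling, to make the intent concrete. This is not the actual worker code: `record_execute_metrics`, the `digests` mapping, the `recompute` and `failed` keyword names, and the granular activity names are all assumptions made for illustration, as is the choice to let `failed` take precedence over `recompute`.

```python
from collections import defaultdict


def record_execute_metrics(digests, prefix, granular, *, recompute=False, failed=False):
    """Accumulate the metrics of one task execution into ``digests``.

    ``granular`` maps hypothetical (activity, unit) pairs to values.
    ``recompute`` stands in for the boolean flag proposed for the Compute message.
    """
    if failed or recompute:
        # Collapse everything measured in seconds into a single lump sum.
        # Assumption: a task that fails while being recomputed counts as "failed".
        lump_label = "failed" if failed else "recompute"
        total = sum(v for (_, unit), v in granular.items() if unit == "seconds")
        digests[("execute", prefix, lump_label, "seconds")] += total
    else:
        # Normal path: keep the granular breakdown.
        for (activity, unit), value in granular.items():
            digests[("execute", prefix, activity, unit)] += value


digests = defaultdict(float)
granular = {("thread-cpu", "seconds"): 1.5, ("thread-noncpu", "seconds"): 0.5}

# First execution: granular metrics are kept.
record_execute_metrics(digests, "inc", granular)
# Re-execution after the worker holding the result died: one lump sum.
record_execute_metrics(digests, "inc", granular, recompute=True)
```

With this shape, the time lost to recomputation can be read for each task prefix from the single `"recompute"` entry, without it being mixed into the granular metrics of first-time executions.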

Fine performance metrics: Meter task re-execution after losing a worker #7676