Better instrumentation for `Worker.gather_dep`

[Task queuing](https://github.com/dask/distributed/issues/7213) has been shown to significantly improve performance by reducing [root task overproduction](https://github.com/dask/distributed/issues/5555).

In recent benchmarks and tests I noticed that one major source of root task overproduction is not necessarily that reducers are assigned to the workers too slowly, but that the workers are unable to run these tasks because they need to fetch dependencies first. If the average root task runtime is much smaller than the time it takes to fetch dependencies, workers can run many data producers before they get a chance to run a reducer. For example, with 100 ms root tasks and a 2 s fetch, a single thread can finish roughly 20 more root tasks before the reducer becomes runnable, assuming root tasks keep arriving.

Right now, we're almost blind to this situation, but we could expose much better metrics on the dashboard (or via Prometheus).

Specifically, I'm interested in

- How much time do tasks spend in the ready queue before they are worked on? Can we calculate averages on TaskGroups/Prefix? TaskGroups per Worker?
- How much time do tasks spend in any given state, e.g. `fetch`? In general, how long are wait times in our queues?
- How long do `gather_dep` requests typically last, broken down per TaskGroup/Prefix?
- How much of the time of a `gather_dep` request are we spending on
  - Connection establishment (e.g. connection pool empty, remote event loop is blocked, handshake takes a while)
  - Data (de-)serialization
  - Spill-to-disk
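
One possible way to collect the per-state wait times asked about above is to timestamp every state transition and aggregate the durations by prefix. The sketch below is standalone Python, not the `distributed` worker API: `StateDurationTracker`, `transition`, `mean_seconds`, and the simplified `prefix_of` are hypothetical names, and the real key-to-prefix logic in dask is more involved.

```python
import time
from collections import defaultdict


class StateDurationTracker:
    """Hypothetical helper: accumulates how long tasks spend in each state."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        # key -> (current state, timestamp at which that state was entered)
        self._current = {}
        # (prefix, state) -> [total seconds, number of completed visits]
        self._totals = defaultdict(lambda: [0.0, 0])

    @staticmethod
    def prefix_of(key):
        # Simplified stand-in for dask's key-to-prefix logic: "sum-1" -> "sum"
        return key.rsplit("-", 1)[0]

    def transition(self, key, new_state):
        """Record that `key` left its current state and entered `new_state`."""
        now = self._clock()
        previous = self._current.get(key)
        if previous is not None:
            state, entered = previous
            bucket = self._totals[self.prefix_of(key), state]
            bucket[0] += now - entered
            bucket[1] += 1
        self._current[key] = (new_state, now)

    def mean_seconds(self, prefix, state):
        """Average time tasks of `prefix` spent in `state` (0.0 if never seen)."""
        total, count = self._totals.get((prefix, state), (0.0, 0))
        return total / count if count else 0.0


# Usage with a fake clock so the numbers are deterministic.
ticks = iter([0.0, 2.0, 5.0, 10.0])
tracker = StateDurationTracker(clock=lambda: next(ticks))
tracker.transition("sum-1", "fetch")  # t=0
tracker.transition("sum-1", "ready")  # t=2, spent 2s in fetch
tracker.transition("sum-2", "fetch")  # t=5
tracker.transition("sum-2", "ready")  # t=10, spent 5s in fetch
print(tracker.mean_seconds("sum", "fetch"))  # 3.5
```

Keeping only a running total and a count per `(prefix, state)` keeps the memory footprint independent of the number of tasks; a histogram would be the natural extension if percentiles matter more than averages.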


Ideally, I would love to get data for a task X with dependencies `deps` that tells me:

```
X spent 10s in ready queue
-> 8s spent fetching data
  -> 1s connection
  -> 2s spill-to-disk
  -> 2s (de-)serialization
  -> 2s network transfer
  -> 1s idle / event loop busy
-> 2s spent waiting for open slot on the ThreadPool
```
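
A nested breakdown like the one above could be collected with nested timing spans. The following is only a sketch under stated assumptions: `Span`, `Breakdown`, `measure`, and `render` are hypothetical names, not existing `distributed` code, and the single shared stack assumes the measured sections run sequentially. Interleaved coroutines on an event loop would need something like `contextvars` to keep their spans apart.

```python
import time
from contextlib import contextmanager


class Span:
    """One timed section, possibly containing nested sections."""

    def __init__(self, label):
        self.label = label
        self.seconds = 0.0
        self.children = []


class Breakdown:
    """Hypothetical collector for a tree of timed sections."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._root = Span("root")  # hidden container for top-level spans
        self._stack = [self._root]

    @contextmanager
    def measure(self, label):
        # Attach the new span to whichever span is currently open.
        span = Span(label)
        self._stack[-1].children.append(span)
        self._stack.append(span)
        start = self._clock()
        try:
            yield span
        finally:
            span.seconds = self._clock() - start
            self._stack.pop()

    def render(self):
        """Return the tree as indented 'label: Ns' lines."""
        lines = []

        def walk(span, depth):
            lines.append(f"{'  ' * depth}{span.label}: {span.seconds:g}s")
            for child in span.children:
                walk(child, depth + 1)

        for top in self._root.children:
            walk(top, 0)
        return "\n".join(lines)


# Usage: wrap each phase in `measure`; nesting follows the `with` blocks.
breakdown = Breakdown()
with breakdown.measure("ready queue"):
    with breakdown.measure("fetching data"):
        with breakdown.measure("connection"):
            time.sleep(0.01)
        with breakdown.measure("network transfer"):
            time.sleep(0.01)
    with breakdown.measure("waiting for ThreadPool slot"):
        time.sleep(0.01)
print(breakdown.render())
```

Time not covered by any child span (the "idle / event loop busy" line in the example) falls out as the difference between a parent's duration and the sum of its children.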

Some of this information is already available; the rest we still need to collect. I don't think we have anything that can break it down this way or group it by TaskGroup or individual task.

I think this kind of visibility would help us significantly with making decisions about optimizations, e.g. should we prioritize STA? Should we focus on getting a sendfile implementation up and running? Do connection attempts take way too long because event loops are blocked?
