Fine performance metrics: Prometheus and Grafana #7680
Comments
It would be helpful for this issue to have, ahead of time, the current schema of the data we are storing, essentially what we are storing in #7666. We are already documenting something like this for existing metrics; see https://distributed.dask.org/en/latest/prometheus.html What is missing on the documentation page is information about
Right now, it's not entirely certain whether we want to expose this as a timeseries, so there is also room to simply attach this information to a
@hendrikmakait @crusaderky I believe there was a meeting discussing this topic. Can either of you summarize the outcome here? OK, I just realized there is a summary here: #7665 (comment). Does this mean we will not put those metrics in Prometheus for now? If so, I suggest closing this issue.
I'd be fine with closing this ticket as not planned for now and reopening it at a later point in time. Providing E2E aggregates should be the priority for now.
Agreed
Export the metrics collected on the scheduler by #7666 to Prometheus.
For the sake of keeping everything additive, execute, gather-dep, get-data, and async spilling metrics should be kept separate.
Coiled-specific deliverable: implement new Grafana plots that use the above metrics.
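To make the "kept separate" requirement concrete, here is a minimal sketch of how such an export could look, assuming the scheduler holds cumulative totals keyed by (context, activity, unit). That schema, the sample values, the `spill` context name, the `activity` label, and the `dask_fine_` metric prefix are all made up for illustration and are not taken from #7666. The sketch uses only the standard library and renders the Prometheus text exposition format by hand, with one metric family per context and unit so that summing over the label inside a family never mixes contexts.

```python
from collections import defaultdict

# Hypothetical cumulative totals, keyed by (context, activity, unit).
# The real schema stored by #7666 may look different.
SAMPLE = {
    ("execute", "thread-cpu", "seconds"): 12.5,
    ("execute", "disk-read", "seconds"): 0.75,
    ("gather-dep", "network", "seconds"): 3.0,
    ("get-data", "serialize", "seconds"): 0.25,
    ("spill", "disk-write", "seconds"): 1.5,
}


def render(metrics):
    """Render totals as Prometheus text, one metric family per (context, unit)."""
    families = defaultdict(list)
    for (context, activity, unit), value in sorted(metrics.items()):
        # Metric names may not contain hyphens, so map them to underscores.
        # The "dask_fine_" prefix is an assumption made for this sketch.
        name = f"dask_fine_{context.replace('-', '_')}_{unit}_total"
        families[name].append((activity, value))

    lines = []
    for name, samples in families.items():
        lines.append(f"# TYPE {name} counter")
        for activity, value in samples:
            # Assumes label values contain no quotes, backslashes, or newlines.
            lines.append(f'{name}{{activity="{activity}"}} {value}')
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render(SAMPLE), end="")
```

With this layout, a Grafana panel could sum over `activity` within a single family, for example the execute one, without pulling in gather-dep, get-data, or spilling time.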