-
Notifications
You must be signed in to change notification settings - Fork 11.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add a scoped utility to monitor cumulative duration and iterations #6501
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like the idea!
crates/mysten-metrics/src/lib.rs
Outdated
|
||
#[macro_export] | ||
macro_rules! monitored_scope { | ||
($name: expr) => {{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we have a 'name', perhaps we don't even need location at this point?
Locations change as code evolve, so they are not going to be used in dashboards or anything.
Even for monitored_futures, etc, I would prefer to just go with explicit name instead of deriving location from file/line
And cherry on top is that if we don't need location then this entire thing can turn into regular function
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point. I removed the location label for monitored_scope, and converted the macro into a function.
502b3f1
to
86c9a26
Compare
…6501) This is a generalization of TaskUtilizationExt that we use in a few places. There are a few select loops and sections guarded by mutex that may become the bottleneck of the system. It is useful to monitor how "loaded" they are, and if increased load comes from increased number of calls. In future we can also record the number of times where a call is particularly long (> 1s or a user specified threshold).
This is a generalization of
TaskUtilizationExt
that we use in a few places. There are a few select loops and sections guarded by mutex that may become the bottleneck of the system. It is useful to monitor how "loaded" they are, and if increased load comes from increased number of calls. In future we can also record the number of times where a call is particularly long (> 1s or a user specified threshold).@andll if you would like to, I can update the other callsites of
TaskUtilizationExt
and update the dashboards. Or I can revert the refactor in handle_consensus_transaction() etc.As @mystenmark suggested this can also be implemented via trace spans. But the API is more involved, and I'm a bit worried about performance regression in tracing hurting hot loop performance. We can revisit if needed.