Add custom path-specific metrics proposal #12107
Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA). View this failed invocation of the CLA check for more information. For the most up to date status, view the checks section at the bottom of the pull request.
This proposal is trying to do a lot of things at once imo. Some thoughts:
I believe this use-case is better suited for runtime monitoring tools (see https://gvisor.dev/blog/2022/08/01/threat-detection/). I am also skeptical about introducing a configuration yaml file via
If the goal is to debug various filesystem types implemented in gVisor, I think pprof should be sufficient since it will be able to break down CPU time spent in different code paths, for example.
+1 that the use-case of monitoring specific files or directories for access should be part of the gVisor runtime monitoring system, not the metrics system. If there is something to expand in the metrics system, perhaps that could be a
Thank you for the thoughtful feedback! You've raised excellent points that help refine this proposal. Let me address each concern:
TLDR:
The runtime monitoring system is currently designed as a real-time monitoring system mostly for threat detection and autonomous response to events happening within the sandbox, so one of its design goals is to be real-time, and this is also why it has high CPU usage. So rate-limiting would go against its current implementation. It may be possible to make it configurable to not act this way, i.e. to have it send event information in batches instead. That would increase its CPU efficiency, at the cost of losing real-time-ness. For your purposes of monitoring path-specific metrics, that seems like a worthwhile tradeoff.
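The tradeoff described above (batched delivery in exchange for lower per-event overhead) can be sketched roughly as follows. This is only an illustration: `Event`, `Sink`, and `Batcher` are hypothetical names and are not part of gVisor's runtime monitoring code. The sketch is not safe for concurrent use and has no time-based flush, both of which a real implementation would likely need.

```go
package main

import "fmt"

// Event is a hypothetical stand-in for a runtime monitoring event.
type Event struct {
	Syscall string
	Path    string
}

// Sink receives one or more events per delivery.
type Sink func(batch []Event)

// Batcher buffers events and forwards them to the sink in groups.
// A batch size of 1 (or less) degenerates to per-event delivery,
// which approximates real-time behavior.
type Batcher struct {
	batchSize int
	buf       []Event
	sink      Sink
}

// NewBatcher returns a Batcher that flushes every batchSize events.
func NewBatcher(batchSize int, sink Sink) *Batcher {
	if batchSize < 1 {
		batchSize = 1
	}
	return &Batcher{batchSize: batchSize, sink: sink}
}

// Emit queues an event and flushes once the buffer is full.
func (b *Batcher) Emit(e Event) {
	b.buf = append(b.buf, e)
	if len(b.buf) >= b.batchSize {
		b.Flush()
	}
}

// Flush delivers any buffered events immediately.
func (b *Batcher) Flush() {
	if len(b.buf) == 0 {
		return
	}
	batch := b.buf
	b.buf = nil
	b.sink(batch)
}

func main() {
	deliveries := 0
	b := NewBatcher(3, func(batch []Event) {
		deliveries++
		fmt.Printf("delivery %d: %d event(s)\n", deliveries, len(batch))
	})
	for i := 0; i < 7; i++ {
		b.Emit(Event{Syscall: "read", Path: "/data/file"})
	}
	b.Flush() // deliver the final partial batch
}
```

With a batch size of 3, seven events reach the sink in three deliveries instead of seven, which is where the CPU savings would come from. The cost is that an event can sit in the buffer until the batch fills or `Flush` is called.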
No silver bullet. In the gVisor metrics system (not runtime monitoring), there are already some metrics like:
gvisor/pkg/sentry/vfs/file_description.go, line 642 in d6ba994
This counts the number of reads across all file descriptors. But it is not filesystem-implementation-specific, so it can't determine the
Lines 225 to 232 in d6ba994
But again, for your use-case you likely want to add instrumentation in the runtime monitoring subsystem, not in metrics. The metrics system is high-performance but very limited; for example, string fields must pre-declare all of their possible values, so it would be impossible to use it for path-specific field values as those would only be known at runtime.
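To make the pre-declared-values constraint concrete, here is a minimal sketch of a counter whose string field only accepts values fixed at construction time. `FieldCounter` and its methods are hypothetical and do not mirror gVisor's actual metrics API. They only illustrate why a value first seen at runtime, such as a file path, has nowhere to go.

```go
package main

import "fmt"

// FieldCounter is a hypothetical counter with a single string field
// whose allowed values are fixed when the counter is created.
// It is an illustration only, not gVisor's metrics API.
type FieldCounter struct {
	index  map[string]int // field value -> slot in counts
	counts []uint64       // one pre-allocated slot per declared value
}

// NewFieldCounter pre-allocates one slot for each allowed field value.
func NewFieldCounter(allowed ...string) *FieldCounter {
	c := &FieldCounter{
		index:  make(map[string]int, len(allowed)),
		counts: make([]uint64, len(allowed)),
	}
	for i, v := range allowed {
		c.index[v] = i
	}
	return c
}

// Increment bumps the counter for value, or returns an error if the
// value was not declared up front.
func (c *FieldCounter) Increment(value string) error {
	i, ok := c.index[value]
	if !ok {
		return fmt.Errorf("field value %q was not pre-declared", value)
	}
	c.counts[i]++
	return nil
}

// Value returns the current count for a declared value, or 0 otherwise.
func (c *FieldCounter) Value(value string) uint64 {
	if i, ok := c.index[value]; ok {
		return c.counts[i]
	}
	return 0
}

func main() {
	ops := NewFieldCounter("read", "write")
	_ = ops.Increment("read")
	fmt.Println("reads:", ops.Value("read"))

	// A path discovered at runtime has no pre-allocated slot.
	if err := ops.Increment("/var/log/app.log"); err != nil {
		fmt.Println("error:", err)
	}
}
```

Pre-allocating a fixed set of slots keeps the hot path to a lookup and an increment, but it also means the set of field values cannot grow after startup.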
Hi @EtiennePerot, how do you feel about this solution:
That sounds good to me, so long as this is configurable and the default behavior is still the current real-time behavior (no batching), so that existing users of runtime monitoring who rely on its real-time-ness keep that behavior. cc @fvoznika for confirmation.
Custom Path-Specific Metrics for gVisor
Summary
This proposal introduces path-specific metrics to gVisor's existing metrics system, enabling fine-grained monitoring of filesystem access patterns for specific directories or mount points.
Motivation
Current gVisor metrics provide good general visibility into filesystem operations, but lack the granularity needed to understand application behavior at the path level. This limitation makes it difficult to:
Proposed Solutions
The proposal outlines two complementary approaches:
Both solutions use a --path-metrics-config flag with YAML configuration files, making them easy to deploy and maintain while providing the observability needed for production environments.
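As a rough illustration of what path-level accounting for specific directories or mount points could look like, the sketch below attributes each read to the longest monitored prefix that contains the accessed path. The configuration schema is not shown in this description, so the monitored prefixes are hard-coded here rather than loaded from YAML. `PathCounters` and its methods are hypothetical names, and paths are assumed to be absolute.

```go
package main

import (
	"fmt"
	"path"
	"strings"
)

// PathCounters is a hypothetical per-prefix read counter.
// It is an illustration only and not part of gVisor.
type PathCounters struct {
	prefixes []string
	reads    map[string]uint64
}

// NewPathCounters registers the directory prefixes to monitor.
func NewPathCounters(prefixes ...string) *PathCounters {
	pc := &PathCounters{reads: make(map[string]uint64)}
	for _, p := range prefixes {
		pc.prefixes = append(pc.prefixes, path.Clean(p))
	}
	return pc
}

// match returns the longest monitored prefix that contains p.
// The "/" boundary check keeps "/data" from matching "/database".
func (pc *PathCounters) match(p string) (string, bool) {
	p = path.Clean(p)
	best, found := "", false
	for _, prefix := range pc.prefixes {
		if p == prefix || prefix == "/" || strings.HasPrefix(p, prefix+"/") {
			if !found || len(prefix) > len(best) {
				best, found = prefix, true
			}
		}
	}
	return best, found
}

// RecordRead counts a read against the matching prefix, if any,
// and reports whether the path was monitored.
func (pc *PathCounters) RecordRead(p string) bool {
	prefix, ok := pc.match(p)
	if ok {
		pc.reads[prefix]++
	}
	return ok
}

// Reads returns the number of reads attributed to a monitored prefix.
func (pc *PathCounters) Reads(prefix string) uint64 {
	return pc.reads[path.Clean(prefix)]
}

func main() {
	pc := NewPathCounters("/data", "/data/cache")
	pc.RecordRead("/data/cache/blob") // attributed to /data/cache
	pc.RecordRead("/data/report.csv") // attributed to /data
	pc.RecordRead("/database/index")  // not monitored

	fmt.Println("/data reads:", pc.Reads("/data"))
	fmt.Println("/data/cache reads:", pc.Reads("/data/cache"))
}
```

Keying counters by configured prefix rather than by full path bounds the number of counters by the size of the configuration, not by the number of files an application touches.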
Benefits
Implementation Approach
The proposal is designed to be:
This enhancement would improve gVisor's observability capabilities while maintaining its security and performance characteristics.
Note:
This is a proposal document for discussion. I'm interested in feedback on the approach and would be happy to collaborate on the implementation if there's interest from the maintainers.