-
Notifications
You must be signed in to change notification settings - Fork 170
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: add container resource detector. #340
feat: add container resource detector. #340
Conversation
resource_detectors/lib/opentelemetry/resource/detectors/container.rb
Outdated
Show resolved
Hide resolved
…support for cgroup v1.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This mostly LGTM. I just had a question about cgroup v2 this time. I'm not requesting changes, but I'd like to clarify that the strategy we're using is at least as reliable as others I've seen.
parts.shift | ||
|
||
parts.each do |part| | ||
return part if part.match?(CONTAINER_REGEX) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Other implementations that I have seen specifically look for a line containing "hostname"
to extract the container id from. For example:
However, looking at your test, and inspecting /proc/self/mountinfo
on some containers running locally, it looks like this should work just fine. Do you know of any other implementations that use this strategy? I'm curious if they know something we don't, or if there are multiple solutions that work.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for your re-review @mwear.
My guess would be that filtering on "hostname"
is done to limit the amount of regexmatching?
I initially included the filter on "hostname"
as well; but decided to remove that; since it cannot be used reliably. e.g. when a user set the hostname
themselves in kubernetes or swarm; we would not be able to determine the container.id
(although I think setting the "hostname"
will probably be an edge case).
If we decide that we don't want to filter on "hostname"
before regex matching; we might also remove container_id_v1
; since def container_id_v2
will also work for cgroupv1
- although I still want to benchmark that before doing so - to keep the code a bit simpler. WDYT?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't have a strong preference. I am most interested in the detector reliably and consistently detecting the container id. As far as performance goes, detecting the container id will be something that happens once at startup. While we don't want to give performance away unnecessarily, this is not a hot code path, so favoring simplicity is reasonable thing to do here. Let me know when you're happy with everything and I'll review it again.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've simplified the code a bit; according to me this should do the trick.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great! Thanks @scbjans.
@robertlaurin could we get an additional set of eyes on this when you get a chance? |
@mwear @robertlaurin from the SIG I understood that the plan was to split the resource detectors gem into separate gems. That totally makes sense to me; and I'll rework this PR a bit to reflect that. Once that's done and approved, I'm happy to also split of the other two into separate gems. One point on which I would like advice or some directions is the addition of other container resource attributes (see PR description, and - currently outdated - image). It'll be helpful to add the other container resource attributes as well; but I do not want to promote running the container as root. Do you perhaps have another idea on how I could add them? |
# @return [String] container.id | ||
# May be nil. | ||
def container_id | ||
[CGROUP_V2_PATH, CGROUP_V1_PATH].each do |cgroup| |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would emitting some kind of warning or error message be helpful to folks trying to debug why this resource detector is unable to read the cgroup
files?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Originally my idea was that we do not want to flood the logs if someone uses this gems Autodetector.detect
, but since we're going to split the resource detectors up into separate gems, that reasoning is no longer valid.
I'll add some logs once I get to splitting this up into it's own gem.
# If no container attributes could be determined an empty resource is returned | ||
# | ||
# @return [OpenTelemetry::SDK::Resources::Resource] | ||
def detect |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not entirely certain what the spec says about error handling in Resource detectors but I recommend you consider adding error handling with diagnostic information that will clue folks into why this resource detector may be failing and avoid raising errors that would cause the application to crash at boot time.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will do so once I split this of into it's own separate gem (see comment above). However, I do believe that currently this resource detector will not crash if it cannot determine the container.id
?
next unless File.readable?(cgroup) | ||
|
||
File.readlines(cgroup, chomp: true).each do |line| | ||
parts = line.split('/') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there some nuance I am missing here. Does scanning the full string result in false positives?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
scanning the full string works for cgroupv1; but I couldn't get it to work for cgroupv2. This way it's compatible with both
The |
Updated PR to only filter out lines including an actual container id. Previously I ran into problems because storage ids also matched. As a bonus I've included a test so that it should also work for Podman (cgroupfile copied from the java instrumentation). |
Regarding adding the other container resource attributes: I've learned that I can use the otelcols k8sattributesprocessor to fix that. |
@robertlaurin is this how you envisioned splitting up the resource detectors into it's own seperate gem? If so, I'm happy to do it for the other two resource detectors (Azure and GCP) as well. |
@mwear @ericmustin anything left here to do before merging? |
@arielvalentin lgtm |
@ericmustin would you mind merging and releasing? |
👋 This pull request has been marked as stale because it has been open with no activity. You can: comment on the issue or remove the stale label to hold stale off for a while, add the |
@mwear @ericmustin @arielvalentin anything I can do to get this merged? |
CI is broken; but there's already a PR which fixes that: #581 |
This PR adds a resource detector for Containers.
It will add the container ID if possible.
For containers running on a Docker runtime, it will also try to add* container.image.name* container.image.tag* container.name* container.runtimebut that depends on the availability of a readable Docker Socket.I've decided to remove setting additional container resource attributes and only support setting the
container.id
.Reasoning behind that:
docker.sock
/containerd.sock
, which requires correct - probably most of the time equals to elevated - permissions. Most easy way to do that is of course running the container asroot
. I do not want to promote that.docker.sock
using TLS certs; but that would make auto-instrumentation a lot less autocontainer.id
Fixes: #179