-
Notifications
You must be signed in to change notification settings - Fork 164
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Environment Variable Resource Detection in Kubernetes #195
Conversation
cc @Aneurysm9 @pmm-sumo @dmitryax |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for putting it together!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just for reference we currently use OTEL_RESOURCE_ATTRIBUTES
in the operator
If there were different recognized variables we would probably switch to it though, as far as I can tell it's not practical to set that purely with YAML so having dedicated env vars seems worth it.
I think that the downward API can currently also expose labels and annotations. While it's not included in the semantic conventions, I am wondering if there's a space to include such capability via env resource detection. Currently, the labels/annotations are filtered by |
Oh, wow. I just realized you can define dependent env vars, which might be a better solution than i'm proposing. I could add: - name: OTEL_RESOURCE_ATTRIBUTES
value: k8s.pod.name=$(K8S_POD_NAME),k8s.pod.uid=$(K8S_POD_UID),k8s.namespace.name=$(K8S_NAMESPACE_NAME),k8s.node.name=$(K8S_NODE_NAME) to achieve the goals of this proposal, without requiring any detectors to be installed. I've added this as an alternative for now. |
@dashpole I would like to share my opinion. What do you think about configuring additional env var holding resource attributes related to k8s in separated var like: - name: OTEL_RESOURCE_ATTRIBUTES_K8S
value: k8s.pod.name=$(K8S_POD_NAME),k8s.pod.uid=$(K8S_POD_UID),k8s.namespace.name=$(K8S_NAMESPACE_NAME),k8s.node.name=$(K8S_NODE_NAME) and use it this way: - name: OTEL_RESOURCE_ATTRIBUTES
value: $(OTEL_RESOURCE_ATTRIBUTES_K8S),other=attribs... I think it will be simpler, transparent and easier to handle and modify. |
If we went with the current proposal (explicitly defined env vars that are detected with a kubernetes detector), we could also support If we used dependent environment variables, as we are discussing above (no detector needed), users could just fetch labels and attributes themselves:
There are a lot of labels/attributes that would come in if the WDYT? |
That seems like a perfectly reasonable thing to do, practically speaking. In terms of this proposal, I think the core question is whether or not we want to specify a new resource detector. I'd prefer to keep the yaml simpler for readability, if thats acceptable. |
|
||
Many kubernetes detectors currently use `HOSTNAME` environment variable, which defaults to the Pod name. However, the `HOSTNAME` can be [modified in a few ways](https://kubernetes.io/docs/concepts/services-networking/dns-pod-service/#pod-s-hostname-and-subdomain-fields) in the pod spec. Kubernetes resource detectors may fall back to detecting the pod name using `HOSTNAME` if `K8S_POD_NAME` is not available, but this may cause user confusion in some cases. | ||
|
||
### Alternative: Using Dependent Environment Variables |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@tigrannajaryan
@jsuereth mentioned to me that this approach may not interact well with schemas, since OTEL_RESOURCE_ATTRIBUTES is merged into resource attributes without any notion of version according to the specification. Using this approach would seem to remove the benefits of #161. Is there a way to make OTEL_RESOURCE_ATTRIBUTES produce resource attributes associated with a particular version of the semantic conventions?
|
||
SDK's should add a Kubernetes environment variable-based resource detector. The Kubernetes resource detector and the collector's `resourcedetectionprocessor` should support the following environment variables, and map them to the following semantic conventions: | ||
|
||
| Environment Variable | Semantic Convention | |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if these environment variable names will run into conflicts. Should we prefix these with OTEL_
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't have a strong opinion. Is there precedent from other detectors?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think there is a precedent. Since SDK resource detectors would need to interpret them, would you consider these SDK environment variables?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Likely an instrumented application would need these vars from downward API as well, but it's very unlikely that they would expect any different value. So conflict here is rather useful, otherwise they would need to repeat all these definitions with OTEL_
prefix and without.
The question here is whether we want to support an application that uses K8S_POD_NAME
to represent a different value like deployment name or something else.
If we want to read these env vars from instrumentation libraries as is, we can probably support both prefixed (OTEL_K8S_POD_NAME
) and not prefixed, defined as preferred (K8S_POD_NAME
) just to not block use cases like the one described above. If we are still going to use only OTEL_RESOURCE_ATTRIBUTES
in instrumentation libraries and compose it with k8s.pod.name=$(K8S_POD_NAME),k8s.pod.uid=$(K8S_POD_UID)
, I think we shouldn't use any prefix, users always can rename them if really needed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thinking about this more, I think we should switch to OTEL_ prefixed env vars. The reason isn't that someone might want opentelemetry to attach attributes that don't match your own identity. This is mostly relevant for container name (e.g. sidecar), but I could also imagine it for pod or even namespace.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@dashpole do you mean that OTEL_K8S_CONTAINER_NAME
env var added to one container would mean something else other than a name of the container which it's added to, like name of an otel sidecar when added to an application container?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, thats what I mean. container name isn't part of this proposal, but would be a likely future extension of this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IMO in that case meaning of OTEL_K8S_CONTAINER_NAME
becomes a bit confusing and inconsistent with other OTEL_K8S_*
env vars. I, as a user, would expect OTEL_K8S_CONTAINER_NAME
to have a value of the container where env var is defined, giving that other OTEL_K8S_*
have values of other k8s objects where the env var is defined.
I think we should use another name for the sidecar use case, not OTEL_K8S_CONTAINER_NAME
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think multiple env vars for the same semantic convention would work, as resource detectors will ultimately have to choose one of the two to set in that resource attribute.
If we named it without a prefix, I'm not sure that would stop users from treating the env vars as a way to set attribute X in OTEL, even if we would prefer they use another mechanism in that scenario. We just risk introducing naming conflicts if we don't prefix.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This proposal is related to open-telemetry/opentelemetry-specification#2135 which proposes support for OTEL_RESOURCE_ATTRIBUTES_*
env var. These env vars are added to OTEL_RESOURCE_ATTRIBUTES`
e.g.
OTEL_RESOURCE_ATTRIBUTES_K8S_POD_NAME=myapp
OTEL_RESOURCE_ATTRIBUTES=foo=bar
the final value of OTEL_RESOURCE_ATTRIBUTES
would be foo=bar,k8s_pod_name=myapp
@dashpole very nice OTEP, please consider to also refer to that proposal of supporting |
Done. |
On the subject of the difficulty of knitting together several environment variables into a final "OTEL_RESOURCE_ATTRIBUTES" variable, please see open-telemetry/opentelemetry-specification#1982. Here's how I handled this in a kustomization I wrote recently, with the following in the "base" manifest for a Deployment: - name: _OTEL_RESOURCES_ATTRIBUTES_UNDERLAY
value: ""
- name: _OTEL_RESOURCES_ATTRIBUTES_OVERLAY
value: ""
- name: OTEL_RESOURCE_ATTRIBUTES
value: >-
$(_OTEL_RESOURCES_ATTRIBUTES_UNDERLAY)
k8s.container.name=my-thing,
k8s.deployment.name=something,
k8s.namespace.name=$(POD_NAMESPACE),
k8s.node.name=$(NODE_NAME),
k8s.pod.name=$(POD_NAME),
k8s.pod.primary_ip_address=$(POD_IP_ADDRESS),
k8s.pod.service_account.name=$(POD_SERVICE_ACCOUNT_NAME),
k8s.pod.uid=$(POD_UID),
net.host.ip=$(NODE_IP_ADDRESS)
$(_OTEL_RESOURCES_ATTRIBUTES_OVERLAY) Note the two empty "_OTEL_RESOURCES_ATTRIBUTES_UNDERLAY" and "_OTEL_RESOURCES_ATTRIBUTES_OVERLAY" variables. In overlay kustomizations, I can populate either or both of those, but I have to be careful to terminate the former with a trailing comma and prefix the latter with a leading comma. |
Thanks @seh, i've added that to the drawbacks of using dependent environment variables. |
Co-authored-by: Dmitrii Anoshin <anoshindx@gmail.com>
This OTEP proposes using a consistent set of environment variables to detect resource attributes available through the Kubernetes downward API. The Kubernetes downward API is designed to serve this function and is the most consistent way to detect Kubernetes resource attributes. This will improve user experience, and enable re-use of the same Kubernetes configuration across languages and when using the collector.