
[k8sattributes] Support the extraction of service level metadata in processor #20840

Closed
moh-osman3 opened this issue Apr 10, 2023 · 9 comments
Labels: closed as inactive, enhancement (New feature or request), processor/k8sattributes (k8s Attributes processor), Stale

Comments

@moh-osman3
Contributor

Component(s)

No response

Is your feature request related to a problem? Please describe.

Currently the k8sattributes processor extracts metadata labels associated with a Pod object or a Namespace object. This ticket requests that some metadata also be extracted for a Service kube object. In particular, k8s.service.name would be a great attribute to have available.

Describe the solution you'd like

The k8sattributes processor adds functionality to watch kube Service objects and store/extract metadata labels related to them.
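A hypothetical configuration sketch of what this could look like. Note this is an assumption, not existing syntax: the `k8s.service.name` metadata entry and `from: service` do not exist in the processor today (only `from: pod` and `from: namespace` do).

```yaml
processors:
  k8sattributes:
    extract:
      metadata:
        # existing attributes
        - k8s.pod.name
        - k8s.namespace.name
        # proposed (hypothetical): populated by watching Service objects
        - k8s.service.name
      labels:
        # proposed (hypothetical): copy a label from the associated Service
        - tag_name: service.team
          key: team
          from: service
```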

Describe alternatives you've considered

No response

Additional context

No response

@moh-osman3 moh-osman3 added enhancement New feature or request needs triage New item requiring triage labels Apr 10, 2023
@github-actions
Contributor

Pinging code owners:

See Adding Labels via Comments if you do not have permissions to add labels yourself.

@dmitryax
Member

What if there are multiple services associated with the pod the data is coming from?

@moh-osman3
Contributor Author

@dmitryax So I think this is a problem similarly encountered in Prometheus, where a scrape target (e.g. a pod IP) is discovered during service discovery, and some scrape targets might have multiple services associated with the pod endpoint. There then end up being duplicated timeseries, each with its own service label. Our workaround to avoid duplicates is using ServiceMonitors to match a specific service name. I'm unsure of the best way to solve this, but here are some initial thoughts on approaches:

  1. Ignore pods that have more than one service (don't make a choice, and therefore k8s.service.name is empty). I think the same thing is currently done in the processor for container labels if a pod has more than one container.
  2. Use an arbitrary rule to select one of the services to populate the service label (e.g. the last associated service in a list, or the first alphabetically). This information is incomplete, but at least some service information reaches the consumer.
  3. Add another field to let the user select which service label to prioritize (e.g. match services with the suffix .*-headless). This gives the user control over how the label is populated, but it might still result in duplicate services.

I'm unsure how feasible these suggestions are to implement. Any thoughts on the best way to work around this issue?
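The options above can be sketched as a small selection function. This is a minimal illustration with plain Python data, not processor code; the function name and the regex-priority parameter are assumptions for the sketch.

```python
import re


def select_service(service_names, priority_pattern=None):
    """Pick a single name for k8s.service.name from the services
    associated with a pod, sketching the options discussed above.

    Option 3: if the user supplied a regex, prefer a matching service
    (first alphabetically, to stay deterministic).
    Option 1: with multiple services and no match, refuse to choose
    and return None, leaving k8s.service.name unset.
    """
    if priority_pattern:
        matches = sorted(n for n in service_names
                         if re.fullmatch(priority_pattern, n))
        if matches:
            return matches[0]
    if len(service_names) == 1:
        return next(iter(service_names))
    return None  # option 1: don't guess, leave the attribute empty
```

Option 2 (an arbitrary but fixed rule, e.g. first alphabetically) would replace the final `return None` with `return sorted(service_names)[0]`.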

@dmitryax
Member

dmitryax commented Apr 11, 2023

  1. Ignore pods that have more than one service (don't make a choice, and therefore k8s.service.name is empty). I think the same thing is currently done in the processor for container labels if a pod has more than one container.

Given that multiple services for one pod is not a recommended use of a k8s Service, I believe we can go with this option. BTW, I'm not sure what you mean by container labels :)

@atoulme atoulme added processor/k8sattributes k8s Attributes processor and removed needs triage New item requiring triage labels Apr 12, 2023
@github-actions
Contributor

Pinging code owners for processor/k8sattributes: @dmitryax @rmfitzpatrick. See Adding Labels via Comments if you do not have permissions to add labels yourself.

@swiatekm
Contributor

Keep in mind that this is technically more complex than other metadata we currently support in the processor. If you take, say, k8s.deployment.name, this is a value that is immutable from the moment it's set, so once we start adding it to data, we'll keep doing so. On the other hand, the relationship between Pods and Services is dynamic and maintained via Endpoints, which can come and go depending on the Pod status. So in a naive implementation, you can lose Service metadata for Pods which aren't in a Ready state, leading to all sorts of trouble, like new timeseries for metrics.

Given that multiple services for one pod is not a recommended use of a k8s Service, I believe we can go with this option. BTW, I'm not sure what you mean by container labels :)

Is there some kind of official guidance on using multiple Services this way? I know of multiple relatively high-profile Helm Charts that create, for example, both a normal Service and a headless one for the same set of Pods. Some also create separate Services specifically for monitoring via ServiceMonitors.
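The dynamic Pod-to-Service relationship described above can be illustrated with a small index built from Endpoints objects (represented here as plain dicts; a real implementation would use an informer, not this hypothetical function). The sketch shows the failure mode: a naive index that only reads `addresses` drops the association for pods that are temporarily not Ready.

```python
def pod_to_service_index(endpoints_objects, include_not_ready=False):
    """Build a pod-IP -> set-of-service-names index from Endpoints.

    Each Endpoints object is named after its Service and lists subsets
    with `addresses` (Ready pods) and `notReadyAddresses`. A naive
    index over `addresses` alone loses Service metadata for not-Ready
    pods, which can create new timeseries for metrics.
    """
    index = {}
    for ep in endpoints_objects:
        service = ep["metadata"]["name"]
        for subset in ep.get("subsets", []):
            addresses = list(subset.get("addresses", []))
            if include_not_ready:
                addresses += subset.get("notReadyAddresses", [])
            for addr in addresses:
                index.setdefault(addr["ip"], set()).add(service)
    return index
```

With `include_not_ready=False`, a pod that fails its readiness probe silently disappears from the index even though its Service association has not really changed.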

@github-actions
Contributor

This issue has been inactive for 60 days. It will be closed in 60 days if there is no activity. To ping code owners by adding a component label, see Adding Labels via Comments, or if you are unsure of which component this issue relates to, please ping @open-telemetry/collector-contrib-triagers. If this issue is still relevant, please ping the code owners or leave a comment explaining why it is still relevant. Otherwise, please close it.

Pinging code owners:

See Adding Labels via Comments if you do not have permissions to add labels yourself.

@github-actions
Contributor

This issue has been closed as inactive because it has been stale for 120 days with no activity.

@github-actions github-actions bot closed this as not planned Won't fix, can't repro, duplicate, stale Aug 25, 2023
@CoderPoet

CoderPoet commented Aug 29, 2023

Keep in mind that this is technically more complex than other metadata we currently support in the processor. If you take, say, k8s.deployment.name, this is a value that is immutable from the moment it's set, so once we start adding it to data, we'll keep doing so. On the other hand, the relationship between Pods and Services is dynamic and maintained via Endpoints, which can come and go depending on the Pod status. So in a naive implementation, you can lose Service metadata for Pods which aren't in a Ready state, leading to all sorts of trouble, like new timeseries for metrics.


It feels like there are still scenarios that need this feature. We currently have this requirement:

If a client is accessing a Service's ClusterIP, e.g. podA -> serviceB, we would like to automatically associate the metadata attributes of the corresponding Service based on that ClusterIP.
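The ClusterIP case avoids the multiple-services ambiguity, since ClusterIPs are unique per Service. A minimal lookup sketch, using plain dicts shaped like Service objects (the function name and sample values are illustrative, not processor API):

```python
def service_attrs_by_cluster_ip(services, cluster_ip):
    """Look up the Service owning a given ClusterIP and return the
    resource attributes to attach (e.g. k8s.service.name), or None
    if no Service matches. ClusterIPs are unique, so at most one
    Service can match, sidestepping the multi-service ambiguity."""
    for svc in services:
        if svc["spec"].get("clusterIP") == cluster_ip:
            return {
                "k8s.service.name": svc["metadata"]["name"],
                "k8s.namespace.name": svc["metadata"]["namespace"],
            }
    return None
```

This only covers traffic addressed to a ClusterIP; headless Services (`clusterIP: None`) and direct pod-to-pod traffic would still need the Endpoints-based association discussed earlier.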
