Skip to content

Provide option to control Prometheus labels #1429

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Aug 22, 2016

Conversation

grobie
Copy link
Contributor

@grobie grobie commented Aug 20, 2016

This change generalizes the existing ContainerNameToLabelsFunc to allow the user to fully control all labels attached to exported Prometheus metrics. The existing behavior is available as DefaultContainerLabels and is used if no custom function is provided.

This will allow Kubernetes to filter out its internal Docker labels.

Fixes #1312. Once Kubernetes provides a custom function to filter its internal labels implemented in kubernetes/kubernetes#31064.

PTAL @jimmidyson @fabxc @vishh @timstclair

@vishh Tim and I discussed that the current label settings classify as bug and that this should be included in Kubernetes v1.4.

@k8s-bot
Copy link
Collaborator

k8s-bot commented Aug 20, 2016

Can one of the admins verify that this patch is reasonable to test? If so, please reply "ok to test".
(Note: "add to whitelist" is no longer supported. Please update configurations in kubernetes/test-infra/jenkins/job-configs/kubernetes-jenkins-pull instead.)

This message may repeat a few times in short succession due to jenkinsci/ghprb-plugin#292. Sorry.

Otherwise, if this message is too spammy, please complain to ixdy.

if image := container.Spec.Image; len(image) > 0 {
set["image"] = image
}
for k, v := range container.Spec.Labels {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should containerLabelPrefix be exported so custom mills can use it?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

@grobie grobie force-pushed the grobie/container-label-filter branch 2 times, most recently from 5dc50b4 to 303e4ac Compare August 22, 2016 00:36
// container name, first alias, image name as well as all its env and label
// values.
func DefaultContainerLabels(container *info.ContainerInfo) map[string]string {
set := map[string]string{"id": container.Name}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's good enough for me - thanks!

@jimmidyson
Copy link
Collaborator

LGTM - @timstclair?

@fabxc
Copy link

fabxc commented Aug 22, 2016

LGTM

@jimmidyson
Copy link
Collaborator

@timstclair Can we get this in 0.24 please?

@timstclair timstclair self-assigned this Aug 22, 2016
@timstclair
Copy link
Contributor

ok to test

@timstclair
Copy link
Contributor

@jimmidyson Yes. I just retagged v0.24.0 to v0.24.0-alpha1 (I know it's bad to change tags, but that was the original intent). We can cherrypick this into the release-v0.24 branch so it will be included in the actual v0.24 release in a couple weeks.

@k8s-bot
Copy link
Collaborator

k8s-bot commented Aug 22, 2016

Jenkins GCE e2e

Build/test failed for commit 303e4ac.

@timstclair
Copy link
Contributor

Oh yeah, our cAdvisor e2e tests are hosed right now. I'll retest this once they're back up and running.

}
if image := container.Spec.Image; len(image) > 0 {
set["image"] = image
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we export these intermediate steps for reuse? Or at least make the keys (image, name, id) exported?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm a bit unsure about exposing so many parts. If someone overwrites the function, it applies to their whole setup anyway, why would they need to care about how the labels in other clusters look like?

I could split up the id+name+image part and the labels+env parts into two functions we expose each? I'm also fine with exposing constants for the labels though.

Lastly, actually we're doing it all wrong here by applying so many labels to all metrics. It should be only one identifier (probably id) and export another metric containing the mapping. See http://www.robustperception.io/target-labels-are-for-life-not-just-for-christmas/ and http://www.robustperception.io/exposing-the-software-version-to-prometheus/.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very last part, the env labels have never worked for me so far, they're always empty (I think that's a bug somewhere in cadvisor). When I use docker inspect <container> I see many envs listed for kubernetes managed containers, but I've never seen any env labels on our metrics.

Shall we just remove the env part as we're already working on that part? @jimmidyson

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I could split up the id+name+image part and the labels+env parts into two functions we expose each? I'm also fine with exposing constants for the labels though.

Thinking about this some more, I think just exposing the constants is the way to go.

Lastly, actually we're doing it all wrong here by applying so many labels to all metrics. It should be only one identifier (probably id) and export another metric containing the mapping. See http://www.robustperception.io/target-labels-are-for-life-not-just-for-christmas/ and http://www.robustperception.io/exposing-the-software-version-to-prometheus/.

Ack. I think this is too big a change for this late in the release though. Could you open an issue with a proposal to fix it, and we can try to address it in the next (v0.25) release. Thanks!

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the env labels have never worked for me so far, they're always empty

IIRC env vars are whitelisted by the docker_env_metadata_whitelist flag. By default that is empty as env vars could potentially contain sensitive data so must be consciously whitelisted.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, ok. Thanks for the explanation. I still think it's a bad practice to expose such as labels, but I'll not introduce more breaking behavior here and mention that as well in a follow up issue.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@timstclair Done. Extracted the id/name/image to be exported constants.

@grobie
Copy link
Contributor Author

grobie commented Aug 22, 2016

I can re-open this PR against the release-v0.24 branch if that helps to
avoid cherry-picking?

On Mon, Aug 22, 2016 at 3:08 PM Tim St. Clair notifications@github.com
wrote:

@jimmidyson https://github.com/jimmidyson Yes. I just retagged v0.24.0
to v0.24.0-alpha1 (I know it's bad to change tags, but that was the
original intent). We can cherrypick this into the release-v0.24 branch so
it will be included in the actual v0.24 release in a couple weeks.


You are receiving this because you authored the thread.
Reply to this email directly, view it on GitHub
#1429 (comment), or mute
the thread
https://github.com/notifications/unsubscribe-auth/AAANaGjfqdMO0R6uhHjyJAgEm-q82AnCks5qifOfgaJpZM4JpKV_
.

@timstclair
Copy link
Contributor

No, it should be merged into master too.

@timstclair
Copy link
Contributor

Just a couple small comments. Mostly looks good.

@timstclair
Copy link
Contributor

@k8s-bot test this

@grobie
Copy link
Contributor Author

grobie commented Aug 22, 2016

@timstclair of course. I thought the general pattern for bug fixes is to apply them against the release branch and then merge that release branch into master. but whatever works for you.

@k8s-bot
Copy link
Collaborator

k8s-bot commented Aug 22, 2016

Jenkins GCE e2e

Build/test failed for commit 303e4ac.

@timstclair
Copy link
Contributor

Yeah, we could consider adopting that practice, but for now I'm just mirroring the practices used by Kubernetes (which is to cherry-pick from master into release).

@timstclair
Copy link
Contributor

@k8s-bot test this

@k8s-bot
Copy link
Collaborator

k8s-bot commented Aug 22, 2016

Jenkins GCE e2e

Build/test passed for commit 303e4ac.

@grobie grobie force-pushed the grobie/container-label-filter branch from 303e4ac to 84faf9c Compare August 22, 2016 21:22
This change generalizes the existing ContainerNameToLabelsFunc to allow the user to fully control all labels attached to exported Prometheus metrics. The existing behavior is available as DefaultContainerLabelsFunc and is used if no custom function is provided.

This will allow Kubernetes to filter out its internal Docker labels.
@grobie grobie force-pushed the grobie/container-label-filter branch from 84faf9c to e76096d Compare August 22, 2016 21:26
@timstclair
Copy link
Contributor

LGTM. Waiting for tests to merge.

@grobie
Copy link
Contributor Author

grobie commented Aug 22, 2016

Cool, thanks. Once this is merged, I'll update my kubernetes PR to use the constants and the real master commit ID in godeps.json.

@timstclair
Copy link
Contributor

ok to test

@timstclair
Copy link
Contributor

@k8s-bot test this

@k8s-bot
Copy link
Collaborator

k8s-bot commented Aug 22, 2016

Jenkins GCE e2e

Build/test passed for commit e76096d.

@timstclair timstclair merged commit cd360d6 into google:master Aug 22, 2016
@timstclair
Copy link
Contributor

Please send a PR with this change against the release-v0.24 branch, then you can update the k8s godeps from there.

k8s-github-robot pushed a commit to kubernetes/kubernetes that referenced this pull request Aug 26, 2016
Automatic merge from submit-queue

Filter internal Kubernetes labels from Prometheus metrics

**What this PR does / why we need it**:

Kubernetes uses Docker labels as storage for some internal labels. The
majority of these labels are not meaningful metric labels and a few of
them are even harmful as they're not static and cause wrong aggregation
results.

This change provides a custom labels func to only attach meaningful
labels to cAdvisor exported metrics.

**Which issue this PR fixes**

google/cadvisor#1312

**Special notes for your reviewer**:

Depends on google/cadvisor#1429. Once that is merged, I'll update the vendor update commit.

**Release note**:

```release-note
Remove environment variables and internal Kubernetes Docker labels from cAdvisor Prometheus metric labels.

Old behavior:

- environment variables explicitly whitelisted via --docker-env-metadata-whitelist were exported as `container_env_*=*`. Default is zero so by default non were exported
- all docker labels were exported as `container_label_*=*`

New behavior:

- Only `container_name`, `pod_name`, `namespace`, `id`, `image`, and `name` labels are exposed
- no environment variables will be exposed ever via /metrics, even if whitelisted
```

---

Given that we have full control over the exported label set, I shortened the pod_name, pod_namespace and container_name label names. Below an example of the change (reformatted for readability).

```
# BEFORE
container_cpu_cfs_periods_total{
  container_label_io_kubernetes_container_hash="5af8c3b4",
  container_label_io_kubernetes_container_name="sync",
  container_label_io_kubernetes_container_restartCount="1",
  container_label_io_kubernetes_container_terminationMessagePath="/dev/termination-log",
  container_label_io_kubernetes_pod_name="popularsearches-web-3165456836-2bfey",
  container_label_io_kubernetes_pod_namespace="popularsearches",
  container_label_io_kubernetes_pod_terminationGracePeriod="30",
  container_label_io_kubernetes_pod_uid="6a291e48-47c4-11e6-84a4-c81f66bdf8bd",
  id="/docker/68e1f15353921f4d6d4d998fa7293306c4ac828d04d1284e410ddaa75cf8cf25",
  image="redacted.com/popularsearches:42-16-ba6bd88",
  name="k8s_sync.5af8c3b4_popularsearches-web-3165456836-2bfey_popularsearches_6a291e48-47c4-11e6-84a4-c81f66bdf8bd_c02d3775"
} 72819

# AFTER
container_cpu_cfs_periods_total{
  container_name="sync",
  pod_name="popularsearches-web-3165456836-2bfey",
  namespace="popularsearches",
  id="/docker/68e1f15353921f4d6d4d998fa7293306c4ac828d04d1284e410ddaa75cf8cf25",
  image="redacted.com/popularsearches:42-16-ba6bd88",
  name="k8s_sync.5af8c3b4_popularsearches-web-3165456836-2bfey_popularsearches_6a291e48-47c4-11e6-84a4-c81f66bdf8bd_c02d3775"
} 72819
```

Feedback requested on:
* Label names. Other suggestions? Should we keep these very long ones?
* Do we need to export io.kubernetes.pod.uid? It makes working with the metrics a bit more complicated and the pod name is already unique at any time (but not over time). The UID is aslo part of `name`.

As discussed with @timstclair, this should be added to v1.4 as the current labels are harmful.

PTAL @jimmidyson @fabxc @vishh
@cirocosta cirocosta mentioned this pull request Aug 24, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants