-
Notifications
You must be signed in to change notification settings - Fork 204
OCPBUGS-57585: CVO protects /metrics with authorization #1215
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
OCPBUGS-57585: CVO protects /metrics with authorization #1215
Conversation
/retest |
/retest-required |
The cluster bot job: ![]() $ TOKEN=$(oc create token -n openshift-monitoring prometheus-k8s)
$ oc exec -n openshift-monitoring prometheus-k8s-0 -- curl -s -k -I -H "Authorization: Bearer $TOKEN" https://10.0.0.3:9099/metrics
HTTP/1.1 200 OK
Content-Type: text/plain; version=0.0.4; charset=utf-8; escaping=values
Date: Tue, 22 Jul 2025 21:37:03 GMT
$ TOKEN=$(oc create token -n openshift-monitoring default)
$ oc exec -n openshift-monitoring prometheus-k8s-0 -- curl -s -k -I -H "Authorization: Bearer $TOKEN" https://10.0.0.3:9099/metrics
HTTP/1.1 401 Unauthorized
Content-Type: text/plain; charset=utf-8
X-Content-Type-Options: nosniff
Date: Tue, 22 Jul 2025 21:38:49 GMT
Content-Length: 20
$ oc debug node/ci-ln-zms0nzb-72292-mq2qr-master-0
Starting pod/ci-ln-zms0nzb-72292-mq2qr-master-0-debug-qhmxg ...
To use host binaries, run `chroot /host`. Instead, if you need to access host namespaces, run `nsenter -a -t 1`.
Pod IP: 10.0.0.3
If you don't see a command prompt, try pressing enter.
sh-5.1# chroot /host
sh-5.1# curl -k https://10.0.0.3:9099/metrics
failed to get the Authorization header |
/test e2e-hypershift-conformance |
/label qe-approved |
3b92430
to
c5fbda0
Compare
/cc |
client tokenReviewInterface | ||
} | ||
|
||
func (a *authHandler) authorize(token string) (bool, error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I read the code and noticed that we would only support Bearer token auth but I remember from the handbook we are supposed to auth a cert-presenting client:
As described in the Client certificate scraping enhancement proposal, we recommend that the components rely on client TLS certificates for authentication/authorization. This is more efficient and robust than using bearer tokens because token-based authn/authz add a dependency (and additional load) on the Kubernetes API.
It seems that it is actually us telling the monitoring stack how it should auth to us through the ServiceMonitor
manifest .spec.endpoints[].bearerTokenFile
.
In that aspect this PR is incomplete but maybe doing just Bearer token auth is fine for a fast OCPBUGS-57585 bandaid that allows us to start backporting, and we would tackle the cert auth separately and only forwards (not necessary to backport). But also:
$ oc explain servicemonitor.spec.endpoints.bearerTokenFile
GROUP: monitoring.coreos.com
KIND: ServiceMonitor
VERSION: v1
FIELD: bearerTokenFile <string>
DESCRIPTION:
File to read bearer token for scraping the target.
Deprecated: use `authorization` instead.
😬
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shortly after the discussion in our meeting, I realized i have done the same thing in ci-tools for the same reason: We do not want to reply on K8S API server for scraping because it is too slow and it may create burden for K8S API server.
I will create a card to replace the deprecated servicemonitor.spec.endpoints.bearerTokenFile
with servicemonitor.spec.endpoints.authorization
and I will argue in the card that with a solution that addresses the concern above and we do not need to move to the cert-based auth. But that is not in the scope of this pull. I will move the discussion there.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
c5fbda0
to
833a491
Compare
@hongkailiu: This pull request references Jira Issue OCPBUGS-57585, which is valid. The bug has been moved to the POST state. 3 validation(s) were run on this bug
Requesting review from QA contact: The bug has been updated to refer to the pull request using the external bug tracker. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
/test e2e-aws-ovn-techpreview |
1 similar comment
/test e2e-aws-ovn-techpreview |
Test from openshift/origin#30014 is PASSED.
|
Test from https://github.com/openshift/origin/blob/main/test/extended/prometheus/prometheus.go#L514 which is covering for
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm fine with the Kube API server load of the TokenReview calls for now, with the future off ramp being investigated in OTA-1594.
/lgtm
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: hongkailiu, wking The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
1 similar comment
tech-preview failure:
is unrelated to this pull request. /override ci/prow/e2e-aws-ovn-techpreview |
@wking: Overrode contexts on behalf of wking: ci/prow/e2e-aws-ovn-techpreview In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. |
@hongkailiu: all tests passed! Full PR test history. Your PR dashboard. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
@hongkailiu: Jira Issue OCPBUGS-57585: Some pull requests linked via external trackers have merged: The following pull requests linked via external trackers have not merged:
These pull request must merge or be unlinked from the Jira bug in order for it to move to the next state. Once unlinked, request a bug refresh with Jira Issue OCPBUGS-57585 has not been moved to the MODIFIED state. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
/jira refresh |
@wking: Jira Issue OCPBUGS-57585: All pull requests linked via external trackers have merged: Jira Issue OCPBUGS-57585 has been moved to the MODIFIED state. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
@wking: new pull request created: #1222 In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. |
[ART PR BUILD NOTIFIER] Distgit: cluster-version-operator |
The
/metrics
is protected byauthHandler
introduced from this pull.authHandler
allows only for requests with the bearer token associated withsystem:serviceaccount:openshift-monitoring:prometheus-k8s
.