OCPBUGS-57585: CVO protects /metrics with authorization #1215


Merged

Conversation

hongkailiu
Member

@hongkailiu hongkailiu commented Jul 22, 2025

The /metrics endpoint is protected by the authHandler introduced in this pull request.

authHandler allows only requests that present the bearer token associated with system:serviceaccount:openshift-monitoring:prometheus-k8s.

@openshift-ci openshift-ci bot added do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. approved Indicates a PR has been approved by an approver from all required OWNERS files. labels Jul 22, 2025
@hongkailiu
Member Author

/retest

@hongkailiu hongkailiu changed the title [wip]CVO protects /metrics with authorization [OCPBUGS-57585]CVO protects /metrics with authorization Jul 22, 2025
@openshift-ci openshift-ci bot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Jul 22, 2025
@hongkailiu
Member Author

/retest-required

@hongkailiu
Member Author

launch 4.20,openshift/cluster-version-operator#1215 gcp,single-node

The cluster bot job:
https://prow.ci.openshift.org/view/gs/test-platform-results/logs/release-openshift-origin-installer-launch-gcp-modern/1947715499110436864

Screenshot 2025-07-22 at 17 27 04
$ TOKEN=$(oc create token -n openshift-monitoring prometheus-k8s)
$ oc exec -n openshift-monitoring prometheus-k8s-0 -- curl -s -k -I -H "Authorization: Bearer $TOKEN" https://10.0.0.3:9099/metrics
HTTP/1.1 200 OK
Content-Type: text/plain; version=0.0.4; charset=utf-8; escaping=values
Date: Tue, 22 Jul 2025 21:37:03 GMT

$ TOKEN=$(oc create token -n openshift-monitoring default)
$ oc exec -n openshift-monitoring prometheus-k8s-0 -- curl -s -k -I -H "Authorization: Bearer $TOKEN" https://10.0.0.3:9099/metrics
HTTP/1.1 401 Unauthorized
Content-Type: text/plain; charset=utf-8
X-Content-Type-Options: nosniff
Date: Tue, 22 Jul 2025 21:38:49 GMT
Content-Length: 20

$ oc debug node/ci-ln-zms0nzb-72292-mq2qr-master-0
Starting pod/ci-ln-zms0nzb-72292-mq2qr-master-0-debug-qhmxg ...
To use host binaries, run `chroot /host`. Instead, if you need to access host namespaces, run `nsenter -a -t 1`.
Pod IP: 10.0.0.3
If you don't see a command prompt, try pressing enter.
sh-5.1# chroot /host
sh-5.1# curl -k https://10.0.0.3:9099/metrics
failed to get the Authorization header

@hongkailiu
Member Author

/test e2e-hypershift-conformance

@dis016

dis016 commented Jul 23, 2025

Test Scenario: Metrics should be available only with authorization.
Test Status: Pass
Step 1: Install a cluster with the changes from this PR.

launch 4.20,openshift/cluster-version-operator#1215 aws

cluster job https://prow.ci.openshift.org/view/gs/test-platform-results/logs/release-openshift-origin-installer-launch-aws-modern/1947883892576882688

Cluster Installed successfully.

dinesh@Dineshs-MacBook-Pro ~ % oc get clusterversion 
NAME      VERSION                                                AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.20.0-0-2025-07-23-051116-test-ci-ln-hwc8k5t-latest   True        False         17m     Cluster version is 4.20.0-0-2025-07-23-051116-test-ci-ln-hwc8k5t-latest
dinesh@Dineshs-MacBook-Pro ~ %

Step 2: Get the service IP for CVO metrics.

dinesh@Dineshs-MacBook-Pro ~ % oc get svc -n openshift-cluster-version
NAME                       TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)    AGE
cluster-version-operator   ClusterIP   172.30.226.240   <none>        9099/TCP   45m
dinesh@Dineshs-MacBook-Pro ~ % 

Step 3: Accessing the CVO metrics without a token should return 401 Unauthorized.

dinesh@Dineshs-MacBook-Pro ~ % oc exec -n openshift-monitoring prometheus-k8s-0 -- curl -s -k -I  https://172.30.226.240:9099/metrics  
HTTP/1.1 401 Unauthorized
Content-Type: text/plain; charset=utf-8
X-Content-Type-Options: nosniff
Date: Wed, 23 Jul 2025 06:13:47 GMT
Content-Length: 39

dinesh@Dineshs-MacBook-Pro ~ % 

Step 4: Accessing the CVO metrics with a token for the default service account in openshift-monitoring should also return 401 Unauthorized.

dinesh@Dineshs-MacBook-Pro ~ % TOKEN=$(oc create token -n openshift-monitoring default)
dinesh@Dineshs-MacBook-Pro ~ % oc exec -n openshift-monitoring prometheus-k8s-0 -- curl -s -k -I -H "Authorization: Bearer $TOKEN" https://172.30.226.240:9099/metrics 
HTTP/1.1 401 Unauthorized
Content-Type: text/plain; charset=utf-8
X-Content-Type-Options: nosniff
Date: Wed, 23 Jul 2025 06:31:55 GMT
Content-Length: 20

dinesh@Dineshs-MacBook-Pro ~ %

Step 5: Accessing the CVO metrics with the prometheus-k8s token should serve the metrics.

dinesh@Dineshs-MacBook-Pro ~ % TOKEN=$(oc create token -n openshift-monitoring prometheus-k8s)
dinesh@Dineshs-MacBook-Pro ~ % oc exec -n openshift-monitoring prometheus-k8s-0 -- curl -s -k -I -H "Authorization: Bearer $TOKEN" https://172.30.226.240:9099/metrics 
HTTP/1.1 200 OK
Content-Type: text/plain; version=0.0.4; charset=utf-8; escaping=values
Date: Wed, 23 Jul 2025 06:34:22 GMT

dinesh@Dineshs-MacBook-Pro ~ % 

Step 6: Prometheus is able to communicate with the CVO, and the monitoring scrape should be healthy (the `up` metric reports "1").


dinesh@Dineshs-MacBook-Pro ~ % PROM_POD=$(oc get pods -n openshift-monitoring -l app.kubernetes.io/name=prometheus -o jsonpath='{.items[0].metadata.name}')
dinesh@Dineshs-MacBook-Pro ~ % oc exec -n openshift-monitoring "$PROM_POD" -- \
  curl -s "http://localhost:9090/api/v1/query?query=up%7Bnamespace%3D%22openshift-cluster-version%22%2Cjob%3D%22cluster-version-operator%22%7D" | jq
{
  "status": "success",
  "data": {
    "resultType": "vector",
    "result": [
      {
        "metric": {
          "__name__": "up",
          "container": "cluster-version-operator",
          "endpoint": "metrics",
          "instance": "10.0.67.98:9099",
          "job": "cluster-version-operator",
          "namespace": "openshift-cluster-version",
          "pod": "cluster-version-operator-6d66c465b7-xvc42",
          "service": "cluster-version-operator"
        },
        "value": [
          1753252605.131,
          "1"
        ]
      }
    ]
  }
}
dinesh@Dineshs-MacBook-Pro ~ % 
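For readability, the URL-encoded query in the command above decodes to the following PromQL:

```promql
up{namespace="openshift-cluster-version",job="cluster-version-operator"}
```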

Step 7: CVO metrics should not be served from within the node either without authorization.
Get the node hosting the CVO pod:


dinesh@Dineshs-MacBook-Pro ~ % oc get pod -n openshift-cluster-version -o wide 
NAME                                        READY   STATUS    RESTARTS   AGE   IP           NODE                                       NOMINATED NODE   READINESS GATES
cluster-version-operator-6d66c465b7-xvc42   1/1     Running   0          68m   10.0.67.98   ip-10-0-67-98.us-west-1.compute.internal   <none>           <none>
dinesh@Dineshs-MacBook-Pro ~ %

Accessing the metrics from the host should return an authorization error.


dinesh@Dineshs-MacBook-Pro ~ % oc debug node/ip-10-0-67-98.us-west-1.compute.internal                                                                                  
Starting pod/ip-10-0-67-98us-west-1computeinternal-debug-49vc6 ...
To use host binaries, run `chroot /host`
Pod IP: 10.0.67.98
If you don't see a command prompt, try pressing enter.
sh-5.1# chroot /host
sh-5.1# curl -k https://172.30.226.240:9099/metrics 
failed to get the Authorization header
sh-5.1# curl -k https://10.0.67.98:9099/metrics 
failed to get the Authorization header
sh-5.1# exit
exit
sh-5.1# exit
exit

Removing debug pod ...
dinesh@Dineshs-MacBook-Pro ~ % 

Step 8: CVO metrics are available in the console.

Screenshot 2025-07-23 at 11 51 45 AM

@dis016

dis016 commented Jul 23, 2025

/label qe-approved

@openshift-ci openshift-ci bot added the qe-approved Signifies that QE has signed off on this PR label Jul 23, 2025
@hongkailiu hongkailiu force-pushed the OCPBUGS-57585-TokenReview branch from 3b92430 to c5fbda0 on July 23, 2025 10:10
@petr-muller
Member

/cc

client tokenReviewInterface
}

func (a *authHandler) authorize(token string) (bool, error) {
Member

I read the code and noticed that we would only support Bearer token auth but I remember from the handbook we are supposed to auth a cert-presenting client:

As described in the Client certificate scraping enhancement proposal, we recommend that the components rely on client TLS certificates for authentication/authorization. This is more efficient and robust than using bearer tokens because token-based authn/authz add a dependency (and additional load) on the Kubernetes API.

It seems that it is actually us telling the monitoring stack how it should auth to us through the ServiceMonitor manifest .spec.endpoints[].bearerTokenFile.

In that aspect this PR is incomplete, but maybe doing just Bearer token auth is fine as a fast OCPBUGS-57585 bandaid that allows us to start backporting; we would tackle the cert auth separately, and only going forward (no need to backport it). But also:

$ oc explain servicemonitor.spec.endpoints.bearerTokenFile
GROUP:      monitoring.coreos.com
KIND:       ServiceMonitor
VERSION:    v1

FIELD: bearerTokenFile <string>


DESCRIPTION:
    File to read bearer token for scraping the target.
    
    Deprecated: use `authorization` instead.

😬
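For reference, the non-deprecated shape would look roughly like this in a ServiceMonitor manifest. This is a hedged sketch under assumptions: the secret name and endpoint layout are illustrative, not the CVO's actual manifest.

```yaml
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: cluster-version-operator
  namespace: openshift-cluster-version
spec:
  endpoints:
  - port: metrics
    scheme: https
    # Deprecated style:
    # bearerTokenFile: /var/run/secrets/kubernetes.io/serviceaccount/token
    # Recommended replacement:
    authorization:
      type: Bearer
      credentials:
        name: prometheus-k8s-token   # assumed secret name
        key: token
```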

Member Author

@hongkailiu hongkailiu Jul 23, 2025


Shortly after the discussion in our meeting, I realized I have done the same thing in ci-tools for the same reason: we do not want to rely on the Kubernetes API server for scraping because it is too slow and it may put extra load on the API server.

I will create a card to replace the deprecated servicemonitor.spec.endpoints.bearerTokenFile with servicemonitor.spec.endpoints.authorization, and I will argue in the card that such a solution addresses the concern above so we do not need to move to cert-based auth. But that is not in the scope of this pull request. I will move the discussion there.


@hongkailiu hongkailiu force-pushed the OCPBUGS-57585-TokenReview branch from c5fbda0 to 833a491 on July 23, 2025 18:24
@hongkailiu hongkailiu requested a review from petr-muller July 23, 2025 18:39
@hongkailiu hongkailiu changed the title [OCPBUGS-57585]CVO protects /metrics with authorization OCPBUGS-57585: CVO protects /metrics with authorization Jul 23, 2025
@openshift-ci-robot openshift-ci-robot added jira/severity-important Referenced Jira bug's severity is important for the branch this PR is targeting. jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. jira/valid-bug Indicates that a referenced Jira bug is valid for the branch this PR is targeting. labels Jul 23, 2025
@openshift-ci-robot
Copy link
Contributor

@hongkailiu: This pull request references Jira Issue OCPBUGS-57585, which is valid. The bug has been moved to the POST state.

3 validation(s) were run on this bug
  • bug is open, matching expected state (open)
  • bug target version (4.20.0) matches configured target version for branch (4.20.0)
  • bug is in the state ASSIGNED, which is one of the valid states (NEW, ASSIGNED, POST)

Requesting review from QA contact:
/cc @dis016

The bug has been updated to refer to the pull request using the external bug tracker.

In response to this:

The /metrics is protected by authHandler introduced from this pull.

authHandler allows only for requests with the bearer token associated with system:serviceaccount:openshift-monitoring:prometheus-k8s.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@openshift-ci openshift-ci bot requested a review from dis016 July 23, 2025 20:15
@hongkailiu
Member Author

/test e2e-aws-ovn-techpreview

1 similar comment
@hongkailiu
Member Author

/test e2e-aws-ovn-techpreview

@dis016

dis016 commented Jul 29, 2025

The test from openshift/origin#30014 PASSED.

dinesh@Dineshs-MacBook-Pro origin % export GCP_SHARED_CREDENTIALS_FILE=/tmp/gce.json 
dinesh@Dineshs-MacBook-Pro origin % export COMPONENT_NAMESPACE=openshift-cluster-version
dinesh@Dineshs-MacBook-Pro origin % export KUBECONFIG=/Users/dinesh/Downloads/kubeconfig 
dinesh@Dineshs-MacBook-Pro origin % oc get clusterversion                                                                                                                                            
NAME      VERSION                                                AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.20.0-0-2025-07-29-121153-test-ci-ln-7dfzt7b-latest   True        False         23m     Cluster version is 4.20.0-0-2025-07-29-121153-test-ci-ln-7dfzt7b-latest
dinesh@Dineshs-MacBook-Pro origin % cat /tmp/gce.json 
{}
dinesh@Dineshs-MacBook-Pro origin % ./openshift-tests run-test "[sig-instrumentation][Late] OpenShift service monitors [apigroup:image.openshift.io] should not be accessible without authorization" 
  I0729 18:49:57.334617   20062 i18n.go:139] Couldn't find translations for en_IN, using default
  I0729 18:49:57.334913   20062 i18n.go:157] Setting language to default
openshift-tests v3.7.0-alpha.0-18453-gef1d7b2
  I0729 18:50:00.742976   20062 test_setup.go:94] Extended test version v3.7.0-alpha.0-18453-gef1d7b2
  I0729 18:50:00.743031   20062 test_context.go:558] Tolerating taints "node-role.kubernetes.io/control-plane" when considering if nodes are ready
  I0729 18:50:01.036599 20062 framework.go:2317] microshift-version configmap not found
  I0729 18:50:01.036665   20062 binary.go:111] Loaded test configuration: &framework.TestContextType{KubeConfig:"/Users/dinesh/Downloads/kubeconfig", KubeContext:"", KubeAPIContentType:"application/vnd.kubernetes.protobuf", KubeletRootDir:"/var/lib/kubelet", KubeletConfigDropinDir:"", CertDir:"", Host:"https://api.ci-ln-7dfzt7b-72292.gcp-2.ci.openshift.org:6443", BearerToken:"1lgOEGnnF8PvhloA", RepoRoot:"../../", ListImages:false, listTests:false, listLabels:false, ListConformanceTests:false, Provider:"gce", Tooling:"", timeouts:framework.TimeoutContext{Poll:2000000000, PodStart:300000000000, PodStartShort:120000000000, PodStartSlow:900000000000, PodDelete:300000000000, ClaimProvision:300000000000, DataSourceProvision:300000000000, ClaimProvisionShort:60000000000, ClaimBound:180000000000, PVReclaim:180000000000, PVBound:180000000000, PVCreate:180000000000, PVDelete:300000000000, PVDeleteSlow:1200000000000, SnapshotCreate:300000000000, SnapshotDelete:300000000000, SnapshotControllerMetrics:300000000000, SystemPodsStartup:600000000000, NodeSchedulable:1800000000000, SystemDaemonsetStartup:300000000000, NodeNotReady:180000000000}, CloudConfig:framework.CloudConfig{APIEndpoint:"", ProjectID:"openshift-gce-devel-ci-2", Zone:"us-central1-a", Zones:[]string{"us-central1-a"}, Region:"us-central1", MultiZone:false, MultiMaster:false, Cluster:"", MasterName:"", NodeInstanceGroup:"", NumNodes:0, ClusterIPRange:"", ClusterTag:"", Network:"", ConfigFile:"", NodeTag:"", MasterTag:"", Provider:(*framework.NullProvider)(0x10c7a6268)}, KubectlPath:"kubectl", OutputDir:"/tmp", ReportDir:"", ReportPrefix:"", ReportCompleteGinkgo:false, ReportCompleteJUnit:false, Prefix:"e2e", MinStartupPods:-1, EtcdUpgradeStorage:"", EtcdUpgradeVersion:"", GCEUpgradeScript:"", ContainerRuntimeEndpoint:"unix:///run/containerd/containerd.sock", ContainerRuntimeProcessName:"containerd", ContainerRuntimePidFile:"/run/containerd/containerd.pid", SystemdServices:"containerd*", DumpSystemdJournal:false, 
ImageServiceEndpoint:"", MasterOSDistro:"custom", NodeOSDistro:"custom", NodeOSArch:"amd64", VerifyServiceAccount:true, DeleteNamespace:true, DeleteNamespaceOnFailure:true, AllowedNotReadyNodes:-1, CleanStart:false, GatherKubeSystemResourceUsageData:"false", GatherLogsSizes:false, GatherMetricsAfterTest:"false", GatherSuiteMetricsAfterTest:false, MaxNodesToGather:0, IncludeClusterAutoscalerMetrics:false, OutputPrintType:"json", CreateTestingNS:(framework.CreateTestingNSFn)(0x1043fbb20), DumpLogsOnFailure:true, DisableLogDump:false, LogexporterGCSPath:"", NodeTestContextType:framework.NodeTestContextType{NodeE2E:false, NodeName:"", NodeConformance:false, PrepullImages:false, ImageDescription:"", RuntimeConfig:map[string]string(nil), SystemSpecName:"", RestartKubelet:false, ExtraEnvs:map[string]string(nil), StandaloneMode:false, CriProxyEnabled:false}, ClusterDNSDomain:"cluster.local", NodeKiller:framework.NodeKillerConfig{Enabled:false, FailureRatio:0.01, Interval:60000000000, JitterFactor:60, SimulatedDowntime:600000000000, NodeKillerStopCtx:context.Context(nil), NodeKillerStop:(func())(nil)}, IPFamily:"ipv4", NonblockingTaints:"node-role.kubernetes.io/control-plane", ProgressReportURL:"", SriovdpConfigMapFile:"", SpecSummaryOutput:"", DockerConfigFile:"", E2EDockerConfigFile:"", KubeTestRepoList:"", SnapshotControllerPodName:"", SnapshotControllerHTTPPort:0, RequireDevices:false, EnabledVolumeDrivers:[]string(nil)}
  Running Suite:  - /Users/dinesh/Openshift_Project/origin
  ========================================================
  Random Seed: 1753795197 - will randomize all specs

  Will run 1 of 1 specs
  ------------------------------
  [sig-instrumentation][Late] OpenShift service monitors [apigroup:image.openshift.io] should not be accessible without authorization
  github.com/openshift/origin/test/extended/prometheus/prometheus.go:72
    STEP: Creating a kubernetes client @ 07/29/25 18:50:01.047
  I0729 18:50:01.048673   20062 discovery.go:214] Invalidating discovery information
    STEP: verifying all service monitors are configured with authorization @ 07/29/25 18:50:01.657
  I0729 18:50:02.060088 20062 prometheus.go:92] service monitor openshift-cluster-version/cluster-version-operator has authorization
    STEP: verifying all targets returns 401 or 403 without authorization @ 07/29/25 18:50:02.06
  I0729 18:50:04.151007 20062 builder.go:121] Running '/usr/local/bin/kubectl --server=https://api.ci-ln-7dfzt7b-72292.gcp-2.ci.openshift.org:6443 --kubeconfig=/Users/dinesh/Downloads/kubeconfig --namespace=openshift-monitoring exec prometheus-k8s-0 -- /bin/sh -x -c curl -k -s -o /dev/null -w '%{http_code}' "https://10.0.0.3:9099/metrics"'
  I0729 18:50:07.716474 20062 builder.go:146] stderr: "+ curl -k -s -o /dev/null -w '%{http_code}' https://10.0.0.3:9099/metrics\n"
  I0729 18:50:07.716557 20062 builder.go:147] stdout: "401"
  I0729 18:50:07.716577 20062 prometheus.go:116] the scaple url https://10.0.0.3:9099/metrics for namespace openshift-cluster-version is not accessible without authorization
  • [6.680 seconds]
  ------------------------------

  Ran 1 of 1 Specs in 6.681 seconds
  SUCCESS! -- 1 Passed | 0 Failed | 0 Pending | 0 Skipped
[
  {
    "name": "[sig-instrumentation][Late] OpenShift service monitors [apigroup:image.openshift.io] should not be accessible without authorization",
    "lifecycle": "blocking",
    "duration": 6680,
    "startTime": "2025-07-29 13:20:01.037156 UTC",
    "endTime": "2025-07-29 13:20:07.717980 UTC",
    "result": "passed",
    "output": "  STEP: Creating a kubernetes client @ 07/29/25 18:50:01.047\n  STEP: verifying all service monitors are configured with authorization @ 07/29/25 18:50:01.657\nI0729 18:50:02.060088 20062 prometheus.go:92] service monitor openshift-cluster-version/cluster-version-operator has authorization\n  STEP: verifying all targets returns 401 or 403 without authorization @ 07/29/25 18:50:02.06\nI0729 18:50:04.151007 20062 builder.go:121] Running '/usr/local/bin/kubectl --server=https://api.ci-ln-7dfzt7b-72292.gcp-2.ci.openshift.org:6443 --kubeconfig=/Users/dinesh/Downloads/kubeconfig --namespace=openshift-monitoring exec prometheus-k8s-0 -- /bin/sh -x -c curl -k -s -o /dev/null -w '%{http_code}' \"https://10.0.0.3:9099/metrics\"'\nI0729 18:50:07.716474 20062 builder.go:146] stderr: \"+ curl -k -s -o /dev/null -w '%{http_code}' https://10.0.0.3:9099/metrics\\n\"\nI0729 18:50:07.716557 20062 builder.go:147] stdout: \"401\"\nI0729 18:50:07.716577 20062 prometheus.go:116] the scaple url https://10.0.0.3:9099/metrics for namespace openshift-cluster-version is not accessible without authorization\n"
  }
]%                                                                                                                                                                                                          dinesh@Dineshs-MacBook-Pro origin % 

@dis016

dis016 commented Jul 29, 2025

The test from https://github.com/openshift/origin/blob/main/test/extended/prometheus/prometheus.go#L514, which covers

// Cluster version operator
targets.Expect(labels{"job": "cluster-version-operator"}, "up", "^https://.*/metrics$")

also passed.

dinesh@Dineshs-MacBook-Pro origin % ./openshift-tests run-test "[sig-instrumentation] Prometheus [apigroup:image.openshift.io] when installed on the cluster should start and expose a secured proxy and unsecured metrics [apigroup:config.openshift.io] [Skipped:Disconnected] [Suite:openshift/conformance/parallel]"
  I0729 21:26:06.214644   23681 i18n.go:139] Couldn't find translations for en_IN, using default
  I0729 21:26:06.214861   23681 i18n.go:157] Setting language to default
openshift-tests v3.7.0-alpha.0-18453-gef1d7b2
  I0729 21:26:09.924900   23681 test_setup.go:94] Extended test version v3.7.0-alpha.0-18453-gef1d7b2
  I0729 21:26:09.928594   23681 test_context.go:558] Tolerating taints "node-role.kubernetes.io/control-plane" when considering if nodes are ready
  I0729 21:26:10.332418 23681 framework.go:2317] microshift-version configmap not found
  I0729 21:26:10.334433   23681 binary.go:111] Loaded test configuration: &framework.TestContextType{KubeConfig:"/Users/dinesh/Downloads/kubeconfig", KubeContext:"", KubeAPIContentType:"application/vnd.kubernetes.protobuf", KubeletRootDir:"/var/lib/kubelet", KubeletConfigDropinDir:"", CertDir:"", Host:"https://api.ci-ln-7dfzt7b-72292.gcp-2.ci.openshift.org:6443", BearerToken:"6sfT2_wpan_9g6HH", RepoRoot:"../../", ListImages:false, listTests:false, listLabels:false, ListConformanceTests:false, Provider:"gce", Tooling:"", timeouts:framework.TimeoutContext{Poll:2000000000, PodStart:300000000000, PodStartShort:120000000000, PodStartSlow:900000000000, PodDelete:300000000000, ClaimProvision:300000000000, DataSourceProvision:300000000000, ClaimProvisionShort:60000000000, ClaimBound:180000000000, PVReclaim:180000000000, PVBound:180000000000, PVCreate:180000000000, PVDelete:300000000000, PVDeleteSlow:1200000000000, SnapshotCreate:300000000000, SnapshotDelete:300000000000, SnapshotControllerMetrics:300000000000, SystemPodsStartup:600000000000, NodeSchedulable:1800000000000, SystemDaemonsetStartup:300000000000, NodeNotReady:180000000000}, CloudConfig:framework.CloudConfig{APIEndpoint:"", ProjectID:"openshift-gce-devel-ci-2", Zone:"us-central1-a", Zones:[]string{"us-central1-a"}, Region:"us-central1", MultiZone:false, MultiMaster:false, Cluster:"", MasterName:"", NodeInstanceGroup:"", NumNodes:0, ClusterIPRange:"", ClusterTag:"", Network:"", ConfigFile:"", NodeTag:"", MasterTag:"", Provider:(*framework.NullProvider)(0x118f62268)}, KubectlPath:"kubectl", OutputDir:"/tmp", ReportDir:"", ReportPrefix:"", ReportCompleteGinkgo:false, ReportCompleteJUnit:false, Prefix:"e2e", MinStartupPods:-1, EtcdUpgradeStorage:"", EtcdUpgradeVersion:"", GCEUpgradeScript:"", ContainerRuntimeEndpoint:"unix:///run/containerd/containerd.sock", ContainerRuntimeProcessName:"containerd", ContainerRuntimePidFile:"/run/containerd/containerd.pid", SystemdServices:"containerd*", DumpSystemdJournal:false, 
ImageServiceEndpoint:"", MasterOSDistro:"custom", NodeOSDistro:"custom", NodeOSArch:"amd64", VerifyServiceAccount:true, DeleteNamespace:true, DeleteNamespaceOnFailure:true, AllowedNotReadyNodes:-1, CleanStart:false, GatherKubeSystemResourceUsageData:"false", GatherLogsSizes:false, GatherMetricsAfterTest:"false", GatherSuiteMetricsAfterTest:false, MaxNodesToGather:0, IncludeClusterAutoscalerMetrics:false, OutputPrintType:"json", CreateTestingNS:(framework.CreateTestingNSFn)(0x110bb7b20), DumpLogsOnFailure:true, DisableLogDump:false, LogexporterGCSPath:"", NodeTestContextType:framework.NodeTestContextType{NodeE2E:false, NodeName:"", NodeConformance:false, PrepullImages:false, ImageDescription:"", RuntimeConfig:map[string]string(nil), SystemSpecName:"", RestartKubelet:false, ExtraEnvs:map[string]string(nil), StandaloneMode:false, CriProxyEnabled:false}, ClusterDNSDomain:"cluster.local", NodeKiller:framework.NodeKillerConfig{Enabled:false, FailureRatio:0.01, Interval:60000000000, JitterFactor:60, SimulatedDowntime:600000000000, NodeKillerStopCtx:context.Context(nil), NodeKillerStop:(func())(nil)}, IPFamily:"ipv4", NonblockingTaints:"node-role.kubernetes.io/control-plane", ProgressReportURL:"", SriovdpConfigMapFile:"", SpecSummaryOutput:"", DockerConfigFile:"", E2EDockerConfigFile:"", KubeTestRepoList:"", SnapshotControllerPodName:"", SnapshotControllerHTTPPort:0, RequireDevices:false, EnabledVolumeDrivers:[]string(nil)}
  Running Suite:  - /Users/dinesh/Openshift_Project/origin
  ========================================================
  Random Seed: 1753804566 - will randomize all specs

  Will run 1 of 1 specs
  ------------------------------
  [sig-instrumentation] Prometheus [apigroup:image.openshift.io] when installed on the cluster should start and expose a secured proxy and unsecured metrics [apigroup:config.openshift.io]
  github.com/openshift/origin/test/extended/prometheus/prometheus.go:603
    STEP: Creating a kubernetes client @ 07/29/25 21:26:10.377
  I0729 21:26:10.382115   23681 discovery.go:214] Invalidating discovery information
  I0729 21:26:14.020219 23681 client.go:286] configPath is now "/var/folders/gw/q6gbymqn2xn3t21cr090k05h0000gn/T/configfile3014027644"
  I0729 21:26:14.020291 23681 client.go:361] The user is now "e2e-test-prometheus-sl8fr-user"
  I0729 21:26:14.020314 23681 client.go:363] Creating project "e2e-test-prometheus-sl8fr"
  I0729 21:26:14.359327 23681 client.go:371] Waiting on permissions in project "e2e-test-prometheus-sl8fr" ...
  I0729 21:26:15.643529 23681 client.go:400] DeploymentConfig capability is enabled, adding 'deployer' SA to the list of default SAs
  I0729 21:26:15.936657 23681 client.go:415] Waiting for ServiceAccount "default" to be provisioned...
  I0729 21:26:16.629734 23681 client.go:415] Waiting for ServiceAccount "builder" to be provisioned...
  I0729 21:26:17.313611 23681 client.go:415] Waiting for ServiceAccount "deployer" to be provisioned...
  I0729 21:26:18.008514 23681 client.go:425] Waiting for RoleBinding "system:image-pullers" to be provisioned...
  I0729 21:26:18.312327 23681 client.go:425] Waiting for RoleBinding "system:image-builders" to be provisioned...
  I0729 21:26:18.932439 23681 client.go:425] Waiting for RoleBinding "system:deployers" to be provisioned...
  I0729 21:26:20.158889 23681 client.go:458] Project "e2e-test-prometheus-sl8fr" has been fully provisioned.
  I0729 21:26:21.988946 23681 resource.go:361] Creating new exec pod
    STEP: checking the prometheus metrics path @ 07/29/25 21:26:36.671
  I0729 21:26:36.672997 23681 client.go:1010] Running 'oc --namespace=e2e-test-prometheus-sl8fr --kubeconfig=/Users/dinesh/Downloads/kubeconfig exec execpod -- curl -s -k -H Authorization: Bearer <redacted> https://prometheus-k8s.openshift-monitoring.svc:9091/metrics'
    STEP: verifying the Thanos querier service requires authentication @ 07/29/25 21:26:42.048
  I0729 21:26:42.049624 23681 builder.go:121] Running '/usr/local/bin/kubectl --server=https://api.ci-ln-7dfzt7b-72292.gcp-2.ci.openshift.org:6443 --kubeconfig=/Users/dinesh/Downloads/kubeconfig --namespace=e2e-test-prometheus-sl8fr exec execpod -- /bin/sh -x -c curl -k -s -o /dev/null -w '%{http_code}' "https://thanos-querier.openshift-monitoring.svc:9091"'
  I0729 21:26:45.502428 23681 builder.go:146] stderr: "+ curl -k -s -o /dev/null -w '%{http_code}' https://thanos-querier.openshift-monitoring.svc:9091\n"
  I0729 21:26:45.502542 23681 builder.go:147] stdout: "401"
    STEP: verifying a service account token is able to authenticate @ 07/29/25 21:26:45.502
    STEP: verifying a service account token is able to access the Prometheus API @ 07/29/25 21:26:46.887
    STEP: verifying all expected jobs have a working target @ 07/29/25 21:26:48.843
    STEP: verifying all targets are exposing metrics over secure channel @ 07/29/25 21:26:49.139
  I0729 21:26:52.721957 23681 client.go:674] Deleted {user.openshift.io/v1, Resource=users  e2e-test-prometheus-sl8fr-user}, err: <nil>
  I0729 21:26:53.015156 23681 client.go:674] Deleted {oauth.openshift.io/v1, Resource=oauthclients  e2e-client-e2e-test-prometheus-sl8fr}, err: <nil>
  I0729 21:26:53.309659 23681 client.go:674] Deleted {oauth.openshift.io/v1, Resource=oauthaccesstokens  sha256~jAh3vZjQkmD4SoSNKKzQmE9tAXFCIT60s6AE3sUbPpQ}, err: <nil>
    STEP: Destroying namespace "e2e-test-prometheus-sl8fr" for this suite. @ 07/29/25 21:26:53.312
  • [43.301 seconds]
  ------------------------------

  Ran 1 of 1 Specs in 43.306 seconds
  SUCCESS! -- 1 Passed | 0 Failed | 0 Pending | 0 Skipped
[
  {
    "name": "[sig-instrumentation] Prometheus [apigroup:image.openshift.io] when installed on the cluster should start and expose a secured proxy and unsecured metrics [apigroup:config.openshift.io] [Skipped:Disconnected] [Suite:openshift/conformance/parallel]",
    "lifecycle": "blocking",
    "duration": 43308,
    "startTime": "2025-07-29 15:56:10.336398 UTC",
    "endTime": "2025-07-29 15:56:53.645094 UTC",
    "result": "passed",
    "output": "  STEP: Creating a kubernetes client @ 07/29/25 21:26:10.377\nI0729 21:26:14.020219 23681 client.go:286] configPath is now \"/var/folders/gw/q6gbymqn2xn3t21cr090k05h0000gn/T/configfile3014027644\"\nI0729 21:26:14.020291 23681 client.go:361] The user is now \"e2e-test-prometheus-sl8fr-user\"\nI0729 21:26:14.020314 23681 client.go:363] Creating project \"e2e-test-prometheus-sl8fr\"\nI0729 21:26:14.359327 23681 client.go:371] Waiting on permissions in project \"e2e-test-prometheus-sl8fr\" ...\nI0729 21:26:15.643529 23681 client.go:400] DeploymentConfig capability is enabled, adding 'deployer' SA to the list of default SAs\nI0729 21:26:15.936657 23681 client.go:415] Waiting for ServiceAccount \"default\" to be provisioned...\nI0729 21:26:16.629734 23681 client.go:415] Waiting for ServiceAccount \"builder\" to be provisioned...\nI0729 21:26:17.313611 23681 client.go:415] Waiting for ServiceAccount \"deployer\" to be provisioned...\nI0729 21:26:18.008514 23681 client.go:425] Waiting for RoleBinding \"system:image-pullers\" to be provisioned...\nI0729 21:26:18.312327 23681 client.go:425] Waiting for RoleBinding \"system:image-builders\" to be provisioned...\nI0729 21:26:18.932439 23681 client.go:425] Waiting for RoleBinding \"system:deployers\" to be provisioned...\nI0729 21:26:20.158889 23681 client.go:458] Project \"e2e-test-prometheus-sl8fr\" has been fully provisioned.\nI0729 21:26:21.988946 23681 resource.go:361] Creating new exec pod\n  STEP: checking the prometheus metrics path @ 07/29/25 21:26:36.671\nI0729 21:26:36.672997 23681 client.go:1010] Running 'oc --namespace=e2e-test-prometheus-sl8fr --kubeconfig=/Users/dinesh/Downloads/kubeconfig exec execpod -- curl -s -k -H Authorization: Bearer \u003credacted\u003e https://prometheus-k8s.openshift-monitoring.svc:9091/metrics'\n  STEP: verifying the Thanos querier service requires authentication @ 07/29/25 21:26:42.048\nI0729 21:26:42.049624 23681 builder.go:121] Running '/usr/local/bin/kubectl 
--server=https://api.ci-ln-7dfzt7b-72292.gcp-2.ci.openshift.org:6443 --kubeconfig=/Users/dinesh/Downloads/kubeconfig --namespace=e2e-test-prometheus-sl8fr exec execpod -- /bin/sh -x -c curl -k -s -o /dev/null -w '%{http_code}' \"https://thanos-querier.openshift-monitoring.svc:9091\"'\nI0729 21:26:45.502428 23681 builder.go:146] stderr: \"+ curl -k -s -o /dev/null -w '%{http_code}' https://thanos-querier.openshift-monitoring.svc:9091\\n\"\nI0729 21:26:45.502542 23681 builder.go:147] stdout: \"401\"\n  STEP: verifying a service account token is able to authenticate @ 07/29/25 21:26:45.502\n  STEP: verifying a service account token is able to access the Prometheus API @ 07/29/25 21:26:46.887\n  STEP: verifying all expected jobs have a working target @ 07/29/25 21:26:48.843\n  STEP: verifying all targets are exposing metrics over secure channel @ 07/29/25 21:26:49.139\nI0729 21:26:52.721957 23681 client.go:674] Deleted {user.openshift.io/v1, Resource=users  e2e-test-prometheus-sl8fr-user}, err: \u003cnil\u003e\nI0729 21:26:53.015156 23681 client.go:674] Deleted {oauth.openshift.io/v1, Resource=oauthclients  e2e-client-e2e-test-prometheus-sl8fr}, err: \u003cnil\u003e\nI0729 21:26:53.309659 23681 client.go:674] Deleted {oauth.openshift.io/v1, Resource=oauthaccesstokens  sha256~jAh3vZjQkmD4SoSNKKzQmE9tAXFCIT60s6AE3sUbPpQ}, err: \u003cnil\u003e\n  STEP: Destroying namespace \"e2e-test-prometheus-sl8fr\" for this suite. @ 07/29/25 21:26:53.312\n"
  }
]

Member

@wking wking left a comment


I'm fine with the Kube API server load of the TokenReview calls for now, with the future off ramp being investigated in OTA-1594.

/lgtm

@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Aug 5, 2025
Contributor

openshift-ci bot commented Aug 5, 2025

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: hongkailiu, wking

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci-robot
Contributor

/retest-required

Remaining retests: 0 against base HEAD a7d6e43 and 2 for PR HEAD 833a491 in total

1 similar comment
@openshift-ci-robot
Contributor

/retest-required

Remaining retests: 0 against base HEAD a7d6e43 and 2 for PR HEAD 833a491 in total

@wking
Member

wking commented Aug 6, 2025

tech-preview failure:

: [sig-storage] OCP CSI Volumes [Driver: csi-hostpath-groupsnapshot] [OCPFeatureGate:VolumeGroupSnapshot] [Testpattern: (delete policy)] volumegroupsnapshottable [Feature:volumegroupsnapshot] VolumeGroupSnapshottable should create snapshots for multiple volumes in a pod [Suite:openshift/conformance/parallel] [Suite:k8s]
Run #0: Failed	15m33s
{  fail [k8s.io/kubernetes/test/e2e/storage/testsuites/volume_group_snapshottable.go:173]: Interrupted by User}

is unrelated to this pull request.

/override ci/prow/e2e-aws-ovn-techpreview

Contributor

openshift-ci bot commented Aug 6, 2025

@wking: Overrode contexts on behalf of wking: ci/prow/e2e-aws-ovn-techpreview

In response to this:

tech-preview failure:

: [sig-storage] OCP CSI Volumes [Driver: csi-hostpath-groupsnapshot] [OCPFeatureGate:VolumeGroupSnapshot] [Testpattern: (delete policy)] volumegroupsnapshottable [Feature:volumegroupsnapshot] VolumeGroupSnapshottable should create snapshots for multiple volumes in a pod [Suite:openshift/conformance/parallel] [Suite:k8s]
Run #0: Failed	15m33s
{  fail [k8s.io/kubernetes/test/e2e/storage/testsuites/volume_group_snapshottable.go:173]: Interrupted by User}

is unrelated to this pull request.

/override ci/prow/e2e-aws-ovn-techpreview

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

Contributor

openshift-ci bot commented Aug 6, 2025

@hongkailiu: all tests passed!

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

@openshift-merge-bot openshift-merge-bot bot merged commit 9de00ba into openshift:main Aug 6, 2025
17 checks passed
@openshift-ci-robot
Contributor

@hongkailiu: Jira Issue OCPBUGS-57585: Some pull requests linked via external trackers have merged:

The following pull requests linked via external trackers have not merged:

These pull requests must merge or be unlinked from the Jira bug in order for it to move to the next state. Once unlinked, request a bug refresh with /jira refresh.

Jira Issue OCPBUGS-57585 has not been moved to the MODIFIED state.

In response to this:

The /metrics is protected by authHandler introduced from this pull.

authHandler allows only for requests with the bearer token associated with system:serviceaccount:openshift-monitoring:prometheus-k8s.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@wking
Member

wking commented Aug 6, 2025

/jira refresh
/cherry-pick release-4.19

@openshift-ci-robot
Contributor

@wking: Jira Issue OCPBUGS-57585: All pull requests linked via external trackers have merged:

Jira Issue OCPBUGS-57585 has been moved to the MODIFIED state.

In response to this:

/jira refresh
/cherry-pick release-4.19

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@openshift-cherrypick-robot

@wking: new pull request created: #1222

In response to this:

/jira refresh
/cherry-pick release-4.19

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@openshift-bot
Contributor

[ART PR BUILD NOTIFIER]

Distgit: cluster-version-operator
This PR has been included in build cluster-version-operator-container-v4.20.0-202508060745.p0.g9de00ba.assembly.stream.el9.
All builds following this will include this PR.

Labels
approved: Indicates a PR has been approved by an approver from all required OWNERS files.
jira/severity-important: Referenced Jira bug's severity is important for the branch this PR is targeting.
jira/valid-bug: Indicates that a referenced Jira bug is valid for the branch this PR is targeting.
jira/valid-reference: Indicates that this PR references a valid Jira ticket of any type.
lgtm: Indicates that a PR is ready to be merged.
qe-approved: Signifies that QE has signed off on this PR.