Issues with Prometheus Configuration and Controller Crashes in Helm Chart v1.44.1 #705

Closed
@yuxzhang97

Description

We recently attempted to upgrade our kubernetes-ingress deployment to the latest Helm chart (v1.44.1) and controller image (3.1). After deploying to our staging environment, we encountered multiple issues, most notably with the Prometheus configuration.

Observed Behavior

After deploying the updated chart to our staging environment, we noticed the following logs:

2025/03/04 19:16:08 INFO    handler/prometheus.go:128 [transactionID=2de9d597-d3ea-46c2-9c2e-efd12b67cfe2] reload required : creation/modification of prometheus endpoint
2025/03/04 19:16:08 DEBUG   handler/https.go:163 [transactionID=2de9d597-d3ea-46c2-9c2e-efd12b67cfe2] Cannot proceed with SSL Passthrough update, HTTPS is disabled
[NOTICE]   (69) : Reloading HAProxy
[NOTICE]   (69) : Initializing new worker (187)
[NOTICE]   (187) : haproxy version is 3.1.5-076df02
[WARNING]  (187) : config : parsing [/etc/haproxy/haproxy.cfg:60] : a 'monitor fail' rule placed after an 'http-request' rule will still be processed before.
[WARNING]  (187) : config : parsing [/etc/haproxy/haproxy.cfg:79] : a 'monitor fail' rule placed after an 'http-request' rule will still be processed before.
[WARNING]  (187) : frontend 'https' has no 'bind' directive. Please declare it as a backend if this was intended.
[NOTICE]   (69) : Loading success.
[WARNING]  (185) : Proxy healthz stopped (cumulated conns: FE: 0, BE: 0).
[WARNING]  (185) : Proxy http stopped (cumulated conns: FE: 0, BE: 0).
[WARNING]  (185) : Proxy stats stopped (cumulated conns: FE: 0, BE: 0).
Proxy healthz stopped (cumulated conns: FE: 0, BE: 0).
Proxy http stopped (cumulated conns: FE: 0, BE: 0).
Proxy stats stopped (cumulated conns: FE: 0, BE: 0).
[WARNING]  (185) : Proxy tcp-8080 stopped (cumulated conns: FE: 0, BE: 0).
[WARNING]  (185) : Proxy https stopped (cumulated conns: FE: 0, BE: 0).
[WARNING]  (185) : Proxy github_http stopped (cumulated conns: FE: 0, BE: 0).
[WARNING]  (185) : Proxy kube-dotcom-ingress-staging_default-local-service_http stopped (cumulated conns: FE: 0, BE: 0).
Proxy tcp-8080 stopped (cumulated conns: FE: 0, BE: 0).
Proxy https stopped (cumulated conns: FE: 0, BE: 0).
Proxy github_http stopped (cumulated conns: FE: 0, BE: 0).
Proxy kube-dotcom-ingress-staging_default-local-service_http stopped (cumulated conns: FE: 0, BE: 0).
[NOTICE]   (69) : haproxy version is 3.1.5-076df02
[WARNING]  (69) : Former worker (185) exited with code 0 (Exit)
2025/03/04 19:16:08 DEBUG   haproxy/process/s6-overlay.go:63 [transactionID=2de9d597-d3ea-46c2-9c2e-efd12b67cfe2] [NOTICE]   (69) : Initializing new worker (187)
[NOTICE]   (187) : haproxy version is 3.1.5-076df02
[WARNING]  (187) : config : parsing [/etc/haproxy/haproxy.cfg:60] : a 'monitor fail' rule placed after an 'http-request' rule will still be processed before.
[WARNING]  (187) : config : parsing [/etc/haproxy/haproxy.cfg:79] : a 'monitor fail' rule placed after an 'http-request' rule will still be processed before.
[WARNING]  (187) : frontend 'https' has no 'bind' directive. Please declare it as a backend if this was intended.
[NOTICE]   (69) : Loading success.
[NOTICE]   (69) : haproxy version is 3.1.5-076df02
[WARNING]  (69) : Former worker (185) exited with code 0 (Exit)
2025/03/04 19:16:08 INFO    controller/controller.go:211 [transactionID=2de9d597-d3ea-46c2-9c2e-efd12b67cfe2] HAProxy reloaded
2025/03/04 19:16:08 DEBUG   haproxy/api/api.go:407 [transactionID=2de9d597-d3ea-46c2-9c2e-efd12b67cfe2] Pushing backends as previous successfully applied backends
2025/03/04 19:16:13 INFO    k8s/crs-monitor.go:124  Custom resource definition created, adding CR watcher for Defaults
2025/03/04 19:16:13 INFO    k8s/crs-monitor.go:124  Custom resource definition created, adding CR watcher for Defaults
2025/03/04 19:16:13 INFO    handler/prometheus.go:128 [transactionID=4c9c6daa-7269-4f38-834e-795653a1666b] reload required : creation/modification of prometheus endpoint
2025/03/04 19:16:13 DEBUG   handler/https.go:163 [transactionID=4c9c6daa-7269-4f38-834e-795653a1666b] Cannot proceed with SSL Passthrough update, HTTPS is disabled
[NOTICE]   (69) : Reloading HAProxy
[NOTICE]   (69) : Initializing new worker (189)

Additional trace logs

On startup:

2025/03/04 19:44:54 TRACE   store/events.go:124 [transactionID=380b8f7f-0b83-4d2a-9119-52821f278906] Treating endpoints event {SliceName:prometheus Namespace:kube-dotcom-ingress-staging Service:prometheus Ports:map[http:0xc00317f9f0] Status:ADDED}
2025/03/04 19:44:54 TRACE   store/events.go:128 [transactionID=380b8f7f-0b83-4d2a-9119-52821f278906] service prometheus : endpoints list map[http:{Addresses:map[127.0.0.1:{}] Port:6060}]
2025/03/04 19:44:54 TRACE   store/events.go:133 [transactionID=380b8f7f-0b83-4d2a-9119-52821f278906] service prometheus : number of already existing backend(s) in this transaction for this endpoint: 0
2025/03/04 19:44:54 INFO    handler/prometheus.go:128 [transactionID=380b8f7f-0b83-4d2a-9119-52821f278906] reload required : creation/modification of prometheus endpoint

Then the following repeats in a loop:

2025/03/04 19:44:59 TRACE   store/events.go:119 [transactionID=9db59d19-6652-47f8-aff2-4f7ec4219e00] [RUNTIME] [BACKEND] [SERVER] [No change] [EventEndpoints]. No change for ADDED prometheus prometheus
2025/03/04 19:44:59 INFO    handler/prometheus.go:128 [transactionID=9db59d19-6652-47f8-aff2-4f7ec4219e00] reload required : creation/modification of prometheus endpoint

Because the following log line repeats continuously, we suspect a problem with how Prometheus is configured:

reload required : creation/modification of prometheus endpoint
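
To put a number on the loop, something like the following could be run over a saved copy of the controller logs. This is a minimal sketch, not part of the controller: the function names `reload_events` and `reload_intervals` are hypothetical, and the regular expression assumes the log line format shown above (timestamp, level, `handler/prometheus.go`, transaction ID, message).

```python
import re
from datetime import datetime

# Matches controller log lines of the form seen in this report, e.g.
# 2025/03/04 19:44:59 INFO    handler/prometheus.go:128 [transactionID=...] reload required : ...
# The exact layout is an assumption based on the logs pasted above.
RELOAD_RE = re.compile(
    r"^(\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2})\s+INFO\s+handler/prometheus\.go:\d+\s+"
    r"\[transactionID=([0-9a-f-]+)\]\s+"
    r"reload required : creation/modification of prometheus endpoint"
)


def reload_events(lines):
    """Return (timestamp, transaction_id) for each Prometheus-triggered reload line."""
    events = []
    for line in lines:
        match = RELOAD_RE.match(line.strip())
        if match:
            timestamp = datetime.strptime(match.group(1), "%Y/%m/%d %H:%M:%S")
            events.append((timestamp, match.group(2)))
    return events


def reload_intervals(events):
    """Return the number of seconds between consecutive reload events."""
    return [
        (later[0] - earlier[0]).total_seconds()
        for earlier, later in zip(events, events[1:])
    ]
```

Passing an open log file (or any iterable of lines) to `reload_events` and then the result to `reload_intervals` gives the spacing between reloads. A steady stream of short, equal intervals, each with a new transaction ID, would match the behavior we are seeing.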

Investigation and Attempts to Fix

  • We identified a commit that appears to address the issue. However, it is only available in the nightly build and not in an official release.

  • Deploying the nightly build did not resolve the issue. We continue to see the same behavior.

  • We attempted to disable Prometheus by setting service.enablePorts.prometheus: false, but discovered that the flag is not checked when the PrometheusEndpoint handler is instantiated, so it has no effect.

Expected Behavior

  • Prometheus should not trigger constant reloads.
  • Disabling Prometheus should work as expected when service.enablePorts.prometheus: false is set.

Request for Assistance

  • Is the fix from this commit intended to resolve this, or is further work needed?
  • Is there an alternative workaround to properly disable Prometheus?
  • Is there a way to configure Prometheus so that it no longer triggers constant reloads?

values.yaml

nameOverride: kubernetes-ingress-staging
controller:
  ingressClass: haproxytech-unicorn
  ingressClassResource:
    name: haproxytech-unicorn
  config:
    frontend-config-snippet: |-
      acl site_dead nbsrv(github_http) lt 53
      monitor-uri /_upstream_healthz
      monitor fail if site_dead
      option dontlog-normal
    scale-server-slots: "100"
  readinessProbe:
    httpGet:
      path: /_upstream_healthz
  livenessProbe:
    httpGet:
      path: /_upstream_healthz
  extraArgs:
    - --configmap-tcp-services=kube-dotcom-ingress-staging/tcpservices-unicorn
    - --disable-https
    - --disable-ipv6
    - --namespace-whitelist=github-production
  image:
    tag: nightly
  PodDisruptionBudget:
    enable: true
    maxUnavailable: 1

  tolerations:
  - key: "dedicated"
    operator: "Equal"
    value: "glb-ingress"
    effect: "NoSchedule"

  containerPort:
    http: 8081
    https: 443
  config:
    timeout-connect: "100ms"
    timeout-client: "180s"
    timeout-server: "180s"
    timeout-queue: "120s"
    timeout-tunnel: "180s"
    timeout-client-fin: "5s"
    timeout-server-fin: "5s"
    nbthread: "1"

    load-balance: "leastconn"
    check: "true"
    check-interval: "5s"
    stats-config-snippet: |
      option dontlog-normal

  defaultTLSSecret:
    enabled: false
  lifecycle:
    preStop:
      exec:
        command: ["/bin/sh", "-c", "sleep 5"]
  extraEnvs:
  - name: GOMAXPROCS
    value: "1"

  readinessProbe:
    failureThreshold: 3
    initialDelaySeconds: 5
    periodSeconds: 10
    successThreshold: 1
    timeoutSeconds: 1
    httpGet:
      port: 8081
      scheme: HTTP

  livenessProbe:
    failureThreshold: 3
    initialDelaySeconds: 120
    periodSeconds: 60
    successThreshold: 1
    timeoutSeconds: 1
    httpGet:
      port: 8081
      scheme: HTTP

  logging:
    level: info
    traffic:
      address: stdout
      format: raw
      facility: daemon

  publishService:
    enabled: false

  replicaCount: 1
  resources:
    requests:
      cpu: 100m
      memory: 256Mi
    limits:
      memory: 1024Mi

  service:
    enablePorts:
      http: false
      https: false
      quic: false
      prometheus: false
    enabled: true
    tcpPorts:
    - name: http-tcp
      port: 80
      targetPort: 8080
    type: NodePort
  serviceMonitor:
    enabled: false
defaultBackend:
  enabled: false
