Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

health_check extension mandatory #242

Closed
Tobrek opened this issue Jun 24, 2022 · 9 comments · Fixed by #295
Closed

health_check extension mandatory #242

Tobrek opened this issue Jun 24, 2022 · 9 comments · Fixed by #295
Labels
bug Something isn't working chart:collector Issue related to opentelemetry-collector helm chart

Comments

@Tobrek
Copy link

Tobrek commented Jun 24, 2022

Currently the health_check extension is kind of mandatory to be enabled, because the livenessProbe and readinessProbe can't be disabled, when the health_check extension should not be used. But without the extension the checks fail and the running pod is never healthy.

If i set in values.yaml

mode: deployment

config:
  extensions: null
  service:
    extensions: null

i get (regardless if contrib is used or not)

$ kubectl -n ingress-nginx describe pod otel-collector-64795fbc56-9v2p2 
Name:         otel-collector-64795fbc56-9v2p2
Namespace:    ingress-nginx
Priority:     0
Node:         minikube/192.168.49.2
Start Time:   Fri, 24 Jun 2022 10:48:21 +0200
Labels:       app.kubernetes.io/instance=otel-collector
              app.kubernetes.io/name=otel-collector
              component=standalone-collector
              pod-template-hash=64795fbc56
Annotations:  checksum/config: e4de06b3748619ed857f1e1761a488d526d2caa954be599b8f3e9d3e3be88dfd
Status:       Running
IP:           172.17.0.4
IPs:
  IP:           172.17.0.4
Controlled By:  ReplicaSet/otel-collector-64795fbc56
Containers:
  opentelemetry-collector:
    Container ID:  docker://d2d880f5d8118a2c5beabf361fc5997de9ffdad1e930ea989ca1150d03d5165e
    Image:         otel/opentelemetry-collector-contrib:0.53.0
    Image ID:      docker-pullable://otel/opentelemetry-collector-contrib@sha256:253b2b98b6ec4cbe622039e8796b12dd5e9e013b0fcba495c21bb8e6a077f1d4
    Ports:         8888/TCP, 4317/TCP, 4318/TCP
    Host Ports:    0/TCP, 0/TCP, 0/TCP
    Command:
      /otelcol-contrib
      --config=/conf/relay.yaml
    State:          Running
      Started:      Fri, 24 Jun 2022 10:48:21 +0200
    Ready:          False
    Restart Count:  0
    Limits:
      cpu:     1
      memory:  2Gi
    Liveness:   http-get http://:13133/ delay=0s timeout=1s period=10s #success=1 #failure=3
    Readiness:  http-get http://:13133/ delay=0s timeout=1s period=10s #success=1 #failure=3
[...]
Events:
  Type     Reason     Age               From               Message
  ----     ------     ----              ----               -------
  Normal   Scheduled  11s               default-scheduler  Successfully assigned ingress-nginx/otel-collector-64795fbc56-9v2p2 to minikube
  Normal   Pulled     11s               kubelet            Container image "otel/opentelemetry-collector-contrib:0.53.0" already present on machine
  Normal   Created    11s               kubelet            Created container opentelemetry-collector
  Normal   Started    11s               kubelet            Started container opentelemetry-collector
  Warning  Unhealthy  1s (x3 over 10s)  kubelet            Readiness probe failed: Get "http://172.17.0.4:13133/": dial tcp 172.17.0.4:13133: connect: connection refused
  Warning  Unhealthy  1s                kubelet            Liveness probe failed: Get "http://172.17.0.4:13133/": dial tcp 172.17.0.4:13133: connect: connection refused

Allowing just the health_check extension

config:
  extensions: 
    health_check: {}

  service:
    extensions: 
      - health_check

everything works fine.

(Allowing memory_ballast is crashing the application - "Error: failed to start extensions: path '/docker/f82533b[...]' is not a descendant of mount point root '/docker/f82533b[...]/kubelet' and cannot be exposed from '/sys/fs/cgroup/rdma/kubelet'"

@TylerHelmuth TylerHelmuth added the chart:collector Issue related to opentelemetry-collector helm chart label Jun 28, 2022
@TylerHelmuth
Copy link
Member

I think you are correct. If the health check extension is not enabled then the hardcoded probes will not work. We should add a way to modify the probes so that disabling the health check extension is possible. Would you like to work on that?

@TylerHelmuth
Copy link
Member

@dmitryax I think this scenario will have some influence on #158. If we allow .Values.config to override all presets, then we need a way to handle the probes accordingly. Almost makes me want to go back to presets per component 😭.

@TylerHelmuth TylerHelmuth added the bug Something isn't working label Jun 28, 2022
@dmitryax
Copy link
Member

What is the problem with this? health_check must be mandatory in k8s. If user wants to override the config, they must take care of that extension in their custom config. I don't see any problem here.

@TylerHelmuth
Copy link
Member

health_check must be mandatory in k8s.

Is this for sure? Can't you technically deploy something without liveliness and readiness probes? Probably not a good idea but I think you can.

If we want to prevent bad practices and enforce the best practices of liveliness and readiness probes via the chart then we should also make it clear somewhere that health_check is a required extension and should not be removed.

@dmitryax
Copy link
Member

dmitryax commented Jun 28, 2022

Can't you technically deploy something without liveliness and readiness probes? Probably not a good idea but I think you can.

You can, but why? Is there a real life reason for that?

@TylerHelmuth
Copy link
Member

@Tobrek do you have a reason for wanting to deploy without health checks?

@Tobrek
Copy link
Author

Tobrek commented Jun 29, 2022

@Tobrek do you have a reason for wanting to deploy without health checks?

I had no specific reason to do so. The manually installed collector before had no health check and i just tried to "copy" the deployment.yaml from before with the helm chart.

What is the problem with this? [...] If user wants to override the config, they must take care of that extension in their custom config. [...]

How can i add the extension outside the helm chart? Just as curious question.

Would you like to work on that?

I can do / try that, when you tell me what's your preferred solution:
a) removing the extension also disables the probes
b) health_check extension is mandatory and can't be disabled
c) leave it as it is and the user has to take care about the extension, when excluding it. (Base on the answer of my previous question).
For me it is not important which solution is the best. My the point is just that you can create an invalid configuration without any information why.

@TylerHelmuth
Copy link
Member

Unless there is a good use case for the opposite, I would say we fall into "the health_check extension is mandatory" scenario. We can add something to the values.yaml comments to indicated that the healthcheckextension should never be disabled.

@Hades32
Copy link

Hades32 commented Apr 14, 2023

Why is this closed by adding comment? Wouldn't the right solution be to merge the given config with an empty health check config?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working chart:collector Issue related to opentelemetry-collector helm chart
Projects
None yet
4 participants