Description
When the loadbalancing exporter in the OpenTelemetry Collector Contrib distribution uses the k8s resolver and the resolver cannot watch/list the target service's Endpoints, the collector fails to start and the same error is logged continuously.
This can occur in several scenarios. For example:
Missing Role/RoleBinding: the collector pod's service account does not have a Role/RoleBinding granting it access to the required Kubernetes API resources.
Incorrect Service Name: the k8s resolver configuration within the loadbalancing exporter specifies a service that does not exist.
In both cases the k8s resolver fails to retrieve the target endpoints for trace export, which causes collector startup to fail.
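For reference, a minimal configuration that exercises this path looks roughly like the sketch below. The service name and namespace are assumptions taken from the error messages in the log output further down, not the exact configuration in use here.

```yaml
# Sketch of a traces pipeline that exports through the loadbalancing exporter,
# resolving backend collectors via the k8s resolver.
receivers:
  otlp:
    protocols:
      grpc:

exporters:
  loadbalancing:
    protocol:
      otlp:
        tls:
          insecure: true
    resolver:
      k8s:
        # Format is "<service>.<namespace>"; values assumed from the log output below.
        service: tailsampling-svc.tailsampler

service:
  pipelines:
    traces:
      receivers: [otlp]
      exporters: [loadbalancing]
```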
Steps to Reproduce
Deploy an OpenTelemetry collector with the loadbalancing exporter configured to use the k8s resolver.
Option 1: Missing Permissions:
Do not assign any Role or RoleBinding to the collector pod's service account (a minimal RBAC sketch is included after these steps for reference).
Option 2: Incorrect Service Name:
Configure the k8s resolver in the loadbalancing exporter with a non-existent service name.
Start the collector deployment.
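For context on Option 1, the RBAC objects that are normally required look roughly like the sketch below. The object names are illustrative; the namespace and service account are the ones that appear in the log output further down. The k8s resolver needs to get/list/watch the Endpoints of the target service.

```yaml
# Hypothetical Role/RoleBinding granting the collector's service account
# read access to Endpoints in the target service's namespace.
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: otel-lb-endpoints-reader   # illustrative name
  namespace: tailsampler           # namespace of the target service, per the log output
rules:
  - apiGroups: [""]
    resources: ["endpoints"]
    verbs: ["get", "list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: otel-lb-endpoints-reader   # illustrative name
  namespace: tailsampler
subjects:
  - kind: ServiceAccount
    name: my-opentelemetry-collector   # service account named in the log output
    namespace: default
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: otel-lb-endpoints-reader
```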
Expected Result
The OpenTelemetry collector should start successfully even if the k8s resolver initially fails to retrieve the target endpoints due to missing permissions or an incorrect service name. The resolver should keep retrying against the Kubernetes API in the background so traces can be exported once it succeeds, and the other pipelines should function as expected in the meantime.
Actual Result
The collector fails to start and becomes unavailable for exporting the other telemetry data in its pipelines.
Log output
2024-06-28T09:37:04.914Z info service@v0.103.0/service.go:115 Setting up own telemetry...
2024-06-28T09:37:04.914Z info service@v0.103.0/telemetry.go:96 Serving metrics {"address": ":8888", "level": "Normal"}
2024-06-28T09:37:04.914Z info exporter@v0.103.0/exporter.go:280 Development component. May change in the future. {"kind": "exporter", "data_type": "logs", "name": "debug"}
2024-06-28T09:37:04.914Z info exporter@v0.103.0/exporter.go:280 Development component. May change in the future. {"kind": "exporter", "data_type": "traces", "name": "debug"}
2024-06-28T09:37:04.915Z info memorylimiter/memorylimiter.go:160 Using percentage memory limiter {"kind": "processor", "name": "memory_limiter", "pipeline": "traces", "total_memory_mib": 15976, "limit_percentage": 80, "spike_limit_percentage": 25}
2024-06-28T09:37:04.915Z info memorylimiter/memorylimiter.go:77 Memory limiter configured {"kind": "processor", "name": "memory_limiter", "pipeline": "traces", "limit_mib": 12781, "spike_limit_mib": 3994, "check_interval": 10}
2024-06-28T09:37:04.915Z warn jaegerreceiver@v0.103.0/factory.go:49 jaeger receiver will deprecate Thrift-gen and replace it with Proto-gen to be compatbible to jaeger 1.42.0 and higher. See https://github.com/open-telemetry/opentelemetry-collector-contrib/pull/18485 for more details. {"kind": "receiver", "name": "jaeger", "data_type": "traces"}
2024-06-28T09:37:04.915Z info service@v0.103.0/service.go:182 Starting otelcol-k8s... {"Version": "0.103.1", "NumCPU": 10}
2024-06-28T09:37:04.915Z info extensions/extensions.go:34 Starting extensions...
2024-06-28T09:37:04.915Z info extensions/extensions.go:37 Extension is starting... {"kind": "extension", "name": "health_check"}
2024-06-28T09:37:04.915Z info healthcheckextension@v0.103.0/healthcheckextension.go:32 Starting health_check extension {"kind": "extension", "name": "health_check", "config": {"Endpoint":"10.1.1.32:13133","TLSSetting":null,"CORS":null,"Auth":null,"MaxRequestBodySize":0,"IncludeMetadata":false,"ResponseHeaders":null,"CompressionAlgorithms":null,"Path":"/","ResponseBody":null,"CheckCollectorPipeline":{"Enabled":false,"Interval":"5m","ExporterFailureThreshold":5}}}
2024-06-28T09:37:04.915Z info extensions/extensions.go:52 Extension started. {"kind": "extension", "name": "health_check"}
2024-06-28T09:37:04.915Z info otlpreceiver@v0.103.0/otlp.go:102 Starting GRPC server {"kind": "receiver", "name": "otlp", "data_type": "logs", "endpoint": "10.1.1.32:4317"}
2024-06-28T09:37:04.915Z info otlpreceiver@v0.103.0/otlp.go:152 Starting HTTP server {"kind": "receiver", "name": "otlp", "data_type": "logs", "endpoint": "10.1.1.32:4318"}
W0628 09:37:04.918760 1 reflector.go:539] k8s.io/client-go@v0.29.3/tools/cache/reflector.go:229: failed to list *v1.Endpoints: endpoints "tailsampling-svc" is forbidden: User "system:serviceaccount:default:my-opentelemetry-collector" cannot list resource "endpoints" in API group "" in the namespace "tailsampler"
E0628 09:37:04.918786 1 reflector.go:147] k8s.io/client-go@v0.29.3/tools/cache/reflector.go:229: Failed to watch *v1.Endpoints: failed to list *v1.Endpoints: endpoints "tailsampling-svc" is forbidden: User "system:serviceaccount:default:my-opentelemetry-collector" cannot list resource "endpoints" in API group "" in the namespace "tailsampler"
W0628 09:37:06.037354 1 reflector.go:539] k8s.io/client-go@v0.29.3/tools/cache/reflector.go:229: failed to list *v1.Endpoints: endpoints "tailsampling-svc" is forbidden: User "system:serviceaccount:default:my-opentelemetry-collector" cannot list resource "endpoints" in API group "" in the namespace "tailsampler"
E0628 09:37:06.037425 1 reflector.go:147] k8s.io/client-go@v0.29.3/tools/cache/reflector.go:229: Failed to watch *v1.Endpoints: failed to list *v1.Endpoints: endpoints "tailsampling-svc" is forbidden: User "system:serviceaccount:default:my-opentelemetry-collector" cannot list resource "endpoints" in API group "" in the namespace "tailsampler"
Component(s)
exporter/loadbalancing
Collector version
v0.95.0
Environment information
Kubernetes cluster
OpenTelemetry Collector configuration
Additional context
No response