internal traces are not generated when telemetry.useOtelWithSDKConfigurationForInternalTelemetry feature gate is set #9715
Description
Describe the bug
Following the instructions outlined in the resources below to enable this feature gate and configure the telemetry
service to emit internal spans does not result in internal spans being emitted:
- https://github.com/open-telemetry/opentelemetry-collector/blob/main/docs/observability.md#how-we-expose-telemetry
- https://github.com/open-telemetry/opentelemetry-collector/blob/main/docs/troubleshooting.md#traces
No errors are logged, but no internal spans are emitted either.
Steps to reproduce
- Add the `telemetry.useOtelWithSDKConfigurationForInternalTelemetry` feature gate
- Add this telemetry service configuration for internal span emission:
traces:
processors:
batch:
exporter:
otlp:
protocol: grpc/protobuf
endpoint: <pod IP>:4317
- Ensure the necessary `OTEL_*` environment variables, such as `OTEL_SERVICE_NAME` and `OTEL_EXPORTER_OTLP_TRACES_HEADERS`, are set
- Send some trace traffic to your collector
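For context, the telemetry snippet above sits under `service::telemetry` in the collector configuration; a minimal sketch of where it belongs (the endpoint is a placeholder):

```yaml
service:
  telemetry:
    traces:
      processors:
        batch:
          exporter:
            otlp:
              protocol: grpc/protobuf
              endpoint: localhost:4317
```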
What did you expect to see?
Internal spans emitted from the collector.
What did you see instead?
No internal spans were emitted, and no errors were reported either.
What version did you use?
0.95.0
What config did you use?
---
service:
# For now we only ingest traces. For metrics we use datadog and for logs fluent-bit.
pipelines:
traces/unsampled:
receivers:
- otlp/auth
- otlp/octomesh
processors:
# Ordering matters!
# https://github.com/open-telemetry/opentelemetry-collector/blob/main/processor/README.md
        # In order for unsampled metrics to be correct, we unfortunately have to process all of the trace data before sampling.
# This ensures that the metrics do not have access to un-redacted attributes.
- memory_limiter
- batch
- transform/octomesh
- groupbyattrs/compaction
- transform/peer-service
- transform/datastores
- redaction/allow-list
- attributes/euii
- transform/error-recording
- transform/resource-allow-list
exporters:
- datadog/connector
traces/sampled:
receivers:
- datadog/connector
processors:
# Ordering matters!
# https://github.com/open-telemetry/opentelemetry-collector/blob/main/processor/README.md
- memory_limiter
- probabilistic_sampler
exporters:
- ${env:OTELCOL_TRACE_EXPORTER}
# Enable when debugging locally or set `OTELCOL_TRACE_EXPORTER` to `logging`
# - logging
metrics/unsampled:
receivers:
- prometheus
- otlp/auth
- datadog/connector
processors:
- memory_limiter
- batch
exporters:
- ${env:OTELCOL_METRICS_EXPORTER}
# Enable when debugging locally or set `OTELCOL_METRICS_EXPORTER` to `logging`
# - logging
extensions:
- health_check
- basicauth
telemetry:
logs:
encoding: json
metrics:
level: detailed
# Configure the collector's internal telemetry so that internal spans are emitted
traces:
processors:
batch:
exporter:
otlp:
protocol: grpc/protobuf
endpoint: localhost:4317
extensions:
health_check: {}
basicauth:
htpasswd:
inline: |
${env:OTELCOL_BASIC_AUTH}
# The pipeline details
receivers:
otlp/auth:
protocols:
grpc:
endpoint: 0.0.0.0:4317
auth:
authenticator: basicauth
http:
endpoint: 0.0.0.0:4318
auth:
authenticator: basicauth
otlp/octomesh:
protocols:
grpc:
endpoint: 0.0.0.0:14317
http:
endpoint: 0.0.0.0:14318
# The prometheus receiver scrapes metrics needed for the OpenTelemetry Collector Dashboard.
# https://app.datadoghq.com/dash/integration/30773/opentelemetry-collector-metrics-dashboard
prometheus:
config:
scrape_configs:
- job_name: 'otelcol'
scrape_interval: 10s
static_configs:
- targets: ['0.0.0.0:8888']
processors:
memory_limiter:
check_interval: 1s
# Maximum amount of memory, in MiB, targeted to be allocated by the process heap.
# Note that typically the total memory usage of process will be about 50MiB higher than this value.
# This defines the hard limit.
# https://github.com/open-telemetry/opentelemetry-collector/tree/main/processor/memorylimiterprocessor
limit_percentage: 90
spike_limit_percentage: 20
probabilistic_sampler:
hash_seed: 22
sampling_percentage: ${env:OTEL_COL_SAMPLING_PERCENTAGE}
redaction/allow-list:
${file:redaction-allow-list.yaml}
attributes/euii:
${file:attributes-euii.yaml}
transform/octomesh:
${file:octomesh.yaml}
transform/peer-service:
${file:peer-service.yaml}
transform/resource-allow-list:
${file:transform-processor.yaml}
transform/error-recording:
${file:error-recording.yaml}
transform/datastores:
${file:datastores.yaml}
# TODO: Tweak export batch sizes to DD based on this article
# https://docs.datadoghq.com/opentelemetry/otel_collector_datadog_exporter/?tab=kubernetesgateway#2-configure-the-datadog-exporter
batch: {}
# This processor will compact traces by grouping spans by common resource and instrumentation attributes,
# that way subsequent steps in the pipeline will have less data to process.
groupbyattrs/compaction:
connectors:
# The Datadog Connector is a connector component that computes Datadog APM Stats pre-sampling in the event
# that your traces pipeline is sampled using components such as the tailsamplingprocessor or probabilisticsamplerprocessor.
# The sampled pipeline should be duplicated and the datadog connector should be added to the
# pipeline that is not being sampled to ensure that Datadog APM Stats are accurate in the backend.
# See https://github.com/open-telemetry/opentelemetry-collector-contrib/tree/main/connector/datadogconnector
datadog/connector:
traces:
      # Configure the Datadog Connector to _only_ compute stats for the root span local to this trace. That should be `server` and `consumer` spans.
# Hopefully that will be the ingress span.
# https://github.com/DataDog/datadog-agent/blob/main/pkg/trace/traceutil/trace.go#L114
compute_stats_by_span_kind: true
# We will not use the connector for unsampled client metrics here
peer_tags_aggregation: true
peer_tags:
- _dd.base_service
# - amqp.destination
# - amqp.exchange
# - amqp.queue
# - aws.queue.name
# - bucketname
- cassandra.cluster
- db.cassandra.contact.points
- db.couchbase.seed.nodes
- db.hostname
- db.instance
- db.name
- db.system
# - grpc.host
# - hazelcast.instance
- hostname
- host.name
- http.host
- messaging.destination
- messaging.destination.name
- messaging.kafka.bootstrap.servers
- messaging.rabbitmq.exchange
- messaging.system
# - mongodb.db
# - msmq.queue.path
- net.peer.name
- network.destination.name
- peer.hostname
- peer.service
# - queuename
- rpc.service
- rpc.system
- server.address
# - streamname
# - tablename
# - topicname
trace_buffer: 100
exporters:
datadog:
api:
site: datadoghq.com
key: ${env:DD_API_KEY}
traces:
trace_buffer: 100
logging:
verbosity: detailed
sampling_initial: 1
sampling_thereafter: 1
file/no_rotation:
path: /tmp/trace-output/output.json
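A stripped-down configuration should be enough to reproduce the symptom without the full pipeline above (untested sketch; the `logging` exporter and the `localhost:4317` endpoint are placeholders):

```yaml
receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317
exporters:
  logging: {}
service:
  pipelines:
    traces:
      receivers: [otlp]
      exporters: [logging]
  telemetry:
    traces:
      processors:
        batch:
          exporter:
            otlp:
              protocol: grpc/protobuf
              endpoint: localhost:4317
```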
Environment
Additional context
I was chatting about this in Slack with @codeboten, and we both looked through the opentelemetry-collector code and found that this feature flag isn't being used to do anything regarding the tracer provider. Looking through the code here https://github.com/search?q=repo%3Aopen-telemetry%2Fopentelemetry-collector%20extendedConfig&type=code, I only see it used when the meter reader is initialized, and that initialization seems to only use it to output a log statement. @codeboten was not able to get internal traces generated either.