
spanmetrics connector generating extreme grpc traffic #20306

Closed
devrimdemiroz opened this issue Mar 24, 2023 · 7 comments
Labels
bug Something isn't working connector/spanmetrics

Comments

@devrimdemiroz

Component(s)

connector/spanmetrics

What happened?

Description

I replaced the spanmetrics processor config in the OpenTelemetry demo app with the new spanmetrics connector. Traffic observed by the OTLP gRPC receiver increased almost 10,000 times, and calls (previously calls_total) and the related span metrics explode linearly along with it. See the screenshots at the bottom.

Steps to Reproduce

The following configuration was used as a replacement for the spanmetrics processor:

connectors:
  spanmetrics:
    histogram:
      explicit:
        buckets: [ 100us, 1ms, 2ms, 6ms, 10ms, 100ms, 250ms ]
    dimensions:
      - name: http.method
        default: GET
      - name: http.status_code
    dimensions_cache_size: 1000
    aggregation_temporality: "AGGREGATION_TEMPORALITY_CUMULATIVE"
....

service:
  pipelines:
    traces/spanmetrics:
      receivers: [otlp]
      exporters: [spanmetrics]
    metrics/spanmetrics:
      receivers: [spanmetrics]
      exporters: [prometheus]

Expected Result

The expected result is to be in line with the spanmetrics processor runs.

When processor runs:

[Screenshot: SpanmetricsProcessor]

Actual Result

When connector runs:

[Screenshot: SpanmetricsConnector]

Collector version

0.74.0

Environment information

Environment

Images

IMAGE_VERSION=1.3.1
IMAGE_NAME=ghcr.io/open-telemetry/demo

OpenTelemetry Collector configuration

receivers:
  otlp:
    protocols:
      grpc:
      http:
        cors:
          allowed_origins:
            - "http://*"
            - "https://*"
exporters:
  otlp:
    endpoint: "localhost:4317"
    tls:
      insecure: true
  logging:
  prometheus:
    endpoint: "otelcol:9464"
    resource_to_telemetry_conversion:
      enabled: true
    enable_open_metrics: true
connectors:
  spanmetrics:
    histogram:
      explicit:
        buckets: [ 100us, 1ms, 2ms, 6ms, 10ms, 100ms, 250ms ]
    dimensions:
      - name: http.method
        default: GET
      - name: http.status_code
    dimensions_cache_size: 1000
    aggregation_temporality: "AGGREGATION_TEMPORALITY_CUMULATIVE"

processors:
  batch:
  transform:
    metric_statements:
      - context: metric
        statements:
          - set(description, "Measures the duration of inbound HTTP requests") where name == "http.server.duration"


service:
  pipelines:
    traces/spanmetrics:
      receivers: [otlp]
      exporters: [spanmetrics]
    metrics/spanmetrics:
      receivers: [spanmetrics]
      exporters: [prometheus]
    traces:
      receivers: [otlp]
      processors: [batch]
      exporters: [otlp]
    metrics:
      receivers: [otlp]
      processors: [transform, batch]
      exporters: [prometheus]

Log output

No response

Additional context

No response

@devrimdemiroz devrimdemiroz added bug Something isn't working needs triage New item requiring triage labels Mar 24, 2023
@github-actions
Contributor

Pinging code owners:

See Adding Labels via Comments if you do not have permissions to add labels yourself.

@kovrus
Member

kovrus commented Mar 24, 2023

@devrimdemiroz The reason is #19216: spanmetrics now generates metrics from spans per resource scope, so the number of generated metrics will grow with the number of resource scopes. I opened a PR (#19467) to toggle this functionality on/off or filter resource attributes, but we decided to close it because the same result can be achieved with the transform processor's keep_keys function.

@kovrus kovrus removed the needs triage New item requiring triage label Mar 24, 2023
@devrimdemiroz
Author

@kovrus, I truly appreciate your quick response! If you could provide me with a little bit more on the transform processor configuration I need to add, you'll be an absolute time-saver for me. Thanks in advance!

@kovrus
Member

kovrus commented Mar 27, 2023

@devrimdemiroz something like this will reduce the number of resource scopes to the number of services that produce telemetry. If we want to allow the old behavior, one resource scope for everything, we should wrap up #19467.

...

processors:
  transform:
    trace_statements:
    - context: resource
      statements:
      - keep_keys(attributes, ["service.name"])

...
service:
  pipelines:
    traces/spanmetrics:
      receivers: [otlp]
      processors: [transform]
      exporters: [spanmetrics]
    metrics/spanmetrics:
      receivers: [spanmetrics]
      exporters: [prometheus]
...
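
(With this transform in place, spans from all instances of a service collapse into a single resource scope keyed only by service.name, so the connector emits one set of metric streams per service rather than one per original resource scope. Span-level dimensions such as http.method and http.status_code are unaffected, because keep_keys here operates only on resource attributes.)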

@devrimdemiroz
Author

@kovrus, thank you for sharing the precise configuration; it works perfectly. However, I'm unsure whether it's strictly necessary. My goal is a more straightforward and comprehensible configuration using the new connector, yet to achieve it I've had to add a layer that I hadn't used or been familiar with before, and that the previous processor didn't require. I'm not questioning its importance or potential benefits; I'm merely curious about the rationale behind extra lines whose purpose isn't immediately clear. Nevertheless, I would recommend including it as part of the default spanmetrics connector config in the documentation. Since the transform config works, I'll consider this matter resolved. Thanks for your time.

@kovrus
Member

kovrus commented Mar 30, 2023

@devrimdemiroz yes, we should add a more comprehensive readme for the span metrics connector and its differences from the processor. I've tried to call out that more metrics will be generated when using the connector here, but we probably should provide a better explanation.

The transform processor with keep_keys controls the number of generated metrics resource scopes. There will certainly be cases where resource attributes have high cardinality, and that will result in more metrics being generated. I agree that this is not evident from the documentation.

@djaglowski I think we should probably revisit #19467 and allow users to control which attributes are added to the generated metrics resource scopes. Maybe, by default, we can keep the service.name, service.namespace, and service.instance.id resource attributes to define the generated metrics resource scopes (wdyt @gouthamve)? We could use keep_keys for that, but then the dimensions configuration parameter of spanmetrics won't work, since resource attributes will be affected by keep_keys.
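
As a rough sketch, that default could look like the following with the existing keep_keys approach (the attribute names are the standard OpenTelemetry resource semantic conventions; any resource attributes used as spanmetrics dimensions would need to be added to the list, since everything else is dropped):

processors:
  transform:
    trace_statements:
    - context: resource
      statements:
      # keep only the resource attributes that should define a metrics resource scope
      - keep_keys(attributes, ["service.name", "service.namespace", "service.instance.id"])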

@djaglowski
Member

My only concern is that we may find ourselves needing to add more and more "transform" capabilities to this connector, as well as to others. However, if emitting consolidated metrics based on resource attributes turns out to be a particularly common case, then I support it.
