[exporter/datadog] metrics exporter: 413 - Request Entity Too Large #17566

Closed
hanikesn opened this issue Jan 13, 2023 · 8 comments · Fixed by #17877
Labels: bug, data:metrics, exporter/datadog, priority:p2

Comments

@hanikesn

Component(s)

exporter/datadog

What happened?

After updating to the latest collector, which uses the native Datadog client, we're seeing many metric exports fail with: max elapsed time expired 413 Request Entity Too Large

Collector version

0.69.0

Environment information

Environment

OS: (e.g., "Ubuntu 20.04")
Compiler(if manually compiled): (e.g., "go 14.2")

OpenTelemetry Collector configuration

No response

Log output

No response

Additional context

No response

@hanikesn hanikesn added bug Something isn't working needs triage New item requiring triage labels Jan 13, 2023
@github-actions
Contributor

Pinging code owners:

See Adding Labels via Comments if you do not have permissions to add labels yourself.

@mx-psi mx-psi added exporter/datadog Datadog components data:metrics Metric related issues priority:p2 Medium and removed needs triage New item requiring triage labels Jan 13, 2023
@songy23
Member

songy23 commented Jan 13, 2023

Hi @hanikesn are you batching the metrics? If so can you share the config of your batch processor?

@hanikesn
Author

> Hi @hanikesn are you batching the metrics? If so can you share the config of your batch processor?

We have a batch processor with the following configuration:

  batch:
    timeout: 1s

@songy23
Member

songy23 commented Jan 13, 2023

This probably has to do with the V2 metrics data model change, e.g. in V1 host names are plain strings while in V2 they are in a nested proto message.

Two possible mitigations:

  1. Set a smaller send_batch_size in the batch processor; this should hopefully reduce the size of the payload sent to datadogexporter.
  2. If the above doesn't work, fall back to the Zorkian library by adding the CLI flag --feature-gates=-exporter.datadogexporter.metricexportnativeclient in your Dockerfile or when starting the collector.
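
As a rough sketch of the second mitigation, the flag could be passed as a container argument. The Compose file, service name, and mounted config path below are assumptions for illustration; only the flag itself comes from this thread.

```yaml
# docker-compose.yaml (hypothetical file; service name and paths are assumptions)
services:
  otel-collector:
    image: otel/opentelemetry-collector-contrib:0.69.0
    command:
      - "--config=/etc/otelcol-contrib/config.yaml"
      # The leading "-" disables the feature gate, so the exporter falls back
      # to the Zorkian client instead of the native Datadog client.
      - "--feature-gates=-exporter.datadogexporter.metricexportnativeclient"
    volumes:
      # Assumed location of your collector config on the host.
      - ./config.yaml:/etc/otelcol-contrib/config.yaml:ro
```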

@hanikesn
Author

I configured the batch settings like here:

  batch:
    # Datadog APM Intake limit is 3.2MB. Let's make sure the batches do not
    # go over that.
    send_batch_max_size: 1000
    send_batch_size: 100
    timeout: 10s

And still getting the same error.

@songy23
Member

songy23 commented Jan 17, 2023

From https://docs.datadoghq.com/api/latest/metrics/#submit-metrics:

  • In V2 metrics export the payload size limit is 500KB as sent, and a compressed payload must be under 5MB after decompression
  • In V1 the limit is 3.2MB as sent, and a compressed payload must be under 62MB after decompression

#16776 switched metric export to call the V2 APIs (which is recommended) instead of V1, while the doc you mentioned still refers to the V1 limit.

Can you try further lowering the send_batch_size for the datadog exporter? You might want a separate batch processor dedicated to the datadog exporter if other exporters expect a larger batch size, e.g.

processors:
  batch:
    timeout: 1s
  batch/2:
    send_batch_max_size: 100
    send_batch_size: 10
    timeout: 10s
...
service:
  pipelines:
    metrics:
      receivers: ...
      processors: [batch/2]
      exporters: [datadog]
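
For the case where other exporters need larger batches, a fuller sketch might split metrics into two pipelines, each with its own batch processor. Everything not shown above is an assumption: the otlp receiver, the otlphttp exporter and its endpoint, the pipeline names, the batch/datadog name (standing in for batch/2), and the environment-variable syntax, which can differ between collector versions.

```yaml
receivers:
  otlp:                      # assumed receiver; substitute your own
    protocols:
      grpc:

processors:
  batch:                     # larger default batches for other exporters
    timeout: 1s
  batch/datadog:             # small batches, same values as batch/2 above
    send_batch_max_size: 100
    send_batch_size: 10
    timeout: 10s

exporters:
  datadog:
    api:
      key: ${env:DD_API_KEY} # assumed env var; syntax may vary by version
  otlphttp:                  # hypothetical second exporter
    endpoint: https://otlp.example.com

service:
  pipelines:
    metrics/datadog:         # hypothetical pipeline name
      receivers: [otlp]
      processors: [batch/datadog]
      exporters: [datadog]
    metrics/other:           # hypothetical pipeline name
      receivers: [otlp]
      processors: [batch]
      exporters: [otlphttp]
```

The batch processor counts data points rather than bytes, so the values that keep payloads under the V2 limit depend on how large your metrics are and may need tuning.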

@hanikesn
Author

> Can you try further lowering the send_batch_size for datadog exporter?

Thanks @songy23, your suggested configuration works well and we haven't detected any more failures, even during peak hours. Please feel free to close the issue, but I suggest updating the docs here and on Datadog's side, as more people are quite likely to hit this issue with the latest version.

@songy23
Member

songy23 commented Jan 19, 2023

Thank you @hanikesn, glad to know it works for you. I'll update the readme in this repo and follow up with updating this page: https://docs.datadoghq.com/opentelemetry/otel_collector_datadog_exporter/?tab=onahost#2-configure-the-datadog-exporter.
