
[exporter/datadogexporter] EC2MetadataError: failed to make EC2Metadata request #22807

Closed
inigohu opened this issue May 26, 2023 · 7 comments · Fixed by #30341
Labels
bug Something isn't working exporter/datadog Datadog components never stale Issues marked with this label will be never staled and automatically removed priority:p3 Lowest

Comments

@inigohu
Contributor

inigohu commented May 26, 2023

Component(s)

exporter/datadog

What happened?

Description

I get this warning message when the collector starts:

WARN: failed to get session token, falling back to IMDSv1: 404 Not Found: Not Found
status code: 404, request id:
caused by: EC2MetadataError: failed to make EC2Metadata request Not Found status code: 404, request id:

I'm not running on AWS, so I don't understand why this warning is raised.

Steps to Reproduce

I can't reproduce it locally.

Collector version

v0.78.0

Environment information

Environment

OS: Google Cloud Platform (GKE autopilot 1.24.11-gke.1000)

OpenTelemetry Collector configuration

receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317
processors:
  batch:
exporters:
  googlecloud:
    project: "my-project"
  datadog:
    api:
      key: <DD_API_KEY>
      site: datadoghq.eu
    metrics:
      sums:
        cumulative_monotonic_mode: to_delta
      histograms:
        mode: distributions
        send_aggregation_metrics: true
      resource_attributes_as_tags: true
    host_metadata:
      enabled: false
extensions:
  health_check:
    endpoint: 0.0.0.0:13133
service:
  extensions: [health_check]
  pipelines:
    traces:
      receivers: [otlp]
      processors: [batch]
      exporters: [googlecloud]
    metrics:
      receivers: [otlp]
      processors: [batch]
      exporters: [datadog]

Log output

info    service/telemetry.go:104    Setting up own telemetry...       
info    service/telemetry.go:127    Serving Prometheus metrics    {"address": ":8888", "level": "Basic"}

WARN: failed to get session token, falling back to IMDSv1: 404 Not Found: Not Found 
status code: 404, request id:  
caused by: EC2MetadataError: failed to make EC2Metadata request
Not Found
	status code: 404, request id:

info    provider/provider.go:30    Resolved source    {"kind": "exporter", "data_type": "metrics", "name": "datadog", "provider": "system", "source": {"Kind":"host","Identifier":"open-telemetry-9898f74fc-6l5sd"}}
...

Additional context

I am not 100% sure if this warning comes from datadog exporter, but my suspicions point to it. If not, feel free to close it.

@inigohu inigohu added bug Something isn't working needs triage New item requiring triage labels May 26, 2023
@github-actions github-actions bot added the exporter/datadog Datadog components label May 26, 2023
@github-actions
Contributor

Pinging code owners:

See Adding Labels via Comments if you do not have permissions to add labels yourself.

@github-actions
Contributor

This issue has been inactive for 60 days. It will be closed in 60 days if there is no activity. To ping code owners by adding a component label, see Adding Labels via Comments, or if you are unsure of which component this issue relates to, please ping @open-telemetry/collector-contrib-triagers. If this issue is still relevant, please ping the code owners or leave a comment explaining why it is still relevant. Otherwise, please close it.

Pinging code owners:

See Adding Labels via Comments if you do not have permissions to add labels yourself.

@cforce

cforce commented Sep 8, 2023

same issue here

@kevinnoel-be
Contributor

kevinnoel-be commented Oct 2, 2023

Not sure if it's exactly the same issue, but we also see this error when running the collector on a non-EKS cluster, i.e. GKE. We do not use the Datadog exporter at all in this particular collector deployment (seen on at least v0.76 and v0.84). Here is a simplified version of the config we use:

    receivers:
      k8sobjects:
        auth_type: serviceAccount
        objects: # ...

    processors:
      resourcedetection:
        detectors:
          - env
          - gcp
          - eks
          - ec2
          - azure
          - system
        timeout: 2s
        override: false
        system:
          resource_attributes:
            host.id:
              enabled: false

      batch: # ...
      memory_limiter: # ...
      # ...

    extensions: # ...

    exporters:
      logging: # ...
      otlp: # ...

    service:
      telemetry: # ...
      extensions: # ...

      pipelines:
        logs:
          receivers:
            - k8sobjects
          processors:
            - resourcedetection
            - memory_limiter
            - batch
            # ...
          exporters:
            - logging
            # ...

I've run the collector with debug logs, and we can see (I assume as much, at least) that this is triggered in the resourcedetection processor. Additionally, it breaks the "promise" of collector telemetry logs in JSON format 😅:

Collector debug logs
...
{"level":"info","ts":1696239560.521837,"caller":"internal/resourcedetection.go:125","msg":"began detecting resource information","kind":"processor","name":"resourcedetection","pipeline":"logs"}
{"level":"info","ts":1696239560.5745764,"caller":"gcp/gcp.go:67","msg":"Fallible detector failed. This attribute will not be available.","kind":"processor","name":"resourcedetection","pipeline":"logs","key":"host.name","error":"metadata: GCE metadata \"instance/name\" not defined"}
{"level":"debug","ts":1696239560.5763094,"caller":"eks/detector.go:70","msg":"Unable to identify EKS environment","kind":"processor","name":"resourcedetection","pipeline":"logs","error":"isEks() error retrieving auth configmap: failed to retrieve ConfigMap kube-system/aws-auth: configmaps \"aws-auth\" is forbidden: User \"system:serviceaccount:xxxx:xxxx\" cannot get resource \"configmaps\" in API group \"\" in the namespace \"kube-system\""}
{"level":"warn","ts":1696239560.5763583,"caller":"internal/resourcedetection.go:130","msg":"failed to detect resource","kind":"processor","name":"resourcedetection","pipeline":"logs","error":"isEks() error retrieving auth configmap: failed to retrieve ConfigMap kube-system/aws-auth: configmaps \"aws-auth\" is forbidden: User \"system:serviceaccount:xxxx:xxxx\" cannot get resource \"configmaps\" in API group \"\" in the namespace \"kube-system\""}
 2023/10/02 09:39:20 WARN: failed to get session token, falling back to IMDSv1: 404 Not Found: Not Found
 	status code: 404, request id: 
 caused by: EC2MetadataError: failed to make EC2Metadata request
 Not Found
 
 	status code: 404, request id: 
{"level":"debug","ts":1696239560.5777507,"caller":"ec2/ec2.go:62","msg":"EC2 metadata unavailable","kind":"processor","name":"resourcedetection","pipeline":"logs","error":"EC2MetadataError: failed to make EC2Metadata request\nNot Found\n\n\tstatus code: 404, request id: "}
{"level":"debug","ts":1696239560.578286,"caller":"azure/azure.go:47","msg":"Azure detector metadata retrieval failed","kind":"processor","name":"resourcedetection","pipeline":"logs","error":"Azure IMDS replied with status code: 404 Not Found"}
{"level":"info","ts":1696239560.578477,"caller":"internal/resourcedetection.go:139","msg":"detected resource information","kind":"processor","name":"resourcedetection","pipeline":"logs","resource":{"cloud.account.id":"xxxx","cloud.platform":"gcp_kubernetes_engine","cloud.provider":"gcp","cloud.region":"xxxx","host.id":"xxxx","host.name":"xxxx","k8s.cluster.name":"xxxx","os.type":"linux"}}
...
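Based on the debug logs above, a possible workaround (an assumption on my side, not something verified in this thread) would be to limit the detector list to the platforms actually in use, since the warning appears to come from the eks/ec2 detectors probing the EC2 instance metadata service. A minimal sketch for a GKE-only deployment:

    processors:
      resourcedetection:
        # Only probe detectors relevant on GKE; dropping eks, ec2, and azure
        # should avoid the IMDS call that produces the IMDSv1 fallback warning.
        detectors:
          - env
          - gcp
          - system
        timeout: 2s
        override: false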

Contributor

github-actions bot commented Dec 4, 2023

This issue has been inactive for 60 days. It will be closed in 60 days if there is no activity. To ping code owners by adding a component label, see Adding Labels via Comments, or if you are unsure of which component this issue relates to, please ping @open-telemetry/collector-contrib-triagers. If this issue is still relevant, please ping the code owners or leave a comment explaining why it is still relevant. Otherwise, please close it.

Pinging code owners:

See Adding Labels via Comments if you do not have permissions to add labels yourself.

@github-actions github-actions bot added the Stale label Dec 4, 2023
@mx-psi mx-psi added never stale Issues marked with this label will be never staled and automatically removed and removed Stale labels Dec 4, 2023
@matej-g
Contributor

matej-g commented Dec 18, 2023

This seems to be coming from the AWS SDK in https://github.com/aws/aws-sdk-go/blob/394d04f7e36b85532cede3eb815a6a23413b2eaa/aws/ec2metadata/token_provider.go#L68 - this part of the code does not respect the logging decision (for the AWS client, logging should be off by default). In some environments (we have also noticed this with GKE Autopilot users) that call can return a 404 or 403 response, and the log is then printed straight to stdout.

Besides trying to fix this upstream, we could override the logger with a custom AWS logger that discards all logs (and thus stays true to the "logging off" level that should be the default). Like the other providers, I think the EC2 detector should fail silently (or with a debug log) when we cannot obtain the metadata (e.g. because we're not running on EC2).
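A minimal sketch of the discard-logger idea. The one-method interface mirrors the shape of aws-sdk-go v1's `aws.Logger` so the snippet compiles without the SDK dependency; with the real SDK, the logger would be wired in via `aws.NewConfig().WithLogger(...)` when building the session.

```go
package main

import "fmt"

// Logger mirrors the aws-sdk-go v1 aws.Logger interface: a single
// variadic Log method that the SDK calls for all of its log output.
type Logger interface {
	Log(args ...interface{})
}

// discardLogger satisfies the interface but drops every message, which
// keeps the client true to its "logging off by default" setting.
type discardLogger struct{}

func (discardLogger) Log(args ...interface{}) {}

func main() {
	var l Logger = discardLogger{}
	// The IMDSv1-fallback warning would be routed here and silently
	// dropped instead of being printed straight to stdout.
	l.Log("WARN: failed to get session token, falling back to IMDSv1")
	fmt.Println("done")
}
```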

@matej-g
Contributor

matej-g commented Jan 9, 2024

This will be resolved by merging #30341, since it has already been fixed upstream.

@mx-psi mx-psi linked a pull request Jan 9, 2024 that will close this issue
1 task