
In AWS EKS optimized AL2023, the cloudwatch-agent pod can't pull EC2 metadata. #31843

Closed
kyle504 opened this issue Mar 20, 2024 · 4 comments

Comments


kyle504 commented Mar 20, 2024

Component(s)

internal/aws

What happened?

Description

In AL2023, the EKS optimized AMI is released with the EC2 instance metadata (IMDS) hop limit defaulting to 1. You can check the update in the note section of the documentation.

But the cloudwatch-agent pod uses OpenTelemetry to collect monitoring data, so the request path is cloudwatch-agent pod -> EC2 node -> IMDS endpoint, which needs a hop limit of 2. With the default of 1, OpenTelemetry can't reach the metadata service; a sketch of the IMDS flow involved is shown below.
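For illustration only (this is not the agent's actual code), here is a minimal Go sketch of the IMDSv2 flow the collector depends on: a PUT to request a session token, then a GET for metadata using that token. The hop limit caps how far the PUT response can travel, so with the AL2023 default of 1 the response never makes it across the extra hop into the pod, which matches the failing Put "http://169.254.169.254/latest/api/token" in the log output below.

```go
// Minimal sketch of the IMDSv2 token flow, for illustration only.
package main

import (
	"fmt"
	"io"
	"net/http"
	"strings"
	"time"
)

func main() {
	client := &http.Client{Timeout: 2 * time.Second}

	// Step 1: request an IMDSv2 session token.
	tokenReq, _ := http.NewRequest(http.MethodPut,
		"http://169.254.169.254/latest/api/token", strings.NewReader(""))
	tokenReq.Header.Set("X-aws-ec2-metadata-token-ttl-seconds", "21600")
	tokenResp, err := client.Do(tokenReq)
	if err != nil {
		// From a pod on a node with hop limit 1, this is where it times out.
		fmt.Println("token request failed:", err)
		return
	}
	defer tokenResp.Body.Close()
	token, _ := io.ReadAll(tokenResp.Body)

	// Step 2: use the token to read metadata (e.g. the hostname).
	metaReq, _ := http.NewRequest(http.MethodGet,
		"http://169.254.169.254/latest/meta-data/hostname", nil)
	metaReq.Header.Set("X-aws-ec2-metadata-token", string(token))
	metaResp, err := client.Do(metaReq)
	if err != nil {
		fmt.Println("metadata request failed:", err)
		return
	}
	defer metaResp.Body.Close()
	hostname, _ := io.ReadAll(metaResp.Body)
	fmt.Println("hostname:", string(hostname))
}
```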

Steps to Reproduce

  1. Create EKS cluster
  2. Create managed node group with AL2023
  3. Install cloudwatch observability add-on

Expected Result

Performance logs should be collected and delivered to CloudWatch.

Actual Result

Can't collect performance data.

Collector version

v0.89.0

Environment information

Environment

OS: AL2023, cloudwatch-agent:1.300034.1b536
Compiler(if manually compiled): (e.g., "go 14.2")

OpenTelemetry Collector configuration

No response

Log output

2024-03-19T08:06:00.955410089Z stdout F I! imds retry client will retry 1 timesD! should retry true for imds error : RequestError: send request failed
2024-03-19T08:06:01.990573705Z stdout F caused by: Put "http://169.254.169.254/latest/api/token": context deadline exceeded (Client.Timeout exceeded while awaiting headers)D! should retry true for imds error : RequestError: send request failed
2024-03-19T08:06:01.990682581Z stdout F caused by: Put "http://169.254.169.254/latest/api/token": context deadline exceeded (Client.Timeout exceeded while awaiting headers)D! could not get hostname without imds v1 fallback enable thus enable fallback
2024-03-19T08:06:05.132329297Z stdout F E! [EC2] Fetch hostname from EC2 metadata fail: EC2MetadataError: failed to make EC2Metadata request
2024-03-19T08:06:05.132353411Z stdout F
2024-03-19T08:06:05.132357507Z stdout F         status code: 401, request id:
2024-03-19T08:06:06.132963089Z stdout F D! should retry true for imds error : RequestError: send request failed
2024-03-19T08:06:07.181617984Z stdout F caused by: Put "http://169.254.169.254/latest/api/token": context deadline exceeded (Client.Timeout exceeded while awaiting headers)D! should retry true for imds error : RequestError: send request failed
2024-03-19T08:06:07.181653952Z stdout F caused by: Put "http://169.254.169.254/latest/api/token": context deadline exceeded (Client.Timeout exceeded while awaiting headers)D! could not get instance document without imds v1 fallback enable thus enable fallback
2024-03-19T08:06:10.358260881Z stdout F E! [EC2] Fetch identity document from EC2 metadata fail: EC2MetadataRequestError: failed to get EC2 instance identity document
2024-03-19T08:06:10.358298644Z stdout F caused by: EC2MetadataError: failed to make EC2Metadata request
2024-03-19T08:06:10.358303547Z stdout F
2024-03-19T08:06:10.358307135Z stdout F         status code: 401, request id:

Additional context

No response

@kyle504 kyle504 added the bug (Something isn't working) and needs triage (New item requiring triage) labels Mar 20, 2024

Pinging code owners:

See Adding Labels via Comments if you do not have permissions to add labels yourself.


kyle504 commented Mar 20, 2024

I found that it can be fixed by changing the EC2 node hop limit from 1 to 2, but that requires a launch template or some other workaround. Is there any plan to change the default? A sketch of the workaround is shown below.
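As an illustration of that workaround (not an official fix), a minimal sketch using the AWS SDK for Go v2 to raise the hop limit on a single running instance; the instance ID is a placeholder. For a managed node group, the same HttpPutResponseHopLimit setting would normally go in the launch template's MetadataOptions instead.

```go
// Sketch: raise the IMDS hop limit to 2 on one instance so pods can reach IMDS.
// Assumes credentials with the ec2:ModifyInstanceMetadataOptions permission.
package main

import (
	"context"
	"log"

	"github.com/aws/aws-sdk-go-v2/aws"
	"github.com/aws/aws-sdk-go-v2/config"
	"github.com/aws/aws-sdk-go-v2/service/ec2"
)

func main() {
	cfg, err := config.LoadDefaultConfig(context.TODO())
	if err != nil {
		log.Fatal(err)
	}
	client := ec2.NewFromConfig(cfg)

	// "i-0123456789abcdef0" is a placeholder instance ID.
	_, err = client.ModifyInstanceMetadataOptions(context.TODO(), &ec2.ModifyInstanceMetadataOptionsInput{
		InstanceId:              aws.String("i-0123456789abcdef0"),
		HttpPutResponseHopLimit: aws.Int32(2),
	})
	if err != nil {
		log.Fatal(err)
	}
	log.Println("hop limit raised to 2")
}
```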

@atoulme atoulme removed the needs triage (New item requiring triage) label Mar 30, 2024

This issue has been inactive for 60 days. It will be closed in 60 days if there is no activity. To ping code owners by adding a component label, see Adding Labels via Comments, or if you are unsure of which component this issue relates to, please ping @open-telemetry/collector-contrib-triagers. If this issue is still relevant, please ping the code owners or leave a comment explaining why it is still relevant. Otherwise, please close it.

Pinging code owners:

See Adding Labels via Comments if you do not have permissions to add labels yourself.

@github-actions github-actions bot added the Stale label May 30, 2024

This issue has been closed as inactive because it has been stale for 120 days with no activity.

@github-actions github-actions bot closed this as not planned (Won't fix, can't repro, duplicate, stale) Jul 29, 2024