-
Notifications
You must be signed in to change notification settings - Fork 2.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[processor/resourcedetection] The ec2 detector does not retry on initial failure #35936
Comments
Pinging code owners:
See Adding Labels via Comments if you do not have permissions to add labels yourself. |
Is this specific to ec2, or does this apply to other resource detectors as well? |
We have only seen this in ec2, but I assume it's something that potentially can be seen in other detectors. The problem is that we assume that not all configured detectors are expected to provide data. We allow not applicable detectors to silently fail at the start time. I believe the solution would be an additional backoff retry configuration option. If that's enabled, the processor will keep retrying. Maybe we should block the collector from starting until it succeeds or retries are exhausted. |
Specifically in this case, the code triggered is: opentelemetry-collector-contrib/processor/resourcedetectionprocessor/internal/aws/ec2/ec2.go Lines 80 to 83 in 1d12566
|
This issue has been inactive for 60 days. It will be closed in 60 days if there is no activity. To ping code owners by adding a component label, see Adding Labels via Comments, or if you are unsure of which component this issue relates to, please ping Pinging code owners:
See Adding Labels via Comments if you do not have permissions to add labels yourself. |
Component(s)
processor/resourcedetection/internal/aws/ec2
What happened?
Description
The ec2 resource detection processor attempts to connect to the metadata endpoint of AWS to get ec2 machine information on startup of the collector. If the connection to the metadata endpoint fails, the detector gives up. In some cases, the collector may run ahead of the network being ready on the machine, which may cause the collector to fail to get the information.
Steps to Reproduce
Shut down the network interface of the EC2 AWS instance
Start the collector on the EC2 AWS instance
Notice the ec2 detector reports an error in logs
Start the network interface of the EC2 instance
Expected Result
The collector should eventually add resource attributes with ec2 metadata
Actual Result
The collector never changes the output
Collector version
0.112.0
Environment information
Environment
OS: (e.g., "Ubuntu 20.04")
Compiler(if manually compiled): (e.g., "go 14.2")
OpenTelemetry Collector configuration
No response
Log output
No response
Additional context
No response
The text was updated successfully, but these errors were encountered: