-
Notifications
You must be signed in to change notification settings - Fork 295
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Implement Amazon's IMDSv2 #249
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I know it's not ready for review, but just found it in the review queue in email, so just had a quick look.
v3/internal/utilization/aws.go
Outdated
ret = nil | ||
err = unexpectedAWSErr{e: fmt.Errorf("error contacting AWS IMDSv2 token endpoint, %v", err)} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should return here, if an error happened, right?
ret = nil | |
err = unexpectedAWSErr{e: fmt.Errorf("error contacting AWS IMDSv2 token endpoint, %v", err)} | |
return nil, unexpectedAWSErr{e: fmt.Errorf("error contacting AWS IMDSv2 token endpoint, %v", err)} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Or, maybe you could create a request to awsEndpoint
before calling getAWSToken
and add the token to the request's headers only if err != nil
. This would make IMDSv2 optional in case someone wants to stick with IMDSv1 (if that is even possible).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for noticing this! After thinking harder about it, we shouldn't throw an unExpectedAWSErr at all, we're expecting a timeout unless they're in AWS. Fixing this and the test.
A more thorough implementation would raise errors on other HTTP errors, but I'm deferring that work in the interest of time.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Regarding the extra request to awsEndpoint, I'd rather avoid making three calls to AWS and increasing the overhead by another 0.5 second on startup. As of now, IMDSv2 endpoints are always available -- the only control the customer has is to turn IMDSv1 off to help their security posture.
v3/internal/utilization/aws.go
Outdated
awsTokenEndpointPath = "/latest/api/token" | ||
awsEndpoint = "http://" + awsHostname + awsEndpointPath | ||
awsTokenEndpoint = "http://" + awsHostname + awsTokenEndpointPath | ||
awsTokenTTL = "60" // seconds this AWS utiliation session will last |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
awsTokenTTL = "60" // seconds this AWS utiliation session will last | |
awsTokenTTL = "60" // seconds this AWS utilization session will last |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed.
This implements a newer method of authentication to AWS instance metadata endpoints.
Under IMDSv1, the agent simply called to an endpoint and received JSON with information about the instance size, availability zone, and other facts.
This newer IMDSv2 implements a session protocol. The agent calls two endpoints: one to get a session token that's valid for a specified time, and then the original endpoint with the session token to get the instance information.
As of this writing, AWS normally accepts both IMDSv1 and IMDSv2. A customer can tell AWS to only allow IMDSv2, so we're fine if we simply perform IMDSv2 calls exclusively.
Tests for these changes will need to be higher-level integration tests, since they rely on AWS services. Let me know if you think otherwise!
Links
Amazon's IMDSv2 Documentation
New Relic Java Agent Implementation
Details
One decision still in flight is what the timeout values should be for each call to the two endpoints. Since the original timeout was one second, I've split the difference and set the timeout for both calls to 500 milliseconds. These timeouts could be reduced, since the endpoints are reported to be very fast by the documentation above (< 10ms). Other New Relic language agents have other implementations, including a 100 millisecond timeout, and a back-off scheme using 15, 30, 60, 120, and 300 milliseconds.