Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support the EC2 Metadata Server #115

Closed

Conversation

sjperkins
Copy link
Contributor

TODO

  • Support expiration of credentials. EC2 Metadata server returns access key, secret key and a session token that expires.
  • Handle more failure cases in the Mock

@sjperkins sjperkins marked this pull request as draft September 21, 2023 16:52
@sjperkins
Copy link
Contributor Author

It might be simplest to add an expiration date to the AwsCredentials struct, default-initialised toabsl::InfiniteFuture.

/// Holds S3 credentials
///
/// Contains the access key, secret key and session token.
/// An empty access key implies anonymous access,
/// while the presence of a session token implies the use of
/// short-lived STS credentials
/// https://docs.aws.amazon.com/STS/latest/APIReference/welcome.html
struct AwsCredentials {
  /// AWS_ACCESS_KEY_ID
  std::string access_key;
  /// AWS_SECRET_KEY_ID
  std::string secret_key;
  /// AWS_SESSION_TOKEN
  std::string session_token;
  bool IsAnonymous() const { return access_key.empty(); }
};

@sjperkins
Copy link
Contributor Author

Some of the reference url's were useful for identifying edge cases, but may provide too much detail.

// Requests to the above server block outside AWS
// Configure a timeout small enough not to degrade performance outside AWS
// but large enough inside AWS
static constexpr absl::Duration kConnectTimeout = absl::Milliseconds(200);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

include any directly named types. I notice at least, in this file:

string
absl/time
absl/strings/str_cat
tensorstore/json_binding

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done, I think the existing tensorstore/internal/json_binding/* imports covers the last case.

Copy link
Collaborator

@laramiel laramiel Sep 21, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are a lot more includes missing.

Look at the using declarations, for example. Anything that you spell out, such as 'tensorstore::Result' needs an import in each file where it is spelled (include-what-you-use); we don't rely on transitive includes for any of those cases. We only permit transitive includes for non-spelled field access and auto type use.

json_binding is different, though. Those includes are needed to satisfy the need for type-specializations to be visible where they are used, such as bindings for absl::Time.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, thanks for pointing this out. I've gone through the file on a IWYU basis and adjusted the #includes accordingly.

jb::Member("AccessKeyId", jb::Projection(&EC2CredentialsResponse::AccessKeyId)),
jb::Member("SecretAccessKey", jb::Projection(&EC2CredentialsResponse::SecretAccessKey)),
jb::Member("Token", jb::Projection(&EC2CredentialsResponse::Token)),
jb::Member("Expiration", jb::Projection(&EC2CredentialsResponse::Expiration))
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suspect that, except for Code, all of these should be optional to encourage success in json parsing, when Code != "Success".

A sample from https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/iam-roles-for-amazon-ec2.html#instance-metadata-security-credentials

{
  "Code" : "Success",
  "LastUpdated" : "2012-04-26T16:39:16Z",
  "Type" : "AWS-HMAC",
  "AccessKeyId" : "ASIAIOSFODNN7EXAMPLE",
  "SecretAccessKey" : "wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY",
  "Token" : "token",
  "Expiration" : "2017-05-17T15:09:54Z"
}

I'll try find the schema.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've made all members optional except except Code and added a test case for the Code == SomethingOtherThanSuccess case.

std::shared_ptr<internal_http::HttpTransport> transport)
: transport_(std::move(transport)) {}

Result<AwsCredentials> GetCredentials() override;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems like we may want to convert this from Result<AwsCredentials> GetCredentials() to Future<AwsCredentials> GetCredentials().

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could do so. I guess this would imply returning a Future in the AwsCredentialProvider interface too?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Those are separate issues, but perhaps. Also, defer that work if you want until we get this done.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems like we may want to convert this from Result<AwsCredentials> GetCredentials() to Future<AwsCredentials> GetCredentials().

I guess this so that the GetCredentials here can be converted into chained futures so as to not block the thread on which futures are submitted?

// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another include-order detail. If there is a .h file which directly correlates with the .cc file (a header with the declarations and the same name), then we include it before anything else.

@laramiel
Copy link
Collaborator

laramiel commented Oct 2, 2023

A note on ordering here. I think that doing finishing this prior to #119 is the way to go, actually. Then I think that it might better indicate if the refactoring there pulls it's weight--I suspect that some of it isn't very useful.

@laramiel
Copy link
Collaborator

laramiel commented Oct 4, 2023

I'm preparing a submit for this. I've added the following in aws_credential_provider.h

class EC2MetadataCredentialProvider : public AwsCredentialProvider {
 public:
   /// ... public members ...
 private:
  std::shared_ptr<internal_http::HttpTransport> transport_;
  absl::Mutex mutex_;
  AwsCredentials credentials_ ABSL_GUARDED_BY(mutex_);
  absl::Time timeout_ ABSL_GUARDED_BY(mutex_) = absl::InfinitePast();
};

// Returns whether the EC2 Metadata Server is available.
bool IsEC2MetadataServiceAvailable(internal_http::HttpTransport& transport);

And in aws_credential_provider.cc:

// Obtain a metadata token for communicating with the api server.
Result<absl::Cord> GetEC2ApiToken(internal_http::HttpTransport& transport) {
  auto token_request = HttpRequestBuilder("POST", kTokenUrl)
                           .AddHeader(kTokenTtlHeader)
                           .BuildRequest();

  TENSORSTORE_ASSIGN_OR_RETURN(
      auto token_response,
      transport
          .IssueRequest(token_request, {}, absl::Seconds(1), kConnectTimeout)
          .result());

  TENSORSTORE_RETURN_IF_ERROR(HttpResponseCodeToStatus(token_response));
  return std::move(token_response.payload);
}

}  // namespace

// Returns whether the EC2 Metadata Server is available.
bool IsEC2MetadataServiceAvailable(internal_http::HttpTransport& transport) {
  return GetEC2ApiToken(transport).ok();
}

// ... comment ...
Result<AwsCredentials> EC2MetadataCredentialProvider::GetCredentials() {
  absl::MutexLock l(&mutex_);
  if (absl::Now() + absl::Seconds(60) < timeout_) {
    return credentials_;
  }
  auto default_timeout = absl::Now() + kDefaultTimeout;

  // Obtain an API token for communicating with the EC2 Metadata server
  TENSORSTORE_ASSIGN_OR_RETURN(auto api_token, GetEC2ApiToken(*transport_));

@sjperkins
Copy link
Contributor Author

9ec5bc2 might be useful as it follows the go logic. It also handles the case of .../iam/security-credentials returning multiple iam roles separated by newline -- the first one is chosen.

copybara-service bot pushed a commit that referenced this pull request Oct 4, 2023
This provides initial support for S3 metadata server credentials.

#115

PiperOrigin-RevId: 570735987
Change-Id: Ic117a710665abe1e60dfdbb86692322f7f528617
@laramiel
Copy link
Collaborator

laramiel commented Oct 4, 2023

Ok, now you can rebase and move the ec2 metadata server work to #119

@sjperkins
Copy link
Contributor Author

Ok, now you can rebase and move the ec2 metadata server work to #119

Thanks. Will take a look.

@sjperkins sjperkins closed this Oct 11, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants