cloudwatch output plugin not working with ec2 instance profiles. #3474

Closed
sbalagopal opened this issue Nov 15, 2017 · 4 comments · Fixed by #3583
Labels
area/aws (AWS plugins including cloudwatch, ecs, kinesis), bug (unexpected problem or unintended behavior), regression (something that used to work, but is now broken)

@sbalagopal

Bug report

The CloudWatch output plugin cannot connect to CloudWatch from an EC2 machine whose instance profile allows the server to perform all operations on the CloudWatch service. The GetSessionToken call fails when it is made with the EC2 instance's session credentials. According to AWS, this is not allowed: "AccessDenied: Cannot call GetSessionToken with session credentials".

Relevant log output:

./telegraf --config telegraf.conf --debug
2017-11-14T07:59:08Z D! Attempting connection to output: cloudwatch
2017-11-14T07:59:08Z E! cloudwatch: Cannot use credentials to connect to AWS : AccessDenied: Cannot call GetSessionToken with session credentials
status code: 403, request id: b169b099-c911-11e7-b9c4-bd9935259c39
2017-11-14T07:59:08Z E! Failed to connect to output cloudwatch, retrying in 15s, error was 'AccessDenied: Cannot call GetSessionToken with session credentials
status code: 403, request id: b169b099-c911-11e7-b9c4-bd9935259c39'
2017-11-14T07:59:23Z E! cloudwatch: Cannot use credentials to connect to AWS : AccessDenied: Cannot call GetSessionToken with session credentials
status code: 403, request id: ba5c7d38-c911-11e7-b9c4-bd9935259c39
2017-11-14T07:59:23Z E! AccessDenied: Cannot call GetSessionToken with session credentials
status code: 403, request id: ba5c7d38-c911-11e7-b9c4-bd9935259c39

System info:

Telegraf v1.5.0~136c15b (git: master 136c15b)
CentOS Linux release 7.3.1611 (Core)

(The instance profile has all access allowed on cloudwatch service)

Steps to reproduce:

  1. Bring up an EC2 instance with an instance profile that allows all CloudWatch operations.
  2. Build and configure Telegraf to write metrics (mem, cpu, disk, etc.) to CloudWatch.
  3. Make sure the AWS tokens are not available as environment variables or in any configuration file for Telegraf, then run Telegraf. It should fail when the plugin makes the GetSessionToken call.
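
The fall-through that step 3 relies on can be sketched as follows. This is a simplified model, not the AWS SDK's actual provider chain: the `credentials` type, the `provider` function type, and the `empty` and `instanceProfile` helpers are all hypothetical names used for illustration. It only shows that when the earlier sources are empty, the credentials that remain come from the instance profile and carry a session token.

```go
package main

import (
	"errors"
	"fmt"
)

// credentials is a simplified, hypothetical model of AWS credentials.
type credentials struct {
	AccessKeyID     string
	SecretAccessKey string
	SessionToken    string // non-empty for temporary (session) credentials
	Source          string
}

// provider returns credentials, or an error if it has none to offer.
type provider func() (credentials, error)

var errNoCredentials = errors.New("no credentials found")

// resolve walks the providers in order and returns the first success.
func resolve(chain ...provider) (credentials, error) {
	for _, p := range chain {
		if c, err := p(); err == nil {
			return c, nil
		}
	}
	return credentials{}, errNoCredentials
}

// empty simulates a source with nothing configured, such as unset
// environment variables or a missing configuration file.
func empty() (credentials, error) {
	return credentials{}, errNoCredentials
}

// instanceProfile simulates the EC2 instance role, which supplies
// temporary credentials that include a session token. Values are fake.
func instanceProfile() (credentials, error) {
	return credentials{
		AccessKeyID:     "ASIAEXAMPLE",
		SecretAccessKey: "example-secret",
		SessionToken:    "example-token",
		Source:          "instance-profile",
	}, nil
}

func main() {
	c, err := resolve(empty, empty, instanceProfile)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("source=%s temporary=%t\n", c.Source, c.SessionToken != "")
}
```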

Expected behavior:

Telegraf should connect to CloudWatch and write metrics to the configured namespace.

Actual behavior:

Exits with an error message that says "Cannot use credentials to connect to AWS".

Additional info:

The actual connection to CloudWatch does work with an instance profile on EC2 servers. Only the validity check that uses "GetSessionToken" fails, and that failure makes the plugin give up. If I deliberately bypass the error check and continue, it works as expected and the metrics are posted to CloudWatch. The check should rely on a call that works with an instance profile on EC2 servers.

The previous version of Telegraf, 1.4.4, works fine with instance profiles because the "ListMetrics" call does work with an instance profile.

@arohter

arohter commented Nov 15, 2017

I believe use of http://docs.aws.amazon.com/STS/latest/APIReference/API_GetSessionToken.html is incompatible with IAM Instance Profile roles, so we can't use sts.GetSessionToken as a validation test when falling through to instance profile creds.
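
A minimal sketch of the rule behind that error, assuming the only thing that matters is whether the credentials carry a session token. `fakeGetSessionToken` is a hypothetical stand-in, not the SDK's `sts.GetSessionToken`; it makes no network call and just reproduces the error text from the report.

```go
package main

import (
	"errors"
	"fmt"
)

// credentials is a simplified, hypothetical model: temporary
// credentials (such as those from an instance profile) carry a
// session token, long-term IAM user credentials do not.
type credentials struct {
	AccessKeyID  string
	SessionToken string
}

var errSessionCreds = errors.New(
	"AccessDenied: Cannot call GetSessionToken with session credentials")

// fakeGetSessionToken imitates the behaviour seen in the log above:
// the call is rejected when made with session credentials.
func fakeGetSessionToken(c credentials) error {
	if c.SessionToken != "" {
		return errSessionCreds
	}
	return nil
}

func main() {
	longTerm := credentials{AccessKeyID: "AKIAEXAMPLE"}
	instanceRole := credentials{
		AccessKeyID:  "ASIAEXAMPLE",
		SessionToken: "example-token",
	}

	fmt.Println("long-term:", fakeGetSessionToken(longTerm))
	fmt.Println("instance role:", fakeGetSessionToken(instanceRole))
}
```

This is why a check built on GetSessionToken passes with static keys but fails as soon as the plugin falls through to instance profile credentials.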

@danielnelson danielnelson added area/aws AWS plugins including cloudwatch, ecs, kinesis bug unexpected problem or unintended behavior labels Nov 15, 2017
@danielnelson danielnelson added this to the 1.5.0 milestone Dec 13, 2017
@danielnelson
Contributor

I assume this was caused by #3335.

@adamchainz Do you have any thoughts on how to solve this?

Perhaps we should just remove this check and allow credential issues to be reported when we call PutMetricData.
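
A rough sketch of that option, assuming a narrowed, hypothetical `metricWriter` interface in place of the real CloudWatch client (the real SDK method takes an input struct, not these arguments). `Connect` performs no credential check, so a bad credential only surfaces on the first write.

```go
package main

import (
	"errors"
	"fmt"
)

// metricWriter is a hypothetical stand-in for the CloudWatch client.
type metricWriter interface {
	PutMetricData(namespace string, values map[string]float64) error
}

// output is a simplified model of the plugin, not Telegraf's type.
type output struct {
	namespace string
	client    metricWriter
}

// Connect does no up-front validation in this variant.
func (o *output) Connect() error {
	return nil
}

// Write reports credential problems when the metrics are sent.
func (o *output) Write(values map[string]float64) error {
	if err := o.client.PutMetricData(o.namespace, values); err != nil {
		return fmt.Errorf("cloudwatch: unable to write metrics: %w", err)
	}
	return nil
}

var errAccessDenied = errors.New("AccessDenied")

// fakeClient returns a preset error, or nil on success.
type fakeClient struct{ err error }

func (f fakeClient) PutMetricData(string, map[string]float64) error {
	return f.err
}

func main() {
	o := &output{namespace: "example", client: fakeClient{err: errAccessDenied}}
	fmt.Println("connect:", o.Connect())
	fmt.Println("write:", o.Write(map[string]float64{"cpu_usage": 1.5}))
}
```

The trade-off is that a misconfiguration is no longer caught at startup.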

@danielnelson danielnelson added the regression something that used to work, but is now broken label Dec 14, 2017
@adamchainz
Contributor

I made a mistake. It's https://docs.aws.amazon.com/STS/latest/APIReference/API_GetCallerIdentity.html that should be called to check that the credentials are valid. It doesn't require any permissions, afaik. I'll make a PR adding the check back in with this endpoint.
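
A minimal sketch of what such a check could look like, not the code from the eventual PR. The `identityAPI` interface and `callerIdentity` type are hypothetical simplifications; in the AWS SDK for Go the corresponding call is `GetCallerIdentity` on the STS client, which takes an input struct and returns an output struct. The fake lets the logic run without network access.

```go
package main

import (
	"errors"
	"fmt"
)

// callerIdentity is a simplified stand-in for what STS returns.
type callerIdentity struct {
	Account string
	Arn     string
}

// identityAPI is a hypothetical, narrowed view of the STS client.
type identityAPI interface {
	GetCallerIdentity() (callerIdentity, error)
}

// checkCredentials treats any successful identity lookup as proof
// that the credentials are usable, whether long-term or temporary.
func checkCredentials(api identityAPI) error {
	if _, err := api.GetCallerIdentity(); err != nil {
		return fmt.Errorf("cloudwatch: cannot use credentials to connect to AWS: %w", err)
	}
	return nil
}

var errInvalidToken = errors.New("InvalidClientTokenId")

// fakeSTS returns a preset error, or a made-up identity on success.
type fakeSTS struct{ err error }

func (f fakeSTS) GetCallerIdentity() (callerIdentity, error) {
	if f.err != nil {
		return callerIdentity{}, f.err
	}
	return callerIdentity{
		Account: "123456789012",
		Arn:     "arn:aws:sts::123456789012:assumed-role/example-role/i-0123456789abcdef0",
	}, nil
}

func main() {
	fmt.Println("valid:", checkCredentials(fakeSTS{}))
	fmt.Println("invalid:", checkCredentials(fakeSTS{err: errInvalidToken}))
}
```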

@danielnelson
Contributor

danielnelson commented Dec 14, 2017

Thanks @adamchainz, yeah, let's try to use this and we can do some testing to confirm.

4 participants