What happened:
aws eks wait cluster-active may get rate-limited (TooManyRequestsException), causing the bootstrap script to terminate instead of falling back to the retry logic around aws eks describe-cluster.
What you expected to happen:
The describe-cluster call should be retried the desired number of times, despite rate-limiting errors.
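For illustration, the expected behaviour could look roughly like the wrapper below. This is only a sketch, not the code in the bootstrap script: `retry_with_backoff` and `RETRY_BASE_DELAY` are hypothetical names, and the wrapper retries on any non-zero exit status rather than inspecting the error for TooManyRequestsException.

```shell
#!/usr/bin/env bash

# retry_with_backoff MAX_ATTEMPTS COMMAND [ARGS...]
# Hypothetical helper: runs COMMAND until it succeeds or MAX_ATTEMPTS is
# reached, doubling the sleep between attempts. Any non-zero exit status
# (rate limiting, connect timeout, ...) counts as a retryable failure.
retry_with_backoff() {
  local max_attempts="$1"
  shift
  # Base delay in seconds; RETRY_BASE_DELAY is an assumed override knob.
  local delay="${RETRY_BASE_DELAY:-1}"
  local attempt=1

  while true; do
    if "$@"; then
      return 0
    fi
    if [ "$attempt" -ge "$max_attempts" ]; then
      echo "giving up after ${attempt} attempts: $*" >&2
      return 1
    fi
    echo "attempt ${attempt} failed, retrying in ${delay}s" >&2
    sleep "$delay"
    delay=$((delay * 2))
    attempt=$((attempt + 1))
  done
}

# Example usage (placeholder cluster name and region):
# retry_with_backoff 5 aws eks wait cluster-active \
#   --name my-cluster --region us-west-2
# retry_with_backoff 5 aws eks describe-cluster \
#   --name my-cluster --region us-west-2
```

Wrapping the wait call the same way as the describe-cluster call means a single throttled request no longer ends the script, at the cost of a longer worst-case startup time.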
We're facing a similar issue where aws eks wait cluster-active fails due to a transient timeout with the AWS API and then a node gets stuck without joining the cluster (which has other knock-on effects, wedging cluster-autoscaler).
2024-03-20T15:00:46+0000 [eks-bootstrap] INFO: --b64-cluster-ca or --apiserver-endpoint is not defined, describing cluster...
Connect timeout on endpoint URL: "https://eks.us-west-2.amazonaws.com/clusters/eks-prod-us-west-2"
Exited with error on line 358
It seems like the patch in #1004 would fix our problem, but it appears it was closed after sitting for a long time.
The best thing to do here is to pass --apiserver-endpoint and --b64-cluster-ca and avoid the DescribeCluster call entirely. This fallback mechanism has been removed in our AL2023 AMIs.
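As a rough sketch of that suggestion, user data could call the bootstrap script with both values supplied up front. The `run_bootstrap` function and the `BOOTSTRAP_SCRIPT` variable are hypothetical names used for illustration, the default path is an assumption about where the script lives on the AMI, and the endpoint and CA values in the usage comment are placeholders.

```shell
#!/usr/bin/env bash

# Location of the bootstrap script. This default is an assumption;
# override BOOTSTRAP_SCRIPT if the script lives elsewhere on your AMI.
BOOTSTRAP_SCRIPT="${BOOTSTRAP_SCRIPT:-/etc/eks/bootstrap.sh}"

# run_bootstrap CLUSTER_NAME APISERVER_ENDPOINT B64_CLUSTER_CA
# Hypothetical wrapper: passes the endpoint and CA explicitly so the
# script has no reason to describe the cluster via the AWS API.
run_bootstrap() {
  local cluster_name="$1"
  local endpoint="$2"
  local b64_ca="$3"

  "$BOOTSTRAP_SCRIPT" "$cluster_name" \
    --apiserver-endpoint "$endpoint" \
    --b64-cluster-ca "$b64_ca"
}

# Example usage with placeholder values, typically templated into user
# data by whatever provisions the node group:
# run_bootstrap my-cluster \
#   "https://EXAMPLE.us-west-2.eks.amazonaws.com" \
#   "BASE64_ENCODED_CA_DATA"
```

The endpoint and CA data only need to be looked up once, when the launch template or user data is generated, rather than by every node at boot.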
(relayed from an internal ticket)