err="rpc error: code = Unavailable desc = transport is closing" #16073
Comments
@joshghent Thanks for raising this issue. |
@ewbankkit Thanks for getting back to me. No it doesn't have any stack trace. It appears to error because it cannot find the route53 zones and then the aws provider returns |
I am experiencing the same error, and I also have a module whose outputs are being fed into another module. |
I have this same error. Took down one of our environments and we can't bring it back up. I don't happen to use route53 so @joshghent's theory doesn't make sense for me personally. |
hello I don't know if I have the same problem, it seems that I do, here is the output I get
|
Because of this error - my Lambda is triggered twice instead of once :( Same issue with provider version 3.4.0. |
I see the same issue on
This is also on a M1 Mac, could be important |
Receiving the same error when feeding a module that calls another module output from another module on provider 3.23.0.
My debug shows the plan completed before the error.
|
Getting the same issue with aws provider 3.24.1 (also tested with 3.23.0, 3.22.0, and 3.21.0, all failing) and tf 0.12.29. The issue is inconsistent: it fails only for one remote state (stage in our case) and not for the others. The provider version configurations are the same. It also fails at different places: I can see the kubernetes provider failing, and the next time it is a random provider. I've also deleted all non-default workspaces as suggested in the linked issue, but no change. The trace log shows just the following, and I could find randomly failing providers with the rpc error in an earlier part of the log.
Also no change whether I am using proxy or not. Is it possible that changes applied with the aws.provider 3.24.0 have somehow messed up the state somewhere? (regarding #17125 ) I can also see a lot of rpc errors of three kinds:
Terraform refresh is crashing around different resources than plan, however, both of them are crashing always around IAM policy attachments "aws_iam_role_policy_attachment". But it does not fail if I target these resources with plan. |
Our issue has been resolved. It was a data resource which was looking for a deprecated AMI. I am only wondering why I could not find the exact source of the issue in the TRACE log of Terraform. |
Hello, Version 3.26 Version 2.70
So it looks like a logging regression since user is now unable to find quickly what is the missing "data" resource. |
The lack of configuration file/line context when reporting data source errors is likely due to upgrading to Terraform Plugin SDK version 2 in version 3.2.0 of the Terraform AWS Provider. See also: hashicorp/terraform-plugin-sdk#561. That issue will require a fix upstream in the Terraform Plugin SDK. We could potentially improve the actual error message for data sources, though, to at least give some context about what type of resource lookup failed. It might be good to fold these improvements into the discussion for #17314, since data source updates could be done quicker than managed resources. Trying to solve the confusing |
@bflad I've tested the two versions of aws provider (3.26 & 2.70) using the following two version of Terraform CLI : 0.13.1 & 0.14.5. The behavior is the same. |
Confirmed with aws provider 3.25 & TF 0.14.5 seems like a logging regression as luhhujbb commented on. |
I have thumbs upped this, but wanted to add that I would commit a bounty of a couple hundred dollars for someone who can fix this ASAP. I've been struggling for weeks with this problem. I've been trying terraform 0.13 and 0.14 of various versions, and various versions of the aws provider. The problem for me is intermittent: an "apply" just hangs for a very long time (~5m) while gathering facts, and with trace logging on I just see a bunch of this rpc error transport is closing. I'm having to run terraform apply multiple times before it finally goes through, and meanwhile this is frustrating me to no end. This is impacting my daily life as I live and breathe Terraform. I would be available to try dev/beta builds as well and give feedback, as I've tried everything besides jumping into code at this point and I'm still frustrated and wasting tons of time. I suspect that a logging fix alone would not resolve this, but if it gave me more clarity on why my runs fail so regularly, I could fix the actual problem, whatever that may be. Cheers, hoping for a speedy resolution! |
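Several comments here describe digging through trace output for the rpc error. As a rough aid, the sketch below lists each line containing "transport is closing" together with the few lines that preceded it, which is often where the failing provider or data source shows up. It assumes the log was captured to a file by setting TF_LOG=TRACE and TF_LOG_PATH before running Terraform; the function name and the default context size are made up for this illustration.

```python
import sys
from collections import deque
from typing import Iterable, List, Tuple

# The error text from the title of this issue.
RPC_ERROR = "transport is closing"


def find_rpc_errors(lines: Iterable[str], context: int = 3) -> List[Tuple[int, List[str]]]:
    """Return (line_number, preceding_lines) for every line containing the rpc error.

    `context` is how many earlier lines to keep for each hit (hypothetical default).
    """
    recent = deque(maxlen=context)  # rolling window of the most recent lines
    hits = []
    for number, line in enumerate(lines, start=1):
        line = line.rstrip("\n")
        if RPC_ERROR in line:
            hits.append((number, list(recent)))
        recent.append(line)
    return hits


if __name__ == "__main__":
    # Usage: python find_rpc_errors.py <path to the file named in TF_LOG_PATH>
    with open(sys.argv[1], encoding="utf-8", errors="replace") as handle:
        for number, before in find_rpc_errors(handle):
            print(f"rpc error at line {number}; preceding lines:")
            for earlier in before:
                print(f"    {earlier}")
```

This only narrows down where to look. As noted above, the provider may not log the underlying cause at all.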
For the record, I also had this behaviour when using a module that was searching for an AMI. I used it in my infrastructure 11 times and I guess it was too many for Terraform. I deactivated the module and no more errors. |
Can this be related to AWS upgrading minimum TLS version for some endpoints? https://aws.amazon.com/blogs/security/tls-1-2-required-for-aws-fips-endpoints/ |
I'm facing a similar issue here: Terraform versions I'm using:
Here's the last part of a TF_LOG file for a run where the error occurred:
|
I've also begun getting this issue on latest Terraform using the latest AWS plugin while running on Intel Mac. Only happening to some of my modules. As my modules depend on each other, the credentials, terminal session, env variables are identical. It happens across an assortment of AWS resources. I also find it interesting that the release and acquire lock steps succeed 100% of the time despite making DynamoDB calls that could in theory fail for the same reason. Found this possible hint as well:
I've also seen TLS transport errors across several runs, though my current runs while filling out this post will not oblige with examples. Despite this being a For example:
Since this is my first deep dive into Terraform trace in a while, I'm not sure how much noise like this is expected. But since it's only happening in some of my modules, I'm led to assume state is somehow factoring in. Lastly, this plan should include no changes to the resources themselves. The only changes have been to Terraform and AWS plugin versions. Rolling back to the versions used successfully does not seem to help. (0.14.5 and 3.25.0) I hope these additional details help the team. Happy to also be a resource to provide and troubleshoot on this issue. |
I encountered a similar error but it was related to this issue which is caused by a bug in the cloudflare provider version Pinning the cloudflare provider to |
I can confirm, I needed to update Terraform from 0.12 to 0.14, and I had to rollback the Cloudflare plugin from The error message was related to an AWS EC2 instance, but it really was the Cloudflare plugin rollback that fixed the issue. |
@cdimitroulas You are my hero mate. Hopefully this also addresses OP's concern. |
Wow. I've been fighting this issue for days, undeploying my modules, one by one, hoping to find the problematic one, and it was Cloudflare the whole time. Pinning to |
I got a similar error message when deploying a Lambda function. In my case the error happened because my Lambda function name was longer than 64 characters [1]. After shortening the function name, the error disappeared. |
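Since the provider may only surface the generic rpc error in this case, a name-length check before running apply can catch it earlier. The sketch below is a minimal illustration that uses the 64-character limit quoted in the comment above; the function name and the example names are hypothetical.

```python
# Limit taken from the comment above (Lambda function names over 64 characters fail).
LAMBDA_NAME_MAX = 64


def check_lambda_name(name: str, limit: int = LAMBDA_NAME_MAX) -> None:
    """Raise ValueError if `name` is longer than `limit` characters."""
    if len(name) > limit:
        raise ValueError(
            f"Lambda function name is {len(name)} characters, limit is {limit}: {name!r}"
        )


if __name__ == "__main__":
    # Hypothetical names, e.g. built from a prefix, environment, and service.
    for candidate in ("orders-api-prod-handler", "x" * 70):
        try:
            check_lambda_name(candidate)
            print(f"ok: {candidate}")
        except ValueError as error:
            print(f"too long: {error}")
```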
I just got this when it was trying to set an S3 bucket policy to one that was invalid because it had a zero length array of principals. I was only able to work that out by setting
|
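For the invalid bucket policy case described above, a small check on the rendered policy JSON can flag statements with an empty principal list before apply. This is a sketch under assumptions: it expects the policy as a JSON string in the usual IAM layout (a "Statement" object or array, with "Principal" either a string or a map of lists), and the function names are invented for this example.

```python
import json
from typing import Any, List


def _has_empty_principal(principal: Any) -> bool:
    """True if the Principal value is an empty list or maps any key to an empty list."""
    if isinstance(principal, list):
        return len(principal) == 0
    if isinstance(principal, dict):
        return any(isinstance(value, list) and len(value) == 0 for value in principal.values())
    return False


def empty_principal_statements(policy_json: str) -> List[int]:
    """Return the indexes of statements whose Principal contains a zero-length array."""
    policy = json.loads(policy_json)
    statements = policy.get("Statement", [])
    if isinstance(statements, dict):  # a single statement may be given as an object
        statements = [statements]
    return [
        index
        for index, statement in enumerate(statements)
        if _has_empty_principal(statement.get("Principal"))
    ]
```

An empty result does not mean the policy is valid. It only rules out this one cause.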
@luhhujbb Thank you for this. Rolling back the AWS provider version to 2.70 gave me much better log messages. |
I seem to have the same issue using terraform And all this is happening after Azure Pipeline says
Edit: This error doesn't seem to break the |
Facing same issue on doing
I was following guide from https://learn.hashicorp.com/tutorials/terraform/cdktf-build-python?in=terraform/cdktf and it did not work for me. terraform: v1.0.9 |
Exactly same issue as @soumyadipDe |
Switched back to CDKTF@0.6.4 and the issue was solved |
I am facing the same issue Terraform Version
When running a terraform plan, it outputs:
Fails on the following request
Error logs:
No modifications were made since the last successful apply. Update: running plan from the old working branch works, and where the master branch fails, this passes with the following error:
|
Setting up this GODEBUG environment variable before launching terraform fixes it for me.
Running Terraform 0.14.11
|
I have the feeling that this could be related to the instance class that is being used. Increased my instance class from t2.micro to t3.small |
Also export
|
Had this same issue today when I was trying to rename a CodeBuild project. Turns out the new name is just too long, solved the issue by using a shorter name. I know this might not be the cause of this issue for everyone, but just adding it here in case it'd help anyone. I'm using:
|
I am getting this same issue.
I'm not using cloudflare. 2022-05-31T22:02:03.798Z [DEBUG] opening new ssh session However, the app actually deploys correctly on both instances. I get the following, where you can see the execution on both instances was successful, then, instead of printing the outputs, it prints the null_resource local-exec error: null_resource.ExecuteAnsible[1]: Creation complete after 7m57s [id=7711336021038131401] |
I am also getting same error while testing my brand new terraform-provider with resource. |
I'm also getting errors that look like they could be plugin related. Has anyone solved this issue? The biggest problem is that it's so intermittent. Is there a way to change timeouts here? Timeouts are the only reason I can think of that would cause this issue intermittently.
|
Hello all, for those of you encountering the error message In the Kubernetes case, the error is often associated with running out of memory. On M1 Macs, it appears that there is an issue in Rosetta 2 (the system that allows running If you are encountering this error on platforms other than Kubernetes or M1 Macs, please include the output of |
Hi @gdavison , I was encountering this on AWS CodeBuild. Like with Kubernetes, the issue was the build process was running out of memory. I increased the memory available and that resolved it. Thanks for getting back to me anyway. |
In my case it was a migration from one Mac to another using Apple's Migration Assistant. The Assistant copied over all my files, including the terraform project, and I had to delete its .terraform folder, then run terraform init, and it worked like a charm. |
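The fix described above (delete the project's .terraform folder, then run terraform init) can be scripted if it has to be repeated across several projects. This is only a sketch: the function name and the `run_init` flag are made up for the example, and it assumes the `terraform` binary is on the PATH.

```python
import shutil
import subprocess
from pathlib import Path


def reset_terraform_dir(project_dir: str, run_init: bool = True) -> bool:
    """Delete <project_dir>/.terraform and optionally re-run `terraform init`.

    Returns True if a .terraform directory existed and was removed.
    """
    cache = Path(project_dir) / ".terraform"
    removed = cache.is_dir()
    if removed:
        shutil.rmtree(cache)  # drops cached provider binaries and module copies
    if run_init:
        # Re-download providers and modules for the current machine.
        subprocess.run(["terraform", "init"], cwd=project_dir, check=True)
    return removed
```

The state file is not stored inside .terraform by default, but with a local backend it is worth confirming where state lives before deleting anything.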
I'm going to close this issue. Many of the older reports of this error have been resolved in the provider. I've created the pinned issue #27577 to highlight the M1 Mac and Kubernetes issues. If you are still seeing this error after updating to the latest version of the provider, please open a separate issue |
This problem appeared out of the blue after everything had worked perfectly fine for a couple of hours, and then it never worked again. I'm using TF Cloud in different workspaces, some with no new changes. To be honest, I assumed that something had changed on the AWS side, since it started to fail after 5 PM GMT-3 on December first, as if something had been deprecated, but I could not find anything about it. Here are my logs and configuration if it helps:
|
@agustinlare : I would bet that you're running out of resources on whatever platform you're building on. Are you using Codebuild? If so, increase memory and CPU allocation and see if that helps. |
@arunAtCSGI that would be a safe bet, I had that issue before with CodeBuild. Not this time though: the problem was with the EKS module, and it failed almost instantly (which was different from when it was a resources issue, because that ended up timing out). This morning it started working as if nothing had happened, with absolutely nothing changed. |
Ah I see. Thanks for posting back though. This error seems to have many issues related to it |
As @gdavison mentioned earlier, closing this issue as many of the issues it encompasses have been resolved. |
I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues. |
Community Note
Terraform CLI and Terraform AWS Provider Version
Affected Resource(s)
Terraform Configuration Files
Won't post the whole thing but think this gets the point across
Debug Output
Panic Output
Expected Behavior
Actual Behavior
Steps to Reproduce
terraform plan -compact-warnings=false -input=false
Important Factoids
Nothing weird about the infra. You can see in the main.tf snippet that I'm passing an output from the variables module into the ecs-api module. This domain name is then used in a data source to find a route53 zone. Further up the debug output, there is an error saying it can't find the route53 zone. I believe this is because the variables module output isn't getting correctly passed in. The ecs module should depend on the variables module, but it appears to be getting executed before the variables module outputs are available.
References