Randomly fails to create google_sql_database_instance due to timing issue #13091
Comments
Hey @mogronalol! Thanks for reporting the bug. I've been chasing this for the past few weeks (#12436 was written specifically to help track this down) so it's always good to have an extra datapoint. I'm pretty sure this is an upstream bug, and I'll work with Google to get it addressed. (Opening a bug on https://issuetracker.google.com is on my to-do list.) If you don't mind, I'd love to know how many SQL instances are defined in your config. (Does it reliably work on the config you provided? Or only sometimes?) I have a hunch that the error is more likely to occur the more instances you define in your config, but can't corroborate that yet. Definitely interested in getting this resolved, either upstream, through a workaround on our end, or both.
I only have a single instance defined. One thing I've noticed since raising this is that if I manually create the project my template works fine. But, if the project is created with Terraform, I get this weird race condition. I know that sounds strange but I am wondering if project generation with the upstream API is producing some sort of invalid project state. It's just strange because I spent a good part of today implementing a fix by retrying on a 404, but that unfortunately did not work. It's really painful as these timing bugs are always hard to diagnose. Adding print statements to try and figure it out was impossible because a failure wasn't deterministic.
Thanks for the extra info. I've filed this as issue 36656107 in the Google issue tracker. I'll keep an eye on it, and see if they can suggest any workarounds. :) Sorry for the trouble!
Hi there,
Terraform Version
Terraform v0.9.1
Affected Resource(s)
Terraform Configuration Files
Debug Output
https://gist.github.com/mogronalol/c8c308e17b390023939d85ba9c5853a3
Expected Behavior
Successfully retrieve the instance creation operation, then block until it is complete.
Actual Behavior
A 404, because it does a GET too quickly, so the operation does not exist yet:
terraform/builtin/providers/google/resource_sql_database_instance.go
Line 570 in bfdeae0
Steps to Reproduce
Please list the steps required to reproduce the issue, for example:
terraform apply
This does not always fail. It will pretty much never fail if you add a sleep before:
terraform/builtin/providers/google/resource_sql_database_instance.go
Line 570 in bfdeae0
What does not make sense is that if I change SqlAdminOperationWaiter.RefreshFunc to retry on a 404 instead of failing, it will still be a 404 after five attempts. But a simple sleep before calling RefreshFunc for the first time means that there is no 404, so I'm a little stumped.