Create New Resource Group: Status=404 Code="ResourceGroupNotFound" #18268
Comments
@pkirch It seems like a dependency issue. Add a `depends_on` that points at the resource group to the resources that use it. I hope this helps. |
I am currently having the same issue with Azure resource groups. I get maybe one or two of these messages from the terraform output: Before I get the same 404 error as above:
The annoying thing, though, is that if I check the subscription, the resource group does actually exist, so it does get created. It also doesn't happen repeatably: we have a variety of different Terraform projects, each creating their own resource group(s), and this failure seems to happen arbitrarily for different projects. |
I am also experiencing this issue fairly consistently with azurerm 3.99 (also tried various versions from 3.70 onward). A debug of the deployment shows that the resource is created. The final call to the API for validation responds with not found. A subsequent apply says that the resource exists and must be imported into state.
Not Working Deployment
2024-04-17T17:22:21.476Z [DEBUG] provider.terraform-provider-azurerm_v3.70.0_x5: AzureRM Request:
2024-04-17T17:22:21.615Z [DEBUG] provider.terraform-provider-azurerm_v3.70.0_x5: AzureRM Request:
Our code deploys 12 resource groups via module calls. One or two of the RGs fail consistently. It is not consistent which RGs fail. We have about a 10% success rate on apply. This code worked consistently a month ago. |
I'm encountering a similar issue with azurerm version 3.95. Like you, I've tried various versions from 3.70 onward with no success. During deployment, the resource is created but the final API call for validation consistently returns 'not found', resulting in errors and failure to write to state. We're deploying multiple resource groups via module calls, and about 10% of the time, a few of these RGs consistently fail. Strangely, the failing RGs vary with each deployment. This behavior is inconsistent with what we experienced a month ago when the code worked reliably. I'd appreciate any insights or suggestions on how to resolve this issue. Thanks! |
I am using azurerm 3.83.0 and we started getting this a couple of days ago. Our build servers run Ubuntu, and our builds that create resource groups are not having much success. Sometimes the resource group is created and sometimes it isn't, but we are pretty consistently getting the NotFound error either way. When running locally on Windows, my latest configuration worked fine the first time. I am not sure yet whether there is any correlation between OSs, though; there is not enough data. I am thinking about opening a ticket with Microsoft. With so many provider versions being affected, did one of their APIs change? |
We’re having this issue as well. When creating multiple resource groups in parallel, the creation of one resource group or another fails inconsistently. The error message is a 404: the resource group could not be found. We have tried provider versions 3.54.0 to 3.99.0 and the issue persists. The same code was working about two weeks ago, which is when we started experiencing this issue. |
We are also having it while creating multiple resource groups in parallel. Tested provider version 3.100 and issue persists. |
I enabled trace logging and was seeing some weird behavior. I was on terraform version 1.3.6. After updating to 1.8.1 I have not seen this issue again. |
Been getting this error for over a week now. Tried upgrading terraform to 1.8.1. Still having issues with resource group creation. Getting Resource group still gets created but doesn't get saved into state. To me it seems like the same issue but with a different error message. |
Hi @katbyte, |
Happening for us too - we have had to disable Azure in our dev environment due to this issue |
Been having the exact same problem here inconsistently for the past couple of weeks, always using the latest AzureRM provider version available at the time. We have a suite of tests that runs in CI in Azure DevOps, which executes a bunch of tests (terraform init/plan/apply/destroy) when a PR is created and it contains a modification to one of our modules. This sometimes means that we may have 30+ tests queued in our ADO pipeline. I just launched a test suite, and with 8 parallel jobs running (1 job == terraform init/plan/apply/destroy performed on one test suite), 5 failed immediately after trying to create the RG, with the exact same issue. I now have 8 currently-running tests that managed to create the resource group just fine and proceeded with the rest. |
This sometimes also happened to me whilst running the identical TF code over and over. |
Here's an update I got from MSFT support.
|
@favoretti That response is inaccurate, because in our case we're working with ONLY one region and still seeing the error. |
ARM loadbalancer will send you places. It's not related to the region where you are creating the resources. |
@zoelfakar1 same here. Our use case for deploying tests/examples deploys everything in a single region, in a single subscription. The very first and essentially only pre-step of deploying our tests/examples generates a 4-character random string in Terraform and then creates the resource group. Once the resource group is created using a So in our case, we're not even trying to manage/create anything other than the resource group, and it still fails at least 20% of the time, whether we're running one or more tests/examples at a time in our CI pipeline. |
Same for us. We are only deploying to Canada Central. |
We have also noticed a delay in the resource group creation in the Azure portal (i.e. it gets created after the terraform apply finishes or errors out). That explains why the final API call for validation returns "Resource group 'XXX-XXX' could not be found." (it did not exist at that point). |
+1, same issue. |
I'm working on a "fix" for this. So far it seems to work, I'm going to run an overnight test for this, after which we can discuss merging it upstream. |
To address comments that are referring to "deployments to a single region". ARM API itself is multi-region. Each request that provider sends to the API can potentially create a new HTTP session, which means session consistency on ARM backend won't help. Provider, as a consequence, will error out. Subsequent TF run will give you an error that resource group already exists and requires import, most likely because it takes just a couple more seconds for the data to be reconciled across azure backend databases. The kludge I added will just retry Hope this helps clarify the issue and attempted workaround. |
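For readers who want to see the general shape of this kind of workaround, here is a minimal Python sketch of retrying a read that returns 404 right after a successful create. It is not the provider's code (the provider is written in Go, and what exactly the merged workaround retries is not spelled out in this thread). `FlakyBackend`, `NotFound` and `create_and_confirm` are hypothetical names that simulate an eventually consistent backend.

```python
import time


class NotFound(Exception):
    """Stands in for an HTTP 404 on the resource group read (hypothetical)."""


class FlakyBackend:
    """Toy eventually consistent store: the first few reads miss a new group."""

    def __init__(self, stale_reads):
        self.stale_reads = stale_reads  # number of reads that will still miss
        self.groups = {}

    def create(self, name):
        self.groups[name] = {"name": name}

    def get(self, name):
        if name not in self.groups:
            raise NotFound(name)
        if self.stale_reads > 0:
            # Simulate a replica that has not caught up with the write yet.
            self.stale_reads -= 1
            raise NotFound(name)
        return self.groups[name]


def create_and_confirm(backend, name, attempts=5, delay=0.0, sleep=time.sleep):
    """Create a group, then retry the read until it is visible.

    Returns (group, number_of_reads). Re-raises NotFound once `attempts`
    reads have all missed.
    """
    backend.create(name)
    for attempt in range(1, attempts + 1):
        try:
            return backend.get(name), attempt
        except NotFound:
            if attempt == attempts:
                raise
            sleep(delay)


backend = FlakyBackend(stale_reads=2)
group, reads = create_and_confirm(backend, "rg-example")
print(group["name"], reads)  # rg-example 3
```

The retry is bounded on purpose: a group that never becomes visible should still surface as an error instead of hanging the run.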
I suspect there has been a change to the ARM service. From my own experience and that of others who have commented on this issue, the recent problem started appearing on the 12th of April. I agree it’s probably an eventual consistency problem. I wrote a Python script which loops through creating, reading and deleting resource groups, printing the response headers and whenever there is a 404 returned for Get resource group the Upon discovering this I opened a support ticket with MS. This might be an intended change, in which case the azurerm provider will need to adapt. The provider code has not been touched for a long time and the API version is fixed to a deprecated version of the Go SDK. |
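A probe loop of the kind described above could look roughly like the following. This is only a sketch, not the commenter's script: `probe` takes a hypothetical `send(method, name)` transport that returns `(status, headers)`, and `make_fake_send` is a stand-in that fakes a 404 in chosen rounds so the example runs without Azure credentials. The header name `x-example-request-id` is made up for illustration.

```python
def probe(send, name, rounds):
    """Create, read and delete a resource group `rounds` times.

    `send(method, name)` is a hypothetical transport returning
    (status, headers). Returns the headers of every GET that answered 404.
    """
    misses = []
    for i in range(rounds):
        send("PUT", name)
        status, headers = send("GET", name)
        print(i, status, headers)  # log every read so misses can be compared
        if status == 404:
            misses.append(headers)
        send("DELETE", name)
    return misses


def make_fake_send(miss_on):
    """Fake transport: the GET in the listed rounds (0-based) answers 404."""
    state = {"round": 0}

    def send(method, name):
        if method == "GET":
            current = state["round"]
            state["round"] += 1
            status = 404 if current in miss_on else 200
            # Made-up header, only here so there is something to print.
            return status, {"x-example-request-id": f"req-{current}"}
        return (201 if method == "PUT" else 200), {}

    return send


misses = probe(make_fake_send({1}), "rg-probe", rounds=3)
print(len(misses))  # 1
```

Against the real service, `send` would wrap authenticated HTTP calls, and the collected headers are what one would attach to a support ticket.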
Azure support got back to me and said the product group made a fix to over the weekend. |
They might, however that's not the first time this issue resurfaces unfortunately. Also, my contacts reported nothing about a fix yet :) |
Started seeing this again yesterday on the 3.100.0 Azure Terraform provider. |
Try 3.102.0 please - that's where my workaround got merged. Would be interested in hearing if it helps. |
I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues. |
Is there an existing issue for this?
Community Note
Terraform Version
1.2.5
AzureRM Provider Version
3.15.1
Affected Resource(s)/Data Source(s)
azurerm_resource_group
Terraform Configuration Files
Debug Output/Panic Output
Expected Behaviour
A new Azure resource group should be created reliably, without error, and should exist after creation.
Actual Behaviour
Deployment stopped with error mentioned.
Error occurred sporadically. Our logs show 12 failures in 662 runs.
Failures happened only within a certain time window, from 2022-07-27 02:57 p.m. to 2022-07-28 09:49 a.m.
I expect this issue is hard to troubleshoot from the data given. However, we hope filing this issue helps others in case it happens sporadically again.
As the failures happened a few weeks ago, the Terraform version and AzureRM Provider version stated are the ones in use when the errors occurred.
Steps to Reproduce
We have a GitHub Actions workflow executing the following commands. (complete file)
Important Factoids
No response
References
Issues that seem similar but were closed and/or fixed a long time ago.