-
Notifications
You must be signed in to change notification settings - Fork 11
In-place update for VPC firewall rules #432
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
sudomateo
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This change looks good to me. I just want to run down the desired flow quickly to make sure I have it right.
- oxidecomputer/omicron#8175 will allow
rulesto benullin the request body (viaomitemptyin the Go SDK) and then default to[]in the API handler. Right nownulldoesn't work because the omicron API doesn't allow that. - https://github.com/oxidecomputer/oxide.go does not need to be updated since the change in omicron doesn't change the output from the generation in the Go SDK.
- The Terraform code uses an empty slice (not nil!) for
ruleswhich will serialize tonullwhich is now accepted by point 1. The removal of requires replace allows the update method to be called.
Yep, but this change doesn't affect this PR per se. More like I noticed this resource was broken when working on this. It's broken since we made the refactoring PR.
Not really, no.
Yep! |
|
Instead of waiting for the latest version of omicron to be deployed for testing I updated the Go SDK in oxidecomputer/oxide.go#289 to set |
Add a test to check whether an `oxide_vpc_firewall_rules` resource can successfully remove all rules from a VPC during its `Update` call.
Support removing all VPC firewall rules through the `Update` call. This
change relies on the changes in the upstream Go SDK to serialize the VPC
firewall rules as `{"rules": []}` when they are removed from Terraform.
3162ad0 to
b458c03
Compare
| if rules == nil { | ||
| return body | ||
| } | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No longer needed because the range below will not run when nil.
Update the lint tool versions to be compatible with Go 1.24.
95e7847 to
ce564af
Compare
b76985d to
c3585ce
Compare
|
Oh CI/CD, why do you do this to me? The version of Go used to install When I run |
819f105 to
9a220bc
Compare
|
Getting closer! I can see Go 1.23.9 is being used to compile |
|
Running |
9a220bc to
d6de320
Compare
d6de320 to
0fcdddd
Compare
|
The CI/CD issue occurred because the step runs |
| run: make test | ||
| - name: lint | ||
| run: sudo make lint | ||
| run: make lint |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
|
@karencfv I know you're the author of this so I can't add you as a reviewer but I pushed a few commits to get this over the line. Can you give this a once over when you have a chance? Here's the test output. |
karencfv
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good thanks! That CI issue looks annoying to debug, glad you got to the bottom of it! I don't mind removing that sudo. That code predates me, so not entirely sure why it was sudo to begin with.
It appears I don't have the ability to approve this PR because I opened it. Do you mind approving it?
|
Ah! just saw you approved it. You know what? It's missing a changelog entry, one sec let me get that in |
Added acceptance tests to surface the issue reported in #453. Updated the schema for `rules` from `schema.SetNestedAttribute` to `schema.ListNestedAttribute` to better work with how the Oxide API treats VPC firewall rules. See below for the reason why this was required. In #432 we removed the requires replace directive from the `vpc_firewall_rules` resource to support in-place modification. The intent was to prevent unnecessary network connectivity issues while VPC firewall rules are deleted and created anew (e.g., when the delete succeeds and the create fails). With this change, `Update` is now called rather than `Delete` followed by `Create`. However, this change was also a bit of a workaround since the Oxide API does not support in-place updates of VPC firewall rules, at least not in the manner than Terraform expects. Instead, when you replace VPC firewall rules using the Oxide API each rule will be given a unique ID. ``` > curl --silent --request PUT --header "Authorization: Bearer $OXIDE_TOKEN" --header "Content-Type: application/json" $OXIDE_HOST/v1/vpc-firewall-rules?project=matthewsanabria&vpc=example --data @/tmp/rules.json {"rules":[{"id":"0852c8c8-e1d4-4dd4-84a7-19946eae0313","name":"allow-https","description":"Allow HTTPS.","time_created":"2025-06-20T19:40:22.514499Z","time_modified":"2025-06-20T19:40:22.514499Z","status":"enabled","direction":"inbound","targets":[{"type":"subnet","value":"example"}],"filters":{"hosts":[{"type":"vpc","value":"example"}],"protocols":["TCP"],"ports":["443"]},"action":"allow","priority":50,"vpc_id":"b7b7b444-1c64-4cb5-82d0-f1e4404a100c"},{"id":"bac0f44f-1835-4943-9944-08e6c328bbea","name":"allow-ssh","description":"Allow SSH (new).","time_created":"2025-06-20T19:40:22.514501Z","time_modified":"2025-06-20T19:40:22.514501Z","status":"enabled","direction":"inbound","targets":[{"type":"subnet","value":"example"}],"filters":{"hosts":[{"type":"vpc","value":"example"}],"protocols":["TCP"],"ports":["22"]},"action":"allow","priority":50,"vpc_id":"b7b7b444-1c64-4cb5-82d0-f1e4404a100c"}]} > curl --silent --request GET --header "Authorization: Bearer $OXIDE_TOKEN" --header "Content-Type: application/json" $OXIDE_HOST/v1/vpc-firewall-rules?project=matthewsanabria&vpc=example | jq -r '.rules[] | pick(.id, .name)' { "id": "0852c8c8-e1d4-4dd4-84a7-19946eae0313", "name": "allow-https" } { "id": "bac0f44f-1835-4943-9944-08e6c328bbea", "name": "allow-ssh" } ``` Even if you issue another `PUT` request with the exact same request body Oxide will still generate new IDs for the rules. ``` > curl --silent --request PUT --header "Authorization: Bearer $OXIDE_TOKEN" --header "Content-Type: application/json" $OXIDE_HOST/v1/vpc-firewall-rules?project=matthewsanabria&vpc=example --data @/tmp/rules.json {"rules":[{"id":"74658b3c-f3f9-4e36-985c-f9259c81c98a","name":"allow-https","description":"Allow HTTPS.","time_created":"2025-06-20T19:40:47.171866Z","time_modified":"2025-06-20T19:40:47.171866Z","status":"enabled","direction":"inbound","targets":[{"type":"subnet","value":"example"}],"filters":{"hosts":[{"type":"vpc","value":"example"}],"protocols":["TCP"],"ports":["443"]},"action":"allow","priority":50,"vpc_id":"b7b7b444-1c64-4cb5-82d0-f1e4404a100c"},{"id":"91b353f1-8025-4c89-9d28-760c2c5719dd","name":"allow-ssh","description":"Allow SSH (new).","time_created":"2025-06-20T19:40:47.171868Z","time_modified":"2025-06-20T19:40:47.171868Z","status":"enabled","direction":"inbound","targets":[{"type":"subnet","value":"example"}],"filters":{"hosts":[{"type":"vpc","value":"example"}],"protocols":["TCP"],"ports":["22"]},"action":"allow","priority":50,"vpc_id":"b7b7b444-1c64-4cb5-82d0-f1e4404a100c"}]} > curl --silent --request GET --header "Authorization: Bearer $OXIDE_TOKEN" --header "Content-Type: application/json" $OXIDE_HOST/v1/vpc-firewall-rules?project=matthewsanabria&vpc=example | jq -r '.rules[] | pick(.id, .name)' { "id": "74658b3c-f3f9-4e36-985c-f9259c81c98a", "name": "allow-https" } { "id": "91b353f1-8025-4c89-9d28-760c2c5719dd", "name": "allow-ssh" } ``` Let's walk through why this is a problem for Terraform. I used Terraform to create two VPC firewall rules as shown below. ``` > jq -r '.resources[1].instances[].attributes.rules[] | pick(.id, .name)' terraform.tfstate { "id": "12bbb9ca-3360-4f9f-ad5a-ada3cbb736c4", "name": "allow-https" } { "id": "5d83771f-76fd-421f-a785-3726b6310389", "name": "allow-ssh" } ``` I then updated the port for the `allow-ssh` rule from 22 to 23. The Terraform plan shows that the resource will be updated in place by deleting the `allow-ssh` rule with ID `5d83771f-76fd-421f-a785-3726b6310389` and creating a new one. It will not touch the `allow-https` rule at all. ``` > terraform apply oxide_vpc.example: Refreshing state... [id=b7b7b444-1c64-4cb5-82d0-f1e4404a100c] oxide_vpc_firewall_rules.example: Refreshing state... [id=c75d1e61-9737-4abd-be16-8837920e2344] Terraform used the selected providers to generate the following execution plan. Resource actions are indicated with the following symbols: ~ update in-place Terraform will perform the following actions: # oxide_vpc_firewall_rules.example will be updated in-place ~ resource "oxide_vpc_firewall_rules" "example" { ~ id = "c75d1e61-9737-4abd-be16-8837920e2344" -> (known after apply) ~ rules = [ - { - action = "allow" -> null - description = "Allow SSH." -> null - direction = "inbound" -> null - filters = { - hosts = [ - { - type = "vpc" -> null - value = "example" -> null }, ] -> null - ports = [ - "22", ] -> null - protocols = [ - "TCP", ] -> null } -> null - id = "5d83771f-76fd-421f-a785-3726b6310389" -> null - name = "allow-ssh" -> null - priority = 50 -> null - status = "enabled" -> null - targets = [ - { - type = "subnet" -> null - value = "example" -> null }, ] -> null - time_created = "2025-06-20 20:30:10.992084 +0000 UTC" -> null - time_modified = "2025-06-20 20:30:10.992084 +0000 UTC" -> null }, + { + action = "allow" + description = "Allow SSH." + direction = "inbound" + filters = { + hosts = [ + { + type = "vpc" + value = "example" }, ] + ports = [ + "23", ] + protocols = [ + "TCP", ] } + id = (known after apply) + name = "allow-ssh" + priority = 50 + status = "enabled" + targets = [ + { + type = "subnet" + value = "example" }, ] + time_created = (known after apply) + time_modified = (known after apply) }, # (1 unchanged element hidden) ] # (1 unchanged attribute hidden) } Plan: 0 to add, 1 to change, 0 to destroy. ``` Applying this plan causes the error noted in this issue. Looking closer at the error text we see Terraform is referring to the `allow-https` rule with ID `12bbb9ca-3360-4f9f-ad5a-ada3cbb736c4`, not the `allow-ssh` rule. ``` Do you want to perform these actions? Terraform will perform the actions described above. Only 'yes' will be accepted to approve. Enter a value: yes oxide_vpc_firewall_rules.example: Modifying... [id=c75d1e61-9737-4abd-be16-8837920e2344] ╷ │ Error: Provider produced inconsistent result after apply │ │ When applying changes to oxide_vpc_firewall_rules.example, provider "provider[\"registry.terraform.io/oxidecomputer/oxide\"]" │ produced an unexpected new value: .rules: planned set element │ cty.ObjectVal(map[string]cty.Value{"action":cty.StringVal("allow"), "description":cty.StringVal("Allow HTTPS."), │ "direction":cty.StringVal("inbound"), │ "filters":cty.ObjectVal(map[string]cty.Value{"hosts":cty.SetVal([]cty.Value{cty.ObjectVal(map[string]cty.Value{"type":cty.StringVal("vpc"), │ "value":cty.StringVal("example")})}), "ports":cty.SetVal([]cty.Value{cty.StringVal("443")}), │ "protocols":cty.SetVal([]cty.Value{cty.StringVal("TCP")})}), "id":cty.StringVal("12bbb9ca-3360-4f9f-ad5a-ada3cbb736c4"), │ "name":cty.StringVal("allow-https"), "priority":cty.NumberIntVal(50), "status":cty.StringVal("enabled"), │ "targets":cty.SetVal([]cty.Value{cty.ObjectVal(map[string]cty.Value{"type":cty.StringVal("subnet"), │ "value":cty.StringVal("example")})}), "time_created":cty.StringVal("2025-06-20 20:07:00.207241 +0000 UTC"), │ "time_modified":cty.StringVal("2025-06-20 20:07:00.207241 +0000 UTC")}) does not correlate with any element in actual. │ │ This is a bug in the provider, which should be reported in the provider's own issue tracker. ``` Why does Terraform care about the `allow-https` rule here at all if the plan was never meant to touch it? It's because before the plan Terraform had the following in its state. ``` > jq -r '.resources[1].instances[].attributes.rules[] | pick(.id, .name)' terraform.tfstate { "id": "12bbb9ca-3360-4f9f-ad5a-ada3cbb736c4", "name": "allow-https" } { "id": "9c723c19-16ce-489e-9c6d-23daf2ebeb1c", "name": "allow-ssh" } ``` Terraform expected after the plan was applied the state would look like this. The `allow-https` rule would remain untouched and the `allow-ssh` rule would be completely new with a new ID. ``` > jq -r '.resources[1].instances[].attributes.rules[] | pick(.id, .name)' terraform.tfstate { "id": "12bbb9ca-3360-4f9f-ad5a-ada3cbb736c4", "name": "allow-https" } { "id": "935e19c0-9036-419b-b615-d68665aed048", "name": "allow-ssh" } ``` Instead, the Oxide API generated new IDs for all the rules, so Terraform got back the following state. ``` > jq -r '.resources[1].instances[].attributes.rules[] | pick(.id, .name)' terraform.tfstate { "id": "32f2e026-f258-4512-bd68-79b613a8b5de", "name": "allow-https" } { "id": "935e19c0-9036-419b-b615-d68665aed048", "name": "allow-ssh" } ``` Now Terraform was left wondering what happened to the original `allow-https` rule with ID `12bbb9ca-3360-4f9f-ad5a-ada3cbb736c4`. Since Terraform couldn't find that original rule from the plan in the new state it returned an error.
Taking a look at the resource docs they already say "Firewall rules defined by this resource are considered exhaustive and will overwrite any other firewall rules for the VPC once applied."
Not sure if this needs more clarification? WDYT @sudomateo?
Waiting on an API change to fully run acceptance tests
Fixes: #431