TQ: Support sled expunge via trust quorum pathway #9765
andrewjstone wants to merge 10 commits into main
Conversation
I tested this out by first trying to abort and watching it fail because
there is no trust quorum configuration. Then I issued an LRTQ upgrade,
which failed because I hadn't restarted the sled-agents to pick up the
LRTQ shares. Then I aborted that configuration, which was stuck in prepare.
Lastly, I successfully issued a new LRTQ upgrade after restarting the
sled-agents and watched it commit.
Here are the external API calls:
```
➜ oxide.rs git:(main) ✗ target/debug/oxide --profile recovery api '/v1/system/hardware/racks/ea7f612b-38ad-43b9-973c-5ce63ef0ddf6/membership/abort' --method POST
error; status code: 404 Not Found
{
"error_code": "Not Found",
"message": "No trust quorum configuration exists for this rack",
"request_id": "819eb6ab-3f04-401c-af5f-663bb15fb029"
}
error
➜ oxide.rs git:(main) ✗
➜ oxide.rs git:(main) ✗ target/debug/oxide --profile recovery api '/v1/system/hardware/racks/ea7f612b-38ad-43b9-973c-5ce63ef0ddf6/membership/abort' --method POST
{
"members": [
{
"part_number": "913-0000019",
"serial_number": "20000000"
},
{
"part_number": "913-0000019",
"serial_number": "20000001"
},
{
"part_number": "913-0000019",
"serial_number": "20000003"
}
],
"rack_id": "ea7f612b-38ad-43b9-973c-5ce63ef0ddf6",
"state": "aborted",
"time_aborted": "2026-01-29T01:54:02.590683Z",
"time_committed": null,
"time_created": "2026-01-29T01:37:07.476451Z",
"unacknowledged_members": [
{
"part_number": "913-0000019",
"serial_number": "20000000"
},
{
"part_number": "913-0000019",
"serial_number": "20000001"
},
{
"part_number": "913-0000019",
"serial_number": "20000003"
}
],
"version": 2
}
```
Here are the omdb calls:
```
root@oxz_switch:~# omdb nexus trust-quorum lrtq-upgrade -w
note: Nexus URL not specified. Will pick one from DNS.
note: using DNS from system config (typically /etc/resolv.conf)
note: (if this is not right, use --dns-server to specify an alternate DNS server)
note: using Nexus URL http://[fd00:17:1:d01::6]:12232
Error: lrtq upgrade
Caused by:
Error Response: status: 500 Internal Server Error; headers: {"content-type": "application/json", "x-request-id": "8503cd68-7ff4-4bf1-b358-0e70279c6347", "content-length": "124", "date": "Thu, 29 Jan 2026 01:37:09 GMT"}; value: Error { error_code: Some("Internal"), message: "Internal Server Error", request_id: "8503cd68-7ff4-4bf1-b358-0e70279c6347" }
root@oxz_switch:~# omdb nexus trust-quorum get-config ea7f612b-38ad-43b9-973c-5ce63ef0ddf6 latest
note: Nexus URL not specified. Will pick one from DNS.
note: using DNS from system config (typically /etc/resolv.conf)
note: (if this is not right, use --dns-server to specify an alternate DNS server)
note: using Nexus URL http://[fd00:17:1:d01::6]:12232
TrustQuorumConfig {
rack_id: ea7f612b-38ad-43b9-973c-5ce63ef0ddf6 (rack),
epoch: Epoch(
2,
),
last_committed_epoch: None,
state: PreparingLrtqUpgrade,
threshold: Threshold(
2,
),
commit_crash_tolerance: 0,
coordinator: BaseboardId {
part_number: "913-0000019",
serial_number: "20000000",
},
encrypted_rack_secrets: None,
members: {
BaseboardId {
part_number: "913-0000019",
serial_number: "20000000",
}: TrustQuorumMemberData {
state: Unacked,
share_digest: None,
time_prepared: None,
time_committed: None,
},
BaseboardId {
part_number: "913-0000019",
serial_number: "20000001",
}: TrustQuorumMemberData {
state: Unacked,
share_digest: None,
time_prepared: None,
time_committed: None,
},
BaseboardId {
part_number: "913-0000019",
serial_number: "20000003",
}: TrustQuorumMemberData {
state: Unacked,
share_digest: None,
time_prepared: None,
time_committed: None,
},
},
time_created: 2026-01-29T01:37:07.476451Z,
time_committing: None,
time_committed: None,
time_aborted: None,
abort_reason: None,
}
root@oxz_switch:~# omdb nexus trust-quorum get-config ea7f612b-38ad-43b9-973c-5ce63ef0ddf6 latest
note: Nexus URL not specified. Will pick one from DNS.
note: using DNS from system config (typically /etc/resolv.conf)
note: (if this is not right, use --dns-server to specify an alternate DNS server)
note: using Nexus URL http://[fd00:17:1:d01::6]:12232
TrustQuorumConfig {
rack_id: ea7f612b-38ad-43b9-973c-5ce63ef0ddf6 (rack),
epoch: Epoch(
2,
),
last_committed_epoch: None,
state: Aborted,
threshold: Threshold(
2,
),
commit_crash_tolerance: 0,
coordinator: BaseboardId {
part_number: "913-0000019",
serial_number: "20000000",
},
encrypted_rack_secrets: None,
members: {
BaseboardId {
part_number: "913-0000019",
serial_number: "20000000",
}: TrustQuorumMemberData {
state: Unacked,
share_digest: None,
time_prepared: None,
time_committed: None,
},
BaseboardId {
part_number: "913-0000019",
serial_number: "20000001",
}: TrustQuorumMemberData {
state: Unacked,
share_digest: None,
time_prepared: None,
time_committed: None,
},
BaseboardId {
part_number: "913-0000019",
serial_number: "20000003",
}: TrustQuorumMemberData {
state: Unacked,
share_digest: None,
time_prepared: None,
time_committed: None,
},
},
time_created: 2026-01-29T01:37:07.476451Z,
time_committing: None,
time_committed: None,
time_aborted: Some(
2026-01-29T01:54:02.590683Z,
),
abort_reason: Some(
"Aborted via API request",
),
}
root@oxz_switch:~# omdb nexus trust-quorum lrtq-upgrade -w
note: Nexus URL not specified. Will pick one from DNS.
note: using DNS from system config (typically /etc/resolv.conf)
note: (if this is not right, use --dns-server to specify an alternate DNS server)
note: using Nexus URL http://[fd00:17:1:d01::6]:12232
Started LRTQ upgrade at epoch 3
root@oxz_switch:~# omdb nexus trust-quorum get-config ea7f612b-38ad-43b9-973c-5ce63ef0ddf6 latest
note: Nexus URL not specified. Will pick one from DNS.
note: using DNS from system config (typically /etc/resolv.conf)
note: (if this is not right, use --dns-server to specify an alternate DNS server)
note: using Nexus URL http://[fd00:17:1:d01::6]:12232
TrustQuorumConfig {
rack_id: ea7f612b-38ad-43b9-973c-5ce63ef0ddf6 (rack),
epoch: Epoch(
3,
),
last_committed_epoch: None,
state: PreparingLrtqUpgrade,
threshold: Threshold(
2,
),
commit_crash_tolerance: 0,
coordinator: BaseboardId {
part_number: "913-0000019",
serial_number: "20000000",
},
encrypted_rack_secrets: None,
members: {
BaseboardId {
part_number: "913-0000019",
serial_number: "20000000",
}: TrustQuorumMemberData {
state: Unacked,
share_digest: None,
time_prepared: None,
time_committed: None,
},
BaseboardId {
part_number: "913-0000019",
serial_number: "20000001",
}: TrustQuorumMemberData {
state: Unacked,
share_digest: None,
time_prepared: None,
time_committed: None,
},
BaseboardId {
part_number: "913-0000019",
serial_number: "20000003",
}: TrustQuorumMemberData {
state: Unacked,
share_digest: None,
time_prepared: None,
time_committed: None,
},
},
time_created: 2026-01-29T02:20:03.848507Z,
time_committing: None,
time_committed: None,
time_aborted: None,
abort_reason: None,
}
root@oxz_switch:~# omdb nexus trust-quorum get-config ea7f612b-38ad-43b9-973c-5ce63ef0ddf6 latest
note: Nexus URL not specified. Will pick one from DNS.
note: using DNS from system config (typically /etc/resolv.conf)
note: (if this is not right, use --dns-server to specify an alternate DNS server)
note: using Nexus URL http://[fd00:17:1:d01::6]:12232
TrustQuorumConfig {
rack_id: ea7f612b-38ad-43b9-973c-5ce63ef0ddf6 (rack),
epoch: Epoch(
3,
),
last_committed_epoch: None,
state: Committed,
threshold: Threshold(
2,
),
commit_crash_tolerance: 0,
coordinator: BaseboardId {
part_number: "913-0000019",
serial_number: "20000000",
},
encrypted_rack_secrets: Some(
EncryptedRackSecrets {
salt: Salt(
[
143,
198,
3,
63,
136,
48,
212,
180,
101,
106,
50,
2,
251,
84,
234,
25,
46,
39,
139,
46,
29,
99,
252,
166,
76,
146,
78,
238,
28,
146,
191,
126,
],
),
data: [
167,
223,
29,
18,
50,
230,
103,
71,
159,
77,
118,
39,
173,
97,
16,
92,
27,
237,
125,
173,
53,
51,
96,
242,
203,
70,
36,
188,
200,
59,
251,
53,
126,
48,
182,
141,
216,
162,
240,
5,
4,
255,
145,
106,
97,
62,
91,
161,
51,
110,
220,
16,
132,
29,
147,
60,
],
},
),
members: {
BaseboardId {
part_number: "913-0000019",
serial_number: "20000000",
}: TrustQuorumMemberData {
state: Committed,
share_digest: Some(
sha3 digest: 13c0a6113e55963ed35b275e49df4c3f0b3221143ea674bb1bd5188f4dac84,
),
time_prepared: Some(
2026-01-29T02:20:46.792674Z,
),
time_committed: Some(
2026-01-29T02:21:49.503179Z,
),
},
BaseboardId {
part_number: "913-0000019",
serial_number: "20000001",
}: TrustQuorumMemberData {
state: Committed,
share_digest: Some(
sha3 digest: 8557d74f678fa4e8278714d917f14befd88ed1411f27c57d641d4bf6c77f3b,
),
time_prepared: Some(
2026-01-29T02:20:47.236089Z,
),
time_committed: Some(
2026-01-29T02:21:49.503179Z,
),
},
BaseboardId {
part_number: "913-0000019",
serial_number: "20000003",
}: TrustQuorumMemberData {
state: Committed,
share_digest: Some(
sha3 digest: d61888c42a1b5e83adcb5ebe29d8c6c66dc586d451652e4e1a92befe41719cd,
),
time_prepared: Some(
2026-01-29T02:20:46.809779Z,
),
time_committed: Some(
2026-01-29T02:21:52.248351Z,
),
},
},
time_created: 2026-01-29T02:20:03.848507Z,
time_committing: Some(
2026-01-29T02:20:47.597276Z,
),
time_committed: Some(
2026-01-29T02:21:52.263198Z,
),
time_aborted: None,
abort_reason: None,
}
```
After chatting with @davepacheco, I changed the authz checks in the datastore to do lookups with Rack scope. This fixed the test bug, but it is only a shortcut: trust quorum should have its own authz object, and I'm going to open an issue for that. Additionally, for methods that already took an authorized connection, I removed the unnecessary authz checks and the opctx parameter.
This commit adds a 3-phase mechanism for sled expungement.

The first phase is to remove the sled from the latest trust quorum configuration via omdb. The second phase is to reboot the sled after polling for the commit of the configuration with the trust quorum removal. The third phase is to issue the existing omdb expunge command, which changes the sled policy as before.

The first and second phases remove the need to physically remove the sled before expungement. They act as a software mechanism that gates the sled-agent from restarting on the sled and doing work when it should be treated as "absent". We've discussed this numerous times in the update huddle and it is finally arriving!

The third phase is what informs reconfigurator that the sled is gone, and it remains the same except for an extra sanity check that the last committed trust quorum configuration does not contain the sled that is to be expunged.

The removed sled may be added back to this rack or another after being clean slated. I tested this by deleting the files in the internal "cluster" and "config" directories and rebooting the removed sled in a4x2, and it worked.

This PR is marked draft because it changes the current sled-expunge pathway to depend on real trust quorum. We cannot safely merge it until the key-rotation work from #9737 is merged.

This also builds on #9741 and should merge after that PR.
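As a rough illustration of that phase-3 sanity check, here is a minimal sketch. The type alias and function name are invented for the example; the real code uses omicron's `BaseboardId` and the existing expunge pathway.

```
use std::collections::BTreeSet;

// Hypothetical stand-in for a baseboard identity: (part_number, serial_number).
type Baseboard = (String, String);

// Refuse to expunge a sled that is still a member of the last committed
// trust quorum configuration; the operator must remove it via the trust
// quorum pathway (and reboot it) first.
fn check_sled_removed_from_tq(
    committed_members: &BTreeSet<Baseboard>,
    sled: &Baseboard,
) -> Result<(), String> {
    if committed_members.contains(sled) {
        return Err(format!(
            "sled {sled:?} is still in the last committed trust quorum \
             configuration; remove it from trust quorum before expunging"
        ));
    }
    Ok(())
}
```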
dev-tools/omdb/src/bin/omdb/nexus.rs
Outdated
```
the Reconfigurator will not yet know the sled is expunged and may \
still try to use it.

Therefore, you must treat this action in conjunction with a reboot as \
```
This formatting is definitely off. We should consider what we want this to look like. My guess is just removing the `\` from the formatting altogether so we don't end up with very long lines and weird indents.
Maybe either:
- three separate `println!`s, so we're not mixing "long strings that need `\`-continuations" with "I actually want newlines here"
- `\` on every line, with explicit `\n` on lines where you want newlines

?
I did the separate println! version in 2c55d34
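For reference, the "three separate `println!`s" shape might look roughly like the sketch below. The wording is abridged and invented for the example; see 2c55d34 for the actual change.

```
fn main() {
    // One println! per paragraph: no trailing `\` continuations, and blank
    // lines between paragraphs become an explicit, empty println!.
    println!("About to start the trust quorum reconfiguration to remove the sled.");
    println!();
    println!("You can poll the trust quorum reconfiguration with:");
    println!("  omdb nexus trust-quorum get-config <RACK_ID> <EPOCH | latest>");
}
```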
dev-tools/omdb/src/bin/omdb/nexus.rs
Outdated
```
println!(
    "About to start the trust quorum reconfiguration to remove the sled.

If this operation with a timeout, please check the latest trust quorum \
```
If this operation with a timeout,
Missing a word here?
dev-tools/omdb/src/bin/omdb/nexus.rs
Outdated
```
expungement.

You can poll the trust quorum reconfiguration with \
`omdb nexus trust-quorum get-config <RACK_ID> <EPOCH | latest>`\n"
```
This last bit looks duplicated by the next println!
Good call. I removed the following println!
```
.context("trust quorum remove sled")?
.into_inner();

println!("Started trust quorum reconfiguration at epoch {epoch}\n");
```
Could / should we go into a polling loop here so the operator doesn't have to do it manually (in the happy path)?
I thought we discussed this and explicitly decided to make this operation not block. I'm not sure how we can tell the happy path from the sad path here besides a timeout, and I'm not sure what timeout to use.
While I would really like to make all these operations one shot and simple to use, they inherently are not that way.
Yeahhh I remember discussing that but then reading this I didn't remember why. No objection to landing this as-is. We could potentially add a "...wait for trust-quorum reconfig to complete ..." subcommand if manual polling is annoying, maybe?
I'd prefer to keep it as is for now, and see if we can go back and find a better mechanism for all this. I expect that we'll simplify as much as possible once we add an external command for expunge, but I'm open to changing this before that happens also.
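If a wait subcommand is ever added, its core could be a bounded polling loop along these lines. This is purely a hypothetical sketch: the state enum, names, and timeout handling are invented, and this PR deliberately does not include such a loop.

```
use std::time::{Duration, Instant};

#[derive(Debug, PartialEq)]
enum TqState {
    Preparing,
    Committing,
    Committed,
    Aborted,
}

// Poll until the configuration commits, aborts, or the timeout expires.
fn wait_for_commit(
    mut fetch_state: impl FnMut() -> TqState,
    poll_interval: Duration,
    timeout: Duration,
) -> Result<(), String> {
    let deadline = Instant::now() + timeout;
    loop {
        match fetch_state() {
            TqState::Committed => return Ok(()),
            TqState::Aborted => {
                return Err("trust quorum configuration was aborted".to_string());
            }
            TqState::Preparing | TqState::Committing => {}
        }
        if Instant::now() >= deadline {
            return Err("timed out waiting for trust quorum commit".to_string());
        }
        std::thread::sleep(poll_interval);
    }
}
```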
```
let rack_id = RackUuid::from_untyped_uuid(sled.rack_id);

// If the sled still exists in the latest committed trust quorum
// configuration, it cannot be expunged.
```
What does this do on racks that are still running in LRTQ?
An error will be returned:
```
return Err(Error::invalid_request(format!(
    "Missing trust quorum configurations for rack {rack_id}. \
    Upgrade to trust quorum required."
)));
```
Does that mean once this ships, we can't expunge a sled on an LRTQ rack before upgrading it to real TQ?
Correct. The call to lrtq-upgrade should be the first thing we do after install completes.
In the very unlikely event a sled fails after the install but before we get a chance to lrtq-upgrade, will lrtq-upgrade be blocked by a missing / unresponsive sled?
That's a great question. I will double check, but I'm pretty sure the code and property based tests take this into account. LRTQ upgrade should operate in the same manner as a regular reconfiguration where it allows a certain number of sleds to be absent/fail during both prepare and commit phases.
I just double checked. Inserting LRTQ configs is the same as non-lrtq configs except for the IsLrtqUpgrade field. Eventually, this boils down to a call to TrustQuorumConfig::new after validation. Inside this method we can see that the choice of commit_crash_tolerance is only parameterized on the number of sleds in the rack.
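In other words, the tolerance is a pure function of membership size, so a missing or unresponsive sled does not change which tolerance gets chosen. A sketch of that shape is below; the cutoffs are made up for illustration (the transcripts above show a 3-sled rack ending up with `commit_crash_tolerance: 0`), so see `TrustQuorumConfig::new` for the real values.

```
// Illustrative only: tolerance derived solely from the number of members.
// The thresholds here are invented; the real ones live in TrustQuorumConfig::new.
fn commit_crash_tolerance(num_members: usize) -> usize {
    match num_members {
        0..=3 => 0,
        4..=7 => 1,
        _ => 2,
    }
}
```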
nexus/src/app/trust_quorum.rs
Outdated
```
else {
return Err(Error::internal_error(&format!(
"Cannot retrieve newly inserted trust quorum \
configuration for rack {rack_id}, epoch {new_epoch}."
```
rustfmt sigh
```suggestion
configuration for rack {rack_id}, epoch {new_epoch}."
```
```
// Now send the reconfiguration request to the coordinator. We do
// this directly in the API handler because this is a non-idempotent
// operation and we only want to issue it once.
```
How do we proceed if we've inserted the latest config in the db, but fail before successfully sending the reconfigure operation? (Or if the sled receives the reconfigure then immediately crashes or whatever)
This is exactly the reason the abort operation exists. The problem there is that it's difficult (if not impossible) to always abort correctly in an automated fashion. So the customer has to be like: "I've been waiting 20 minutes, wtf is going on" and then try to abort. This is easier to tell with full trust quorum status in omdb since we differentiate prepare from commit phases. And you can also get some information by looking at inventory.
Once you abort, you can go ahead and try to expunge again. A new coordinator will be chosen randomly from members of the latest committed trust quorum.
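That coordinator selection is conceptually just a random pick from the committed membership, e.g. the sketch below (illustrative, with invented names; the real code works with `BaseboardId`s and its own RNG plumbing):

```
use rand::seq::IteratorRandom;
use std::collections::BTreeSet;

// Pick a new coordinator at random from the members of the latest
// committed trust quorum configuration.
fn pick_coordinator(committed_members: &BTreeSet<String>) -> Option<String> {
    committed_members.iter().choose(&mut rand::thread_rng()).cloned()
}
```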
Damn, looks like the changes to expunge are breaking existing tests. I'll either need to update those tests, update the expunge function, or move the check inside the expunge function into omdb.