Description
The NextExternalIp query should be idempotent to be usable in the context of sagas - if an IP address is allocated with the same UUID, the same underlying object should be returned.
This mostly works already - we even have a test to check for idempotency, by re-invoking an IP provision with the same UUID:
omicron/nexus/src/db/queries/external_ip.rs
Line 975 in 1c55cc9
However, there's an edge case I noticed where idempotency breaks down: If the IP pool has no free addresses, idempotent IP provisioning fails.
The gist of the CTE used for insertion is:
WITH
next_external_ip as (...) -- Create a suitable record to be inserted, by finding an IP address from a free pool
external_ip as (...) -- Perform the INSERTION, with an ON CONFLICT clause that always returns something
updated_pool as (...) -- Update the parent object
SELECT * FROM external_ip
I believe the underlying cause is this part of the "next IP address provision" CTE; in particular, part of calculating the next_external_ip
:
omicron/nexus/src/db/queries/external_ip.rs
Lines 124 to 125 in 1c55cc9
The only IPs we would consider allocating are from cases where:
- We consider an IP from the pool
- For that candidate IP, no matching row exists in instance_external_ip.
This is our way of "finding a free address within a pool" - look for "all possible values", LEFT OUTER JOIN
that with "everything that's currently allocated", and find the choices where the ip IS NULL
-- in other words, where no row from the instance_external_ip
table matches.
This means that, in the past, the ON CONFLICT
clause in the INSERT
clause actually avoided inserting a row with the same UUID, but instead found a different (free) IP address.
If we instead update the WHERE
clause on line 124-125 to:
WHERE
(ip IS NULL) OR (id = <ip_id>)
Then we would look for addresses that are either free or have previously been allocated. This would correctly implement idempotency, even for this edge case.