-
Notifications
You must be signed in to change notification settings - Fork 45
Adds network interface attach/detach #713
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
adb2017
to
ad8e8b3
Compare
An apology is in order for the size of this PR :) I didn't quite do my homework, and intended to update Omicron to support: creating interfaces for an instance, at the time the instance is created; creating an interface, then creating an instance attached to that previously-created interface; and creating an interface, and attaching it to a running instance. That was overly ambitious, and couldn't possibly work, since hot-plug of interfaces isn't supported by either the sled agent or Propolis at this point. So this code updated some of Nexus to support all of those cases, but then reined in the scope to just the first. One can now create zero or more network interfaces for an instance, at the time the instance is created. The other paths will come in follow-up work, probably deferred for a while. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
First pass - looks solid, and dang those are some fancy CTEs
@smklein @davepacheco This is ready for another review. I had previously convinced myself that it should be possible to create a network interface that is not attached to any particular instance. This is similar to how disks work, but it really doesn't make much sense for interfaces. A network interface is better thought of as a binding of an IP address to an instance. We want to add support for reserving particular IPs, not assigned to an instance, but these will be managed as a separate resource and table. The latest commit here removes the optional/nullable |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice! I still need to take a look at subnet_allocation.rs but I wanted to post this first.
"network_interface_subnet_id_ip_key"; | ||
const NAME_CONFLICT_CONSTRAINT: &str = | ||
"network_interface_instance_id_name_key"; | ||
match e { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if any of this belongs in nexus/src/db/error.rs, and whether you can reuse any of the functions there. Mostly there's just some details about the structure of these errors that seems tricky and it'd be nice to not duplicate it. But I don't immediately see how to do that without refactoring those functions further.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I'm happy to move this around.
I am also a bit leery of this, to be honest. These error messages are extremely coupled to the database, and it's very easy to get them wrong or forget to update them as we move indexes around. I've already done that once in the process of this PR: I moved a unique index on (vpc_id, name)
to one on (instance_id, name)
, and forgot to update the constraint name. I really wish there were a better way to do this, but maybe it just comes with the territory of trying to catch lots of different errors from the database in one query.
I think one way to refactor usefully might be to check that the database failed a request because it violated a given constraint. For example, we could define an enum with the error variants we want to catch, along with the name of the constraint (or other error condition) that should be derived from. Then we can have a from_pool
function which is basically this match expression, just looking for those specific strings and translating to the enum variant.
I have a bug-fix PR lined up where I do this whole song-and-dance again. I'll look at it through this lens, to see if a trait or other shared code would work.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I moved a unique index on (vpc_id, name) to one on (instance_id, name), and forgot to update the constraint name. I really wish there were a better way to do this, but maybe it just comes with the territory of trying to catch lots of different errors from the database in one query.
It'd be neat if Diesel knew about constraints the way it knows about tables, columns, etc.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm going to hold off moving this around for this PR, but I'll prototype how to make this easier in the bug-fix PR I'm also working on.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've experimented with this for a bit. It can be made to work, but I'm really not sure it's worth it. Specifically, I don't think it reduces the complexity -- the solution I have in mind is definitely more opaque to me than the original. It does allow us to package the diesel::result::DatabaseErrorKind
, string we want to check for, and the "handler" for building our error type from the database error and the candidate record. But it's pretty complicated. I'm happy to show you offline, or post it here, whatever works. I'll mull it over some more tomorrow, and maybe add it to omicron_nexus::db::error
if it seems useful.
Just a note, there are now some conflicts with |
6e66818
to
b830a43
Compare
b830a43
to
b099b9b
Compare
- Add helpers to check if an address provided by a user can be requested, i.e., is in the subnet and is not a reserved address. - Add a query used to insert a network interface into the database, handling both IP allocation and validation of any provided instance (i.e., that the instance spans exactly one VPC). This subsumes the previous IP allocation query. - Adds parameters for setting up network interfaces (possibly multiple) in the instance create request, with parameters for each. - Adds more robust and clear database error handling for detecting failures when inserting network interfaces. This code detects specific error messages from the database that we intentionally insert, and makes clear the relationship between the errors we expect, and the invariants they are designed to ensure. This includes special handling of a duplicate primary key, which is only expected during sagas and helps ensure their idempotency. - Ensures idempotency of multi-NIC allocation and rollback during sagas, by adding separate ID-allocation nodes and NIC-creation nodes. The rollback for the latter is a no-op, and the former deletes all NICs for the instance to be created. There are also tests to ensure that we completely any NICs created as a result of failure partway through, and that replaying the saga works as expected. - Adds some more tests to the NIC creation Nexus API - Adds convenience method to delete all NICs from an instance.
requested, i.e., is in the subnet and is not a reserved address.
handling both IP allocation and validation of any provided instance
(i.e., that the instance spans exactly one VPC). This subsumes the
previous IP allocation query.
instance_create_network_interface
method, moving theinstance create saga to use the new
vpc_subnet_create_network_interface
method, with an instanceprovided in the construction request.
in the instance create request.
instance creation. These must be new interfaces, support for attaching
to existing interfaces is not yet implemented.