
Instance creation failing during link creation (OPTE ioctl) #1679

Open
@smklein

When following the remote-access preview instructions, I've noticed flakiness creating instances with the terraform workflow.

I've been following the demo instructions on folgers. When I get to the point of creating instances via Terraform, I run: terraform init && terraform apply. Most instances are created successfully, but occasionally one or two fail with a "500 internal error".

Digging into the sled agent logs, I see the following:

[2022-09-07T15:34:00.571064026Z]  INFO: SledAgent/dropshot (SledAgent)/12362 on folgers: accepted connection (local_addr=[fd00:1122:3344:101::1]:12345, remote_addr=[fd00:1122:3344:101::3]:57942)
[2022-09-07T15:34:00.571380893Z]  INFO: SledAgent/InstanceManager/12362 on folgers: instance_ensure e7670ef9-73ec-4846-8058-f84a263e2ef9 -> InstanceRuntimeStateRequested { run_state: Running, migration_params: None }
[2022-09-07T15:34:00.571637528Z]  INFO: SledAgent/InstanceManager/12362 on folgers: new instance
[2022-09-07T15:34:00.572034446Z]  INFO: SledAgent/InstanceManager/12362 on folgers: Instance::new w/initial HW: InstanceHardware { runtime: InstanceRuntimeState { run_state: Creating, sled_id: fb0f7546-4d46-40ca-9d56-cbb810684ca7, propolis_id: e685ef90-d155-4bcd-abf3-12531e6c1ef4, dst_propolis_id: None, propolis_addr: Some([fd00:1122:3344:101::e]:12400), migration_id: None, ncpus: InstanceCpuCount(4), memory: ByteCount(2147483648), hostname: "db-instance-1", ...
[2022-09-07T15:34:00.572289002Z]  INFO: SledAgent/dropshot (SledAgent)/12362 on folgers: request completed (req_id=4b7b08f2-c1b6-4574-810f-f540ee5715c9, uri=/instances/e7670ef9-73ec-4846-8058-f84a263e2ef9, method=PUT, remote_addr=[fd00:1122:3344:101::3]:57942, local_addr=[fd00:1122:3344:101::1]:12345, error_message_external="Internal Server Error", response_code=500)
    error_message_internal: Error managing instances: Instance error: Failure interacting with the OPTE ioctl(2) interface: netadm failed dlmgmtd: link id creation failed: 17
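
One way to read the trailing "17" in that message is as a POSIX errno value. This is an assumption; nothing in the log confirms how dlmgmtd reports its failure codes. Under that assumption, a minimal Python sketch for decoding it could look like the following (the helper name decode_trailing_errno is hypothetical, and the errno table used is that of the machine running the script, which may differ from the sled's):

```python
import errno
import os
import re

# Error text copied from the sled agent log above.
MESSAGE = (
    "Error managing instances: Instance error: Failure interacting with the "
    "OPTE ioctl(2) interface: netadm failed dlmgmtd: link id creation failed: 17"
)


def decode_trailing_errno(message):
    """Return (code, symbolic name, description) for a trailing integer.

    Assumption: the integer at the end of the message is a POSIX errno
    value. Returns None when the message does not end in an integer.
    """
    match = re.search(r":\s*(\d+)\s*$", message)
    if match is None:
        return None
    code = int(match.group(1))
    # errno.errorcode maps numeric codes to names on the local platform.
    name = errno.errorcode.get(code, "UNKNOWN")
    return code, name, os.strerror(code)


if __name__ == "__main__":
    print(decode_trailing_errno(MESSAGE))
```

On common Unix-like systems, 17 corresponds to EEXIST. If that reading holds, the failure would be consistent with link creation colliding with something that already exists, but the log alone does not establish that.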

I'm using Omicron @ 55cc15c

Labels: networking
