Skip to content

[sled-agent] ECONNREFUSED connecting Propolis server that is actually listening #1120

Open

Description

Running an instance create-stop-delete loop to test #1115, I ran into what appears to be another unexpected failure mode. Nexus assigns an address for the Propolis server to listen on, and the sled-agent dutifully starts up the Propolis zone and contained SMF service listening on that. The sled agent ensures that the Propolis SMF instance comes online, and then verifies that the HTTP server is online by just pinging it. That ping failed, with ECONNREFUSED, despite the Propolis server apparently listening on that address.

Here's the Propolis SMF log:

bnaecker@feldspar : ~/omicron $ pfexec zlogin $(zoneadm list | tail -1) 'tail +0 -f $(svcs -L propolis-server)'
[ May 26 10:55:27 Enabled. ]
[ May 26 10:55:27 Executing start method ("ctrun -l child -o noorphan,regent /opt/oxide/propolis-server/bin/propolis-server run /var/svc/manifest/site/propolis-server/config.toml [fd00:1122:3344:101::24]:12400 &"). ]
[ May 26 10:55:27 Method "start" exited with status 0. ]
May 26 17:55:28.937 INFO Starting server...
May 26 17:55:28.937 INFO listening, local_addr: [fd00:1122:3344:101::24]:12400
bnaecker@feldspar : ~/omicron $

So the HTTP server seems to be listening on [fd00:1122:3344:101::24]:12400. However, the sled agent log shows:

{"msg":"Installing Omicron zone: oxz_propolis-server_61f5d4b2-bfff-4e70-a691-8644e52abcdd","v":0,"name":"SledAgent","level":30,"time":"2022-05-26T17:55:21.878590652Z","hostname":"feldspar","pid":930,"instance_id":"d3201f81-d03b-42c3-af78-9632f50d60be","component":"InstanceManager"}
{"msg":"Zone booting","v":0,"name":"SledAgent","level":30,"time":"2022-05-26T17:55:24.214507089Z","hostname":"feldspar","pid":930,"zone":"oxz_propolis-server_61f5d4b2-bfff-4e70-a691-8644e52abcdd","instance_id":"d3201f81-d03b-42c3-af78-9632f50d60be","component":"InstanceManager"}
{"msg":"Adding address: Static(V6(Ipv6Network { addr: fd00:1122:3344:101::24, prefix: 64 }))","v":0,"name":"SledAgent","level":30,"time":"2022-05-26T17:55:26.238671328Z","hostname":"feldspar","pid":930,"zone":"oxz_propolis-server_61f5d4b2-bfff-4e70-a691-8644e52abcdd","instance_id":"d3201f81-d03b-42c3-af78-9632f50d60be","component":"InstanceManager"}
{"msg":"Created address fd00:1122:3344:101::24/64 for zone: oxz_propolis-server_61f5d4b2-bfff-4e70-a691-8644e52abcdd","v":0,"name":"SledAgent","level":30,"time":"2022-05-26T17:55:26.784988479Z","hostname":"feldspar","pid":930,"instance_id":"d3201f81-d03b-42c3-af78-9632f50d60be","component":"InstanceManager"}
{"msg":"Started propolis in zone: oxz_propolis-server_61f5d4b2-bfff-4e70-a691-8644e52abcdd","v":0,"name":"SledAgent","level":30,"time":"2022-05-26T17:55:27.207603652Z","hostname":"feldspar","pid":930,"instance_id":"d3201f81-d03b-42c3-af78-9632f50d60be","component":"InstanceManager"}
{"msg":"GET request to http://[fd00:1122:3344:101::24]:12400/instance","v":0,"name":"SledAgent","level":30,"time":"2022-05-26T17:55:27.954720099Z","hostname":"feldspar","pid":930,"propolis_client address":"[fd00:1122:3344:101::24]:12400","component":"propolis-client","instance_id":"d3201f81-d03b-42c3-af78-9632f50d60be","component":"InstanceManager"}
{"msg":"failed to await http server (Request failed: error sending request for url (http://[fd00:1122:3344:101::24]:12400/instance): error trying to connect: tcp connect error: Connection refused (os error 146)), will retry in 325.090315ms","v":0,"name":"SledAgent","level":40,"time":"2022-05-26T17:55:27.955544069Z","hostname":"feldspar","pid":930,"instance_id":"d3201f81-d03b-42c3-af78-9632f50d60be","component":"InstanceManager","error":"Reqwest(reqwest::Error { kind: Request, url: Url { scheme: \"http\", cannot_be_a_base: false, username: \"\", password: None, host: Some(Ipv6(fd00:1122:3344:101::24)), port: Some(12400), path: \"/instance\", query: None, fragment: None }, source: hyper::Error(Connect, ConnectError(\"tcp connect error\", Os { code: 146, kind: ConnectionRefused, message: \"Connection refused\" })) })"}
{"msg":"GET request to http://[fd00:1122:3344:101::24]:12400/instance","v":0,"name":"SledAgent","level":30,"time":"2022-05-26T17:55:28.281388066Z","hostname":"feldspar","pid":930,"propolis_client address":"[fd00:1122:3344:101::24]:12400","component":"propolis-client","instance_id":"d3201f81-d03b-42c3-af78-9632f50d60be","component":"InstanceManager"}
{"msg":"failed to await http server (Request failed: error sending request for url (http://[fd00:1122:3344:101::24]:12400/instance): error trying to connect: tcp connect error: Connection refused (os error 146)), will retry in 358.344067ms","v":0,"name":"SledAgent","level":40,"time":"2022-05-26T17:55:28.281923763Z","hostname":"feldspar","pid":930,"instance_id":"d3201f81-d03b-42c3-af78-9632f50d60be","component":"InstanceManager","error":"Reqwest(reqwest::Error { kind: Request, url: Url { scheme: \"http\", cannot_be_a_base: false, username: \"\", password: None, host: Some(Ipv6(fd00:1122:3344:101::24)), port: Some(12400), path: \"/instance\", query: None, fragment: None }, source: hyper::Error(Connect, ConnectError(\"tcp connect error\", Os { code: 146, kind: ConnectionRefused, message: \"Connection refused\" })) })"}
{"msg":"GET request to http://[fd00:1122:3344:101::24]:12400/instance","v":0,"name":"SledAgent","level":30,"time":"2022-05-26T17:55:28.641029504Z","hostname":"feldspar","pid":930,"propolis_client address":"[fd00:1122:3344:101::24]:12400","component":"propolis-client","instance_id":"d3201f81-d03b-42c3-af78-9632f50d60be","component":"InstanceManager"}
{"msg":"failed to await http server (Request failed: error sending request for url (http://[fd00:1122:3344:101::24]:12400/instance): error trying to connect: tcp connect error: Connection refused (os error 146)), will retry in 1.074760533s","v":0,"name":"SledAgent","level":40,"time":"2022-05-26T17:55:28.641599392Z","hostname":"feldspar","pid":930,"instance_id":"d3201f81-d03b-42c3-af78-9632f50d60be","component":"InstanceManager","error":"Reqwest(reqwest::Error { kind: Request, url: Url { scheme: \"http\", cannot_be_a_base: false, username: \"\", password: None, host: Some(Ipv6(fd00:1122:3344:101::24)), port: Some(12400), path: \"/instance\", query: None, fragment: None }, source: hyper::Error(Connect, ConnectError(\"tcp connect error\", Os { code: 146, kind: ConnectionRefused, message: \"Connection refused\" })) })"}
...
{"msg":"failed to await http server (Request failed: error sending request for url (http://[fd00:1122:3344:101::24]:12400/instance): error trying to connect: tcp connect error: Connection refused (os error 146)), will retry in 45.772109455s","v":0,"name":"SledAgent","level":40,"time":"2022-05-26T17:55:57.622658162Z","hostname":"feldspar","pid":930,"instance_id":"d3201f81-d03b-42c3-af78-9632f50d60be","component":"InstanceManager","error":"Reqwest(reqwest::Error { kind: Request, url: Url { scheme: \"http\", cannot_be_a_base: false, username: \"\", password: None, host: Some(Ipv6(fd00:1122:3344:101::24)), port: Some(12400), path: \"/instance\", query: None, fragment: None }, source: hyper::Error(Connect, ConnectError(\"tcp connect error\", Os { code: 146, kind: ConnectionRefused, message: \"Connection refused\" })) })"}
{"msg":"halt_and_remove_logged: Previous zone state: Running","v":0,"name":"SledAgent","level":30,"time":"2022-05-26T17:56:23.786718536Z","hostname":"feldspar","pid":930,"zone":"oxz_propolis-server_61f5d4b2-bfff-4e70-a691-8644e52abcdd","instance_id":"d3201f81-d03b-42c3-af78-9632f50d60be","component":"InstanceManager"}
{"msg":"Stopped and uninstalled zone","v":0,"name":"SledAgent","level":30,"time":"2022-05-26T17:56:23.789002326Z","hostname":"feldspar","pid":930,"zone":"oxz_propolis-server_61f5d4b2-bfff-4e70-a691-8644e52abcdd","instance_id":"d3201f81-d03b-42c3-af78-9632f50d60be","component":"InstanceManager"}

That is the sled agent:

  1. Starting the zone
  2. Creating the IP address fd00:1122:3344:101::24 for the Propolis HTTP server
  3. Verifying that the Propolis SMF instance is now online
  4. Making a request to http://[fd00:1122:3344:101::24]:12400/instance in a retry-loop, until it fails.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Metadata

Assignees

No one assigned

    Labels

    Sled AgentRelated to the Per-Sled Configuration and ManagementvirtualizationPropolis Integration & VM Management

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions