WARN [router]: failed to dial to setup node: error="i/o deadline reached" #1968

@0pcom

A recurring issue is that setup-nodes become unreachable:

[2025-04-13T08:59:22.655941308-05:00] WARN [router]: failed to dial to setup node: setupPK(0324579f003e6b4048bae2def4365e634d8e0e3054a20fc7af49daf2a179658557) error="i/o deadline reached"
[2025-04-13T08:59:42.96057214-05:00] WARN [router]: failed to dial to setup node: setupPK(024fbd3997d4260f731b01abcfce60b8967a6d4c6a11d1008812810ea1437ce438) error="i/o deadline reached"
[2025-04-13T09:00:06.245972172-05:00] WARN [router]: failed to dial to setup node: setupPK(03b87c282f6e9f70d97aeea90b07cf09864a235ef718725632d067873431dd1015) error="i/o deadline reached"

The setup-node has to be restarted whenever it becomes unreachable. Sometimes it becomes unreachable quickly.

We need a permanent solution for this issue.

I believe the transport setup-node API of the visor, which is accessible over dmsg, does not have the same issue, but we should confirm this.

We should look at what connections over dmsg are not affected by this issue, and attempt to improve the logic for the affected connections.

The same underlying issue may be to blame for these other tickets:

#1942 error="i/o deadline reached"
#1936 error="i/o deadline reached"
#1803 error="i/o deadline reached"
#1804 error="i/o deadline reached"

  • As a temporary measure, the setup-node needs a way to periodically connect to itself as a health check and confirm that the connection works. If the connection fails after 3 attempts, the setup-node should either shut down (and be restarted by a process control mechanism) or simply restart its connection to dmsg in code at that point.

  • We need to implement the dmsghttp-config for the setup-node so that it does not need to connect to the dmsg-discovery with plain HTTP requests.

Labels: breaking, bug, deployment, dmsg, routing, setup-node, urgent
