HWPSupervisor crashes on startup when using synaccess for driver-iboot-id #691

@BrianJKoopman

On satp3, after maintenance, they're having trouble starting the HWP Supervisor agent. It crashes immediately with:

Args: ['--instance-id', 'hwp-supervisor', '--site-hub', 'ws://127.0.0.1:8005/ws', '--site-http', 'http://127.0.0.1:8005/call']
Installed OCS Plugins: ['socs', 'ocs']
Renaming this process to: "ocs-agent:hwp-supervisor"
2024-06-11T20:37:41+0000 Using OCS version 0.11.0
2024-06-11T20:37:41+0000 ocs: starting <class 'ocs.ocs_agent.OCSAgent'> @ satp3.hwp-supervisor
2024-06-11T20:37:41+0000 log_file is apparently None
2024-06-11T20:37:41+0000 Setting state: ControlState.Idle()
2024-06-11T20:37:41+0000 transport connected
2024-06-11T20:37:41+0000 session joined: {'authextra': {'x_cb_node': '4e2f32e9b379-6',
               'x_cb_peer': 'tcp4:172.19.0.1:53870',
               'x_cb_pid': 13,
               'x_cb_worker': 'worker001'},
 'authid': 'XGSA-KE73-GR6Y-QNXJ-JNFL-WL4V',
 'authmethod': 'anonymous',
 'authprovider': 'static',
 'authrole': 'iocs_agent',
 'realm': 'test_realm',
 'resumable': False,
 'resume_token': None,
 'resumed': False,
 'serializer': 'cbor.batched',
 'session': 3582372078905339,
 'transport': {'channel_framing': 'websocket',
               'channel_id': {},
               'channel_serializer': None,
               'channel_type': 'tcp',
               'http_cbtid': None,
               'http_headers_received': None,
               'http_headers_sent': None,
               'is_secure': False,
               'is_server': False,
               'own': None,
               'own_fd': -1,
               'own_pid': 8,
               'own_tid': 8,
               'peer': 'tcp4:127.0.0.1:8005',
               'peer_cert': None,
               'websocket_extensions_in_use': None,
               'websocket_protocol': None}}
2024-06-11T20:37:41+0000 startup-op: launching monitor
2024-06-11T20:37:41+0000 start called for monitor
2024-06-11T20:37:41+0000 monitor:0 Status is now "starting".
2024-06-11T20:37:41+0000 startup-op: launching spin_control
2024-06-11T20:37:41+0000 start called for spin_control
2024-06-11T20:37:41+0000 spin_control:1 Status is now "starting".
2024-06-11T20:37:41+0000 monitor:0 Status is now "running".
2024-06-11T20:37:41+0000 spin_control:1 Status is now "running".
2024-06-11T20:37:44+0000 monitor:0 CRASH: [Failure instance: Traceback: <class 'KeyError'>: 'outletStatus_4'
/usr/lib/python3.8/threading.py:932:_bootstrap_inner
/usr/lib/python3.8/threading.py:870:run
/usr/local/lib/python3.8/dist-packages/twisted/_threads/_threadworker.py:49:work
/usr/local/lib/python3.8/dist-packages/twisted/_threads/_team.py:192:doWork
--- <exception caught here> ---
/usr/local/lib/python3.8/dist-packages/twisted/python/threadpool.py:269:inContext
/usr/local/lib/python3.8/dist-packages/twisted/python/threadpool.py:285:<lambda>
/usr/local/lib/python3.8/dist-packages/twisted/python/context.py:117:callWithContext
/usr/local/lib/python3.8/dist-packages/twisted/python/context.py:82:callWithContext
/usr/local/lib/python3.8/dist-packages/ocs/ocs_agent.py:984:_running_wrapper
/usr/local/lib/python3.8/dist-packages/socs/agents/hwp_supervisor/agent.py:1107:monitor
/usr/local/lib/python3.8/dist-packages/socs/agents/hwp_supervisor/agent.py:135:update
/usr/local/lib/python3.8/dist-packages/socs/agents/hwp_supervisor/agent.py:136:<dictcomp>
]
2024-06-11T20:37:44+0000 monitor:0 Status is now "done".

I haven't dug into this too deeply, but the key it's looking for, 'outletStatus_4', looks like the naming used when the driver power agent is an IBootBar agent rather than a synaccess agent.

I know we added the ability to select remote PDU type in #653, but maybe we're still hitting some edge case that misses support for this?
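
To illustrate the suspected failure mode, here is a minimal sketch. It is not the actual socs code: the function names, the `outlet_{}_status` key format, and the sample data are all made up for illustration, and only the 'outletStatus_4' key comes from the traceback above. It shows how a dict comprehension that assumes one PDU's key naming raises a KeyError on another PDU's data, and one possible way to make the lookup tolerant of more than one naming.

```python
# Hypothetical sketch of the suspected failure mode; NOT the socs implementation.
# Only 'outletStatus_4' is taken from the traceback. Every other name is assumed.

IBOOT_FMT = "outletStatus_{}"    # key format seen in the traceback
OTHER_FMT = "outlet_{}_status"   # made-up stand-in for another PDU agent's naming


def extract_outlets(data, outlets, key_fmt):
    """Strict lookup: raises KeyError if any expected key is missing."""
    return {o: data[key_fmt.format(o)] for o in outlets}


def extract_outlets_safe(data, outlets, key_fmts):
    """Try each candidate key format per outlet; use None if none match."""
    result = {}
    for o in outlets:
        result[o] = next(
            (data[f.format(o)] for f in key_fmts if f.format(o) in data),
            None,
        )
    return result


# Made-up session data that lacks the iBoot-style key.
session_data = {"outlet_4_status": 1}

try:
    extract_outlets(session_data, [4], IBOOT_FMT)
    missing = None
except KeyError as err:
    missing = err.args[0]  # the key that was not found

safe = extract_outlets_safe(session_data, [4], [IBOOT_FMT, OTHER_FMT])
```

Returning None for an unmatched outlet keeps the monitor loop alive, at the cost of pushing the "unknown state" case onto whatever consumes the result. Whether that trade-off is appropriate here would depend on how the supervisor uses the outlet state.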

Labels: bug