Skip to content

[CRASH] When one or more defined RTPEngines are down when OpenSIPS starts ... and an external script is called. #3335

@NormB

Description

@NormB

The segfault happens when OpenSIPS has one (or more) RTPEngines defined, however, at startup, one or more RTPEngines are down.

The E_RTPENGINE_STATUS event is being triggered with status "inactive". When this happens, an external script is being executed.

The external script issues some commands to determine the impact of the failed RTPEngine. For example, "rtpengine_show" is being executed.

When the external script is not run when the E_RTPENGINE_STATUS "inactive" is triggered, no segfault happens.

After testing multiple times, it appears the segfault only happens after this message is logged ERROR:rtpengine:send_rtpe_command: timeout waiting reply from a RTP proxy. Other calls to the badproxy: label do not appear to be causing the segfault.

OpenSIPS version you are running

version: opensips 3.5.0-dev (x86_64/linux)
flags: STATS: On, DISABLE_NAGLE, USE_MCAST, SHM_MMAP, PKG_MALLOC, Q_MALLOC, F_MALLOC, HP_MALLOC, DBG_MALLOC, CC_O0, FAST_LOCK-ADAPTIVE_WAIT
ADAPTIVE_WAIT_LOOPS=1024, MAX_RECV_BUFFER_SIZE 262144, MAX_LISTEN 16, MAX_URI_SIZE 1024, BUF_SIZE 65535
poll method support: poll, epoll, sigio_rt, select.
git revision: 3a6a8240e
main.c compiled on 19:37:30 Mar  8 2024 with cc 12

Crash Core Dump

Describe the traffic that generated the bug

Added the following log lines to the send_rtpe_command() because core dumps are not being produced.
The TESTING logs are listed in the logs shown below.

badproxy:
        LM_ERR("TESTING 1\n");
        LM_ERR("proxy <%s> does not respond, disable it\n", node->rn_url.s);
        LM_ERR("TESTING 2\n");
        node->rn_disabled = 1;
        LM_ERR("TESTING 3\n");
        node->rn_recheck_ticks = get_ticks() + rtpengine_disable_tout;
        LM_ERR("TESTING 4\n");
        raise_rtpengine_status_event(node);
        LM_ERR("TESTING 5\n");

        return NULL;

To Reproduce

Relevant System Logs

Mar 8 14:38:38 [58] INFO:rtpengine:rtpe_test: rtp proxy udp:100.127.6.20:22222 found, support for it enabled
Mar 8 14:38:38 [58] INFO:rtpengine:rtpe_test: rtp proxy udp:100.127.6.21:22222 found, support for it enabled
Mar 8 14:38:38 [58] ERROR:rtpengine:send_rtpe_command: can't send (#7 iovec buffers) command to a RTP proxy (111:Connection refused)
Mar 8 14:38:38 [58] ERROR:rtpengine:send_rtpe_command: timeout waiting reply from a RTP proxy
Mar 8 14:38:38 [58] ERROR:rtpengine:send_rtpe_command: TESTING 1
Mar 8 14:38:38 [58] ERROR:rtpengine:send_rtpe_command: proxy udp:100.127.6.22:22222 does not respond, disable it
Mar 8 14:38:38 [58] ERROR:rtpengine:send_rtpe_command: TESTING 2
Mar 8 14:38:38 [58] ERROR:rtpengine:send_rtpe_command: TESTING 3
Mar 8 14:38:38 [58] ERROR:rtpengine:send_rtpe_command: TESTING 4
Mar 8 14:38:38 [58] DBG:core:evi_param_set: set str socket='udp:100.127.6.22:22222'
Mar 8 14:38:38 [58] DBG:core:evi_param_set: set str status='inactive'
Mar 8 14:38:38 [58] DBG:core:get_dummy_sip_msg: reusing the static sip msg 0x7fbfa019bcf0
Mar 8 14:38:38 [58] DBG:core:evi_raise_event_msg: found subscriber E_RTPENGINE_STATUS
Mar 8 14:38:38 [58] DBG:core:dup_ref_script_route_in_shm: dupping 0x7fbf5d1daeb8 [E_RTPENGINE_STATUS], idx 2, ver/cnt 1, to new 0x7fbf5d1dd080 [E_RTPENGINE_STATUS], idx 2, ver/cnt 1
Mar 8 14:38:38 [18] DBG:core:ipc_handle_job: received job type 0[RPC] from process 41
Mar 8 14:38:38 [18] DBG:core:get_dummy_sip_msg: reusing the static sip msg 0x7fbfa019bcf0
Mar 8 14:38:38 [18] NOTICE:[E_RTPENGINE_STATUS] udp:100.127.6.22:22222: inactive
Mar 8 14:38:38 [18] DBG:core:comp_scriptvar: str 20: inactive
Mar 8 14:38:38 [58] DBG:core:release_dummy_sip_msg: cleaning the static sip msg 0x7fbfa019bcf0
Mar 8 14:38:38 [58] DBG:core:destroy_avp_list: destroying list (nil)
Mar 8 14:38:38 [58] ERROR:rtpengine:send_rtpe_command: TESTING 5
Mar 8 14:38:38 [58] ERROR:rtpengine:rtpe_test: proxy did not respond to ping
Mar 8 14:38:38 [18] DBG:core:async_script_launch: placing launch job into reactor
Mar 8 14:38:38 [18] DBG:core:io_watch_add: [UDP_worker] io_watch_add op (286 on 199) (0x5592bf1f1f40, 286, 3, 0x7fbf5d1dd340,1), fd_no=5/1048576
Mar 8 14:38:38 [58] DBG:rtpengine:connect_rtpengines: successfully updated rtpengine sets
Mar 8 14:38:38 [58] DBG:core:release_dummy_sip_msg: cleaning the static sip msg 0x7fbfa019bcf0
Mar 8 14:38:38 [58] DBG:core:destroy_avp_list: destroying list (nil)
Mar 8 14:38:38 [62] DBG:core:io_watch_add: [TCP_main] io_watch_add op (261 on 4) (0x5592bf1f1f40, 261, 22, 0x7fbf5d19b3f8,1), fd_no=38/1048576
Mar 8 14:38:38 [18] DBG:core:release_dummy_sip_msg: cleaning the static sip msg 0x7fbfa019bcf0
Mar 8 14:38:38 [18] DBG:core:destroy_avp_list: destroying list 0x7fbf5d1dd120
Mar 8 14:38:38 [62] DBG:core:io_watch_add: [TCP_main] io_watch_add op (263 on 4) (0x5592bf1f1f40, 263, 22, 0x7fbf5d19c658,1), fd_no=39/1048576
Mar 8 14:38:38 [28] ERROR:rtpengine:send_rtpe_command: timeout waiting reply from a RTP proxy
Mar 8 14:38:38 [28] ERROR:rtpengine:send_rtpe_command: TESTING 1
Mar 8 14:38:38 [28] CRITICAL:core:sig_usr: segfault in process pid: 28, id: 15
Mar 8 14:38:38 [28] DBG:core:restore_segv_handler: restoring SIGSEGV handler...
Mar 8 14:38:38 [28] DBG:core:restore_segv_handler: successfully restored system SIGSEGV handler
Mar 8 14:38:38 [1] DBG:core:handle_sigs: OpenSIPS exit status = 11
Mar 8 14:38:38 [1] INFO:core:handle_sigs: child process 28 exited by a signal 11
Mar 8 14:38:38 [1] INFO:core:handle_sigs: core was not generated
Mar 8 14:38:38 [1] INFO:core:handle_sigs: terminating due to SIGCHLD

OS/environment information

  • Operating System:
  • OpenSIPS installation:
  • other relevant information:

Additional context

Metadata

Metadata

Assignees

No one assigned

    Labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions