Skip to content

[CRASH] Frequent crashes in rest_client (openssl error functions) - fix available #3388

@meengu

Description

@meengu

OpenSIPS version you are running

version: opensips 3.4.5 (x86_64/linux)
flags: STATS: On, DISABLE_NAGLE, USE_MCAST, SHM_MMAP, PKG_MALLOC, Q_MALLOC, F_MALLOC, HP_MALLOC, DBG_MALLOC, FAST_LOCK-ADAPTIVE_WAIT
ADAPTIVE_WAIT_LOOPS=1024, MAX_RECV_BUFFER_SIZE 262144, MAX_LISTEN 16, MAX_URI_SIZE 1024, BUF_SIZE 65535
poll method support: poll, epoll, sigio_rt, select.
git revision: ffdb1b473
main.c compiled on  with cc 10

openssl        1.1.1w-0+deb11u1 amd64
libcurl4:amd64 7.74.0-1.3+deb11u11

Crash Core Dump

#0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
#1  0x00007fcf98a58537 in __GI_abort () at abort.c:79
#2  0x00005652d01e1cf7 in fm_free (fm=<optimized out>, p=<optimized out>, file=<optimized out>, func=<optimized out>, line=<optimized out>)
    at mem/f_malloc_dyn.h:240
#3  0x00007fcf93a17f75 in _shm_free (line=935, function=<synthetic pointer>, file=0x7fcf938e2a44 "../crypto/err/err.c", ptr=0x7fcf951c7828)
    at ../../mem/shm_mem.h:549
#4  os_free (ptr=0x7fcf951c7828, file=0x7fcf938e2a44 "../crypto/err/err.c", line=935) at openssl_helpers.h:119
#5  0x00007fcf937fd3b4 in ERR_pop_to_mark () at ../crypto/err/err.c:935
#6  0x00007fcf938a33e5 in pubkey_cb (pval=<optimized out>, pval=<optimized out>, operation=5, it=0x7fcf93967ca0 <X509_PUBKEY_it>, exarg=0x0)
    at ../crypto/x509/x_pubkey.c:48
#7  pubkey_cb (operation=operation@entry=5, pval=pval@entry=0x7fcf954871b0, it=it@entry=0x7fcf93967ca0 <X509_PUBKEY_it>, exarg=exarg@entry=0x0)
    at ../crypto/x509/x_pubkey.c:29
#8  0x00007fcf9373d689 in asn1_item_embed_d2i (pval=pval@entry=0x7fcf954871b0, in=0x7ffd2cd11478, len=<optimized out>, it=<optimized out>,
    tag=<optimized out>, tag@entry=-1, aclass=<optimized out>, aclass@entry=0, opt=0 '\000', ctx=0x7ffd2cd11840, depth=3) at ../crypto/asn1/tasn_dec.c:432
#9  0x00007fcf9373dbb6 in asn1_template_noexp_d2i (val=0x7fcf954871b0, in=0x7ffd2cd115b0, len=<optimized out>,
    tt=tt@entry=0x7fcf93971790 <X509_CINF_seq_tt+240>, opt=<optimized out>, ctx=0x7ffd2cd11840, depth=2) at ../crypto/asn1/tasn_dec.c:643
#10 0x00007fcf9373de21 in asn1_template_ex_d2i (val=val@entry=0x7fcf954871b0, in=in@entry=0x7ffd2cd115b0, inlen=<optimized out>,
    tt=tt@entry=0x7fcf93971790 <X509_CINF_seq_tt+240>, opt=<optimized out>, ctx=ctx@entry=0x7ffd2cd11840, depth=2) at ../crypto/asn1/tasn_dec.c:518
#11 0x00007fcf9373d4c8 in asn1_item_embed_d2i (pval=pval@entry=0x7ffd2cd11640, in=0x7ffd2cd11648, len=<optimized out>, it=0x7fcf93967e60 <X509_CINF_it>,
    tag=<optimized out>, tag@entry=-1, aclass=<optimized out>, aclass@entry=0, opt=0 '\000', ctx=0x7ffd2cd11840, depth=2) at ../crypto/asn1/tasn_dec.c:382
#12 0x00007fcf9373dbb6 in asn1_template_noexp_d2i (val=0x7ffd2cd11640, in=0x7ffd2cd11780, len=<optimized out>, tt=tt@entry=0x7fcf93971620 <X509_seq_tt>,
    opt=<optimized out>, ctx=0x7ffd2cd11840, depth=1) at ../crypto/asn1/tasn_dec.c:643
#13 0x00007fcf9373de21 in asn1_template_ex_d2i (val=val@entry=0x7fcf95487160, in=in@entry=0x7ffd2cd11780, inlen=<optimized out>,
    tt=tt@entry=0x7fcf93971620 <X509_seq_tt>, opt=<optimized out>, ctx=ctx@entry=0x7ffd2cd11840, depth=1) at ../crypto/asn1/tasn_dec.c:518
#14 0x00007fcf9373d4c8 in asn1_item_embed_d2i (pval=pval@entry=0x7fcf95b8c868, in=0x7ffd2cd11900, len=<optimized out>, it=it@entry=0x7fcf93967de0 <X509_it>,
    tag=<optimized out>, tag@entry=-1, aclass=<optimized out>, aclass@entry=0, opt=0 '\000', ctx=0x7ffd2cd11840, depth=1) at ../crypto/asn1/tasn_dec.c:382
#15 0x00007fcf9373df4e in ASN1_item_ex_d2i (pval=pval@entry=0x7fcf95b8c868, in=<optimized out>, len=<optimized out>, it=0x7fcf93967de0 <X509_it>,
    tag=tag@entry=-1, aclass=aclass@entry=0, opt=0 '\000', ctx=0x7ffd2cd11840) at ../crypto/asn1/tasn_dec.c:124
#16 0x00007fcf9373dfcb in ASN1_item_d2i (pval=0x7fcf95b8c868, in=<optimized out>, len=<optimized out>, it=<optimized out>) at ../crypto/asn1/tasn_dec.c:114
#17 0x00007fcf93831bac in PEM_X509_INFO_read_bio (bp=bp@entry=0x7fcf952d8540, sk=sk@entry=0x0, cb=cb@entry=0x0, u=u@entry=0x7fcf938f2a79)
    at ../crypto/pem/pem_info.c:195
#18 0x00007fcf93893093 in X509_load_cert_crl_file (type=1, file=<optimized out>, ctx=0x7fcf94f35ea0) at ../crypto/x509/by_file.c:202
#19 X509_load_cert_crl_file (ctx=0x7fcf94f35ea0, file=<optimized out>, type=<optimized out>) at ../crypto/x509/by_file.c:188
#20 0x00007fcf938931e2 in by_file_ctrl (ctx=0x7fcf94f35ea0, cmd=<optimized out>, argp=<optimized out>, argl=<optimized out>, ret=<optimized out>)
    at ../crypto/x509/by_file.c:64
#21 0x00007fcf93895faf in X509_STORE_load_locations (ctx=0x7fcf954b03f8, file=file@entry=0x5652d1112d90 "/etc/ssl/certs/ca-certificates.crt",
    path=path@entry=0x5652d10e4bb0 "/etc/ssl/certs") at ../crypto/x509/x509_d2.c:44
#22 0x00007fcf939ba3e9 in SSL_CTX_load_verify_locations (ctx=<optimized out>, CAfile=CAfile@entry=0x5652d1112d90 "/etc/ssl/certs/ca-certificates.crt",
    CApath=CApath@entry=0x5652d10e4bb0 "/etc/ssl/certs") at ../ssl/ssl_lib.c:4197
#23 0x00007fcf92b4fa03 in ossl_connect_step1 (conn=conn@entry=0x5652d11255d0, sockindex=sockindex@entry=0) at vtls/openssl.c:3065
#24 0x00007fcf92b50f4f in ossl_connect_common (conn=0x5652d11255d0, sockindex=0, nonblocking=true, done=0x7ffd2cd11d68) at vtls/openssl.c:4041
#25 0x00007fcf92b52011 in Curl_ssl_connect_nonblocking (conn=0x5652d11255d0, sockindex=0, done=0x7ffd2cd11d68) at vtls/vtls.c:362
#26 0x00007fcf92b11815 in https_connecting (done=<optimized out>, conn=<optimized out>) at http.c:1506
#27 Curl_http_connect (done=<optimized out>, conn=<optimized out>) at http.c:1433
#28 Curl_http_connect (conn=0x5652d11255d0, done=0x7ffd24 0x00007fcf92b50f4f in ossl_connect_common (conn=0x5652d11255d0, sockindex=0, nonblocking=true, done=0
#25 0x00007fcf92b52011 in Curl_ssl_connect_nonblocking (conn=0x5652d11255d0, sockindex=0, done=0x7ffd2cd11d68) at vtls/vtls.c:362
#26 0x00007fcf92b11815 in https_connecting (done=<optimized out>, conn=<optimized out>) at http.c:1506
#27 Curl_http_connect (done=<optimized out>, conn=<optimized out>) at http.c:1433
#28 Curl_http_connect (conn=0x5652d11255d0, done=0x7ffd0b960) at easy.c:683
#34 curl_easy_perform (data=data@entry=0x5652d110b960) at easy.c:702
#35 0x00007fcf92b8f86c in rest_easy_perform (handle=0x5652d110b960, url=url@entry=0x7fcf962e4230 "https://pnproxy.bc4.cz/v1/push-message",
    out_http_rc=out_http_rc@entry=0x7ffd2cd12088) at rest_methods.c:649
#36 0x00007fcf92b92e78 in rest_sync_transfer (method=method@entry=REST_CLIENT_POST, msg=msg@entry=0x7fcf962e27b0,
    url=url@entry=0x7fcf962e4230 "https://pnproxy.bc4.cz/v1/push-message", body=body@entry=0x7ffd2cd12fd0, ctype=ctype@entry=0x7ffd2cd12e00,
    body_pv=body_pv@entry=0x7fcf9626c1d0, ctype_pv=0x7fcf9626c3d0, code_pv=0x7fcf9626c5d0) at rest_methods.c:740
#37 0x00007fcf92b8c2f9 in w_rest_post (msg=0x7fcf962e27b0, url=<optimized out>, body=0x7ffd2cd12fd0, _ctype=0x7ffd2cd12fe8, body_pv=0x7fcf9626c1d0,
    ctype_pv=0x7fcf9626c3d0, code_pv=0x7fcf9626c5d0) at rest_client.c:559
#38 0x00005652d01e8525 in do_action (a=0x7fcf9626c718, msg=0x7fcf962e27b0) at action.c:1048
#39 0x00005652d01ea100 in run_action_list (a=a@entry=0x7fcf962676f0, msg=msg@entry=0x7fcf962e27b0) at action.c:190
#40 0x00005652d01ea613 in run_actions (msg=0x7fcf962e27b0, a=0x7fcf962676f0) at action.c:136
#41 run_top_route (sr=..., msg=msg@entry=0x7fcf962e27b0) at action.c:250
#42 0x00007fcf92be5db3 in route_run (route=..., msg=0x7fcf962e27b0, params=<optimized out>, event=<optimized out>) at event_route.c:346
#43 0x00007fcf92be62a3 in route_received (sender=2, param=0x7fcf956ce8e0) at route_send.c:144
#44 0x00005652d02178e9 in ipc_handle_job (fd=<optimized out>) at ipc.c:304
#45 0x00005652d03705a8 in handle_io (fm=0x7fcf962a2c38, idx=1, event_type=1) at net/net_tcp_proc.c:217
#46 0x00005652d0371d11 in io_wait_loop_epoll (h=<optimized out>, t=<optimized out>, repeat=<optimized out>) at net/../io_wait_loop.h:305
#47 tcp_worker_proc_loop () at net/net_tcp_proc.c:442
#48 0x00005652d036a406 in tcp_start_processes (chd_rank=chd_rank@entry=0x5652d04d9a58 <chd_rank>, startup_done=startup_done@entry=0x0) at net/net_tcp.c:2119
#49 0x00005652d01e4986 in main_loop () at main.c:243
#50 main (argc=<optimized out>, argv=<optimized out>) at main.c:966

What is interesting is that the crash is in err_clear():

925 int ERR_pop_to_mark(void)
926 {
927     ERR_STATE *es;
928
929     es = ERR_get_state();
930     if (es == NULL)
931         return 0;
932
933     while (es->bottom != es->top
934            && (es->err_flags[es->top] & ERR_FLAG_MARK) == 0) {
935         err_clear(es, es->top);
936         es->top = es->top > 0 ? es->top - 1 : ERR_NUM_ERRORS - 1;
937     }
938
939     if (es->bottom == es->top)
940         return 0;
941     es->err_flags[es->top] &= ~ERR_FLAG_MARK;
942     return 1;
943 }

But the code execution should not get there because printing the content of *es returns that top==bottom:

(gdb) frame 5
#5  0x00007fcf937fd3b4 in ERR_pop_to_mark () at ../crypto/err/err.c:935

(gdb) set print pretty on
(gdb) p *es
$3 = {
  err_flags = {0 <repeats 16 times>},
  err_buffer = {0 <repeats 16 times>},
  err_data = {0x0 <repeats 16 times>},
  err_data_flags = {0 <repeats 16 times>},
  err_file = {0x0 <repeats 16 times>},
  err_line = {-1 <repeats 16 times>},
  top = 0,
  bottom = 0
}

Which leads me to an idea that this is a race condition.

Looks like that opensips first initializes modules (including tls_openssl/OpenSSL in general) and then forks TCP worker processes. Initialization of modules will make openssl create ERR_STATE struct in shared memory, and this will get then shared by all workers, sometimes leading to race conditions due to absent locking when accessing the now shared struct. I don't think OPENSSL expects this data to be shared among threads/processes. Manpage of ERR_pop_to_mark indicates that it's normally a per-thread data structure.

I found a similar issue here: kamailio/kamailio#3695

Due to this being a race condition it only happens in specific usage scenarios (either high load, or in my case due to issuing several HTTP requests via rest_client in parallel from multiple workers - push notification being sent to multiple mobile devices at once)

Metadata

Metadata

Assignees

No one assigned

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions