Description
OpenSIPS version you are running
flags: STATS: On, EXTRA_DEBUG, DISABLE_NAGLE, USE_MCAST, SHM_MMAP, PKG_MALLOC, QM_MALLOC, DBG_MALLOC, FAST_LOCK-ADAPTIVE_WAIT
ADAPTIVE_WAIT_LOOPS=1024, MAX_RECV_BUFFER_SIZE 262144, MAX_LISTEN 16, MAX_URI_SIZE 1024, BUF_SIZE 65535
poll method support: poll, epoll, sigio_rt, select.
git revision: d509504
main.c compiled on with gcc 4.8
Describe the bug
I have a bunch of proxies which exchange data via the clusterer module.
At some point, a random OpenSIPS instance stops reading data from a random TCP connection, and SIP clients start to receive 408 timeouts. After the connection lifetime ends, a new connection to the same destination works fine.
To Reproduce
It happens randomly, and I have failed to identify a pattern.
Expected behavior
Relevant System Logs
Jun 4 07:34:32 fr /usr/sbin/opensips[22077]: CRITICAL:core:io_watch_add_dbg: #12>>> [TCP_main] BUG trying to overwrite entry 720 in the hash(720, 19, 0x7f96c9218e90,1) with (720, 19, 0x7f96cb09a910,1)#12#012It seems you have hit a programming bug.#012Please help us make OpenSIPS better by reporting it at https://github.com/OpenSIPS/opensips/issues#012
Jun 4 10:35:48 fr /usr/sbin/opensips[22077]: ERROR:core:io_watch_add_dbg: [TCP_main] epoll_ctl MOD failed: No such file or directory [2]
OS/environment information
- Operating System: Ubuntu 14.04.6 LTS
- OpenSIPS installation: manually built latest 2.4 branch
- other relevant information:
Additional context
From what I could understand so far, some proto_bin connections used by the clusterer module behave strangely and somehow overwrite a slot in the fd_map of the I/O loop in TCP main. When a valid connection then wants to add or change IO_WATCH_WRITE/IO_WATCH_READ, it fails because the fd_map slot corresponding to the connection's fd is already in use.
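
To illustrate the suspected failure mode, here is a minimal sketch of an fd-indexed watch table that refuses to overwrite a live slot. It is not the actual OpenSIPS code: `fd_slot`, `fd_table`, `watch_add`, `watch_del`, `MAP_SIZE` and the `WATCH_*` flags are hypothetical names, and the tuple layout (fd, type, data, flags) is only assumed from the CRITICAL log line above.

```c
#include <assert.h>
#include <stdio.h>

#define MAP_SIZE 1024                      /* hypothetical table size */
enum { WATCH_READ = 1, WATCH_WRITE = 2 };  /* hypothetical event flags */

/* Hypothetical, simplified slot of an fd-indexed watch table. */
struct fd_slot {
    int   fd;     /* descriptor stored in this slot */
    int   type;   /* 0 means the slot is free */
    void *data;   /* per-connection context */
    int   flags;  /* events currently watched */
};

static struct fd_slot fd_table[MAP_SIZE];

/* Register fd, or add events for an fd already owned by the same context.
 * Returns 0 on success, -1 if the arguments are invalid or the slot is
 * still owned by a different context. */
int watch_add(int fd, int type, void *data, int flags)
{
    struct fd_slot *s;

    if (fd < 0 || fd >= MAP_SIZE || type == 0)
        return -1;
    s = &fd_table[fd];
    if (s->type != 0) {
        assert(s->fd == fd);  /* a live slot always sits at index fd */
        if (s->data != data) {
            /* Stale entry: the old owner closed fd without removing it. */
            fprintf(stderr,
                    "BUG: trying to overwrite entry %d (%d, %d, %p, %d)"
                    " with (%d, %d, %p, %d)\n",
                    fd, s->fd, s->type, s->data, s->flags,
                    fd, type, data, flags);
            return -1;
        }
        s->flags |= flags;    /* same owner: just extend the watched events */
        return 0;
    }
    s->fd = fd;
    s->type = type;
    s->data = data;
    s->flags = flags;
    return 0;
}

/* Free the slot for fd. Returns 0 on success, -1 if it was not in use. */
int watch_del(int fd)
{
    if (fd < 0 || fd >= MAP_SIZE || fd_table[fd].type == 0)
        return -1;
    fd_table[fd].fd = 0;
    fd_table[fd].type = 0;
    fd_table[fd].data = NULL;
    fd_table[fd].flags = 0;
    return 0;
}
```

In this sketch, if a connection's fd is closed without a matching `watch_del` and the kernel later hands the same fd number to a new connection, the new connection's `watch_add` is rejected until the stale slot is removed. That would be consistent with the log above, where entry 720 is reused with the same type but a different data pointer.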