[CRASH] core dump in next_branches --> search_next_avp in 2.4.8 (nightly build) #2336

@rrevels-bw

Description

OpenSIPS version you are running

version: opensips 2.4.8 (x86_64/linux)
flags: STATS: On, DISABLE_NAGLE, USE_MCAST, SHM_MMAP, PKG_MALLOC, F_MALLOC, FAST_LOCK-ADAPTIVE_WAIT
ADAPTIVE_WAIT_LOOPS=1024, MAX_RECV_BUFFER_SIZE 262144, MAX_LISTEN 16, MAX_URI_SIZE 1024, BUF_SIZE 65535
poll method support: poll, epoll, sigio_rt, select.
main.c compiled on 02:33:06 Oct 15 2020 with gcc 4.8.5

Crash Core Dump

Our current configuration doesn't give us the symbol files, so I will update this issue as we make progress on fixing that. Here is the call stack that I do have access to:
(gdb) bt full
#0 0x00000000004f3ef1 in search_next_avp ()
No symbol table info available.
#1 0x00000000004973ef in next_branches ()
No symbol table info available.
#2 0x00000000004a253e in do_action ()
No symbol table info available.
#3 0x00000000004aacf1 in run_action_list ()
[...]

Describe the traffic that generated the bug

This appears to be memory corruption that surfaces when attempting to fail over serially to the next branch, which was obtained from an earlier 300 response. We can go weeks between crashes, but sooner or later we see the problem on every proxy in the network.

To Reproduce

Start traffic that receives a 300 redirect, then have the first attempt fail. At some point the pointer to the current contact gets corrupted and causes a general protection fault when it is accessed.
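
To illustrate the failure mode I suspect, here is a minimal C sketch of a "find the next entry with the same id" walk over a singly linked list. This is not the OpenSIPS source: `struct node` and `find_next_same_id` are hypothetical names made up for the example, and I am only assuming that the real lookup walks a list in a broadly similar way. The point is that every step dereferences the node pointer, so a stale or overwritten `next` pointer would fault inside the search function itself, which would be consistent with frame #0 of the backtrace.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical list node, for illustration only (not the OpenSIPS type). */
struct node {
    int id;              /* identifies which logical list the value belongs to */
    const char *value;   /* e.g. a contact URI */
    struct node *next;   /* next node in the shared list, or NULL */
};

/* Return the first node after `cur` that has the same id, or NULL if none. */
struct node *find_next_same_id(const struct node *cur)
{
    if (cur == NULL)
        return NULL;

    for (struct node *n = cur->next; n != NULL; n = n->next) {
        /* n->id reads through the pointer: if `n` was corrupted or freed,
         * this is where the process would fault. */
        if (n->id == cur->id)
            return n;
    }
    return NULL;
}
```

Nothing in a loop like this can tell a valid pointer from a corrupted one, so the corruption itself would have to happen earlier, somewhere between storing the contacts and the failover attempt.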

Relevant System Logs

We see no warning whatsoever leading up to the crash.

OS/environment information

  • Operating System: CentOS 7
  • OpenSIPS installation: opensips compiled from source (and rpm build) from nightly build opensips-2.4.8-09142020.tar.gz
  • other relevant information:

Additional context
