[CRASH]OpenSIPS Crashes when using "reload_routes" #2382

Open
vtzan opened this issue Jan 22, 2021 · 16 comments

Comments

@vtzan

vtzan commented Jan 22, 2021

OpenSIPS version you are running

version: opensips 3.1.1 (x86_64/linux)
flags: STATS: On, DISABLE_NAGLE, USE_MCAST, SHM_MMAP, PKG_MALLOC, Q_MALLOC, F_MALLOC, HP_MALLOC, DBG_MALLOC, FAST_LOCK-ADAPTIVE_WAIT
ADAPTIVE_WAIT_LOOPS=1024, MAX_RECV_BUFFER_SIZE 262144, MAX_LISTEN 16, MAX_URI_SIZE 1024, BUF_SIZE 65535
poll method support: poll, epoll, sigio_rt, select.
git revision: b7cf71712
main.c compiled on 16:22:04 Jan 22 2021 with gcc 8

Crash Core Dump
https://pastebin.com/G2q31Hki

Describe the traffic that generated the bug

curl --location --request POST 'http://opensips_ip_address:8080/json' \
--header 'Content-Type: application/json' \
--data-raw '{
    "jsonrpc": "2.0",
    "method": "reload_routes",
    "id": 11
}'

To Reproduce

  1. start OpenSIPS
  2. send reload_routes twice using the curl command above
  3. wait ~20 seconds
  4. CRASH
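
For scripted reproduction, the same request can also be sent from Python. This is only a minimal sketch: the host is the `opensips_ip_address` placeholder from the curl command, and the helper names `build_reload_request` and `reload_routes_twice` are made up for illustration.

```python
import json
import urllib.request

# Placeholder host copied from the curl command above; replace with a real address.
MI_URL = "http://opensips_ip_address:8080/json"


def build_reload_request(url: str = MI_URL, request_id: int = 11) -> urllib.request.Request:
    """Build the same JSON-RPC POST that the curl command sends."""
    payload = {"jsonrpc": "2.0", "method": "reload_routes", "id": request_id}
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def reload_routes_twice(url: str = MI_URL) -> None:
    """Send reload_routes two times, as in step 2 of the reproduction."""
    for _ in range(2):
        with urllib.request.urlopen(build_reload_request(url), timeout=5) as resp:
            print(resp.status, resp.read().decode("utf-8"))
```

Calling `reload_routes_twice()` against a running instance and then waiting as in step 3 should follow the same path as the manual steps.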

Relevant System Logs

Jan 22 16:35:37 [842] ALERT:core:do_action: BUG - unknown type 1601399924
Jan 22 16:35:37 [842] ERROR:core:do_action: error in /efs/opensips-edge/opensips.cfg:5066
Jan 22 16:35:37 [842] ALERT:core:do_action: BUG - unknown type 1953460848
Jan 22 16:35:37 [842] ERROR:core:do_action: error in /efs/opensips-edge/opensips.cfg:5067
Jan 22 16:35:37 [842] ALERT:core:do_action: BUG - unknown type 1601399924
Jan 22 16:35:37 [842] CRITICAL:core:sig_usr: segfault in process pid: 842, id: 8
Jan 22 16:35:38 [818] NOTICE:presence:destroy: destroy module ...

OS/environment information

  • Operating System: Debian 10.7
  • OpenSIPS installation: 3.1

Thank you in advance for your support.

Vasilios Tzanoudakis

@bogdan-iancu bogdan-iancu self-assigned this Jan 25, 2021
@bogdan-iancu
Member

Hey @vtzan, thanks for the report. Do you make any changes to the file before the reloads? I see the backtrace points to some script execution via a timer route. Could you run some tests to see whether the crash is indeed triggered by the presence of such routes? Or do you have a minimal cfg that triggers the crash as reported here?

@bogdan-iancu bogdan-iancu added this to the 3.1.2 milestone Jan 25, 2021
@vtzan
Author

vtzan commented Jan 25, 2021

Dear @bogdan-iancu,

You are right! (once again)

After disabling the timer_route it doesn't crash anymore.

The timer route that triggers this crash is as simple as this :

timer_route [system_stats, 30] {
	xlog("L_NOTICE","Opensips Rules!\n");
}

Thank you in advance for your support.

Vasilios Tzanoudakis

@luminblade

Just stumbled across this as well, timer_route crashing in 3.1.1.

@bogdan-iancu
Member

@vtzan , I managed to reproduce the errors, let me dig in to find the problem.

@stale

stale bot commented Jul 21, 2021

Any updates here? No progress has been made in the last 15 days, marking as stale. Will close this issue if no further updates are made in the next 30 days.

@stale stale bot added the stale label Jul 21, 2021
@vtzan
Author

vtzan commented Jul 27, 2021

Just FYI: I have switched to 3.2 and the problem is still there.

Thank you in advance for your support.

Vasilios Tzanoudakis

@stale stale bot removed the stale label Jul 27, 2021
@github-actions

Any updates here? No progress has been made in the last 15 days, marking as stale. Will close this issue if no further updates are made in the next 30 days.

@github-actions github-actions bot added the stale label Aug 24, 2021
@bogdan-iancu
Member

Sorry for the delay on this, let me resume digging into it.

@stale stale bot removed the stale label Oct 7, 2021
@bogdan-iancu
Member

@vtzan , issue found: timer routes are triggered as timer jobs, and the pointer to the actions (the body of the timer route) is attached to the timer job. So when a route reload happens, the timer jobs (with their pointers to actions) are not updated and keep pointing to actions (routes) that no longer exist.
As a solution, upon reload, we need to iterate over the registered timer jobs, find the ones of "timer route" type, and refresh the pointer to the actions (performing the translation from route name to route id/pointer again).
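
To make the failure mode concrete, here is a toy Python model of the idea described above. It is not OpenSIPS code: `Script`, `TimerJob`, `register_timer_route` and `refresh_timer_jobs` are hypothetical names, and route bodies are plain callables. In Python the old body stays alive as long as the job references it, so the stale job merely runs outdated actions; in C the equivalent pointer would refer to freed memory, which matches the "unknown type" errors and the segfault in the logs.

```python
# Toy model only: every name here is hypothetical, none of it is OpenSIPS code.
from dataclasses import dataclass
from typing import Callable, Dict, List

Actions = Callable[[], str]


@dataclass
class TimerJob:
    route_name: str   # name the job was registered under
    actions: Actions  # direct reference to the route body, cached at registration


class Script:
    def __init__(self) -> None:
        self.routes: Dict[str, Actions] = {}
        self.timer_jobs: List[TimerJob] = []

    def load(self, routes: Dict[str, Actions]) -> None:
        """Install a freshly parsed route table, replacing the old one."""
        self.routes = dict(routes)

    def register_timer_route(self, name: str) -> TimerJob:
        """Create a timer job that caches a direct reference to the route body."""
        job = TimerJob(route_name=name, actions=self.routes[name])
        self.timer_jobs.append(job)
        return job

    def refresh_timer_jobs(self) -> None:
        """After a reload, re-resolve each job's route name to the new body."""
        for job in self.timer_jobs:
            job.actions = self.routes[job.route_name]


script = Script()
script.load({"system_stats": lambda: "old body"})
job = script.register_timer_route("system_stats")

script.load({"system_stats": lambda: "new body"})  # simulated reload_routes
stale_result = job.actions()   # job still runs the pre-reload body
script.refresh_timer_jobs()
fresh_result = job.actions()   # job now runs the reloaded body
```

A real fix would also have to decide what happens when a timer route is removed or renamed by the reload; this sketch would simply raise `KeyError` in that case.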

@stale

stale bot commented Jan 9, 2022

Any updates here? No progress has been made in the last 15 days, marking as stale. Will close this issue if no further updates are made in the next 30 days.

@stale stale bot added the stale label Jan 9, 2022
@abieuzent

Any update on this point?

@stale stale bot removed the stale label Jan 28, 2022
@stale

stale bot commented Apr 18, 2022

Any updates here? No progress has been made in the last 15 days, marking as stale. Will close this issue if no further updates are made in the next 30 days.

@stale stale bot added the stale label Apr 18, 2022
@gmaruzz

gmaruzz commented Apr 24, 2022

.

@stale stale bot removed the stale label Apr 24, 2022
@github-actions

Any updates here? No progress has been made in the last 15 days, marking as stale. Will close this issue if no further updates are made in the next 30 days.

@github-actions github-actions bot added the stale label Sep 14, 2022
@bogdan-iancu
Member

The underlying problem here is more systemic: it is not limited to the timer routes, but also affects other parts of the code that need to keep references to routes, which may now change.
Let me give this a higher priority, to find a generic solution here.

@bogdan-iancu
Member

This should be fixed with fce0eae on head / 3.4. Let's give it a try and, if it holds, we can consider a backport.
