Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ospfd: Do not turn on write thread unless we have something in it #4907

Merged
merged 3 commits into from
Sep 3, 2019

Conversation

donaldsharp
Copy link
Member

I am rarely seeing this crash:

r2: ospfd crashed. Core file found - Backtrace follows:
[New LWP 32748]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/aarch64-linux-gnu/libthread_db.so.1".
Core was generated by `/usr/lib/frr/ospfd'.
Program terminated with signal SIGABRT, Aborted.
2019-08-29 15:59:36,149 ERROR: assert failed at "test_ospf_sr_topo1/test_memory_leak":

Which translates to this code:

node = listhead(ospf->oi_write_q);
assert(node);
oi = listgetdata(node);
assert(oi);

So if we get into ospf_write without anything on the oi_write_q
we are stopping the program.

The only place that I see where we could possibly have this condition
was in OSPF_ISM_WRITE_ON(O). Modify the macro to not just blindly
turn on ospf_write since we do not always add to the oi_write_q.

Signed-off-by: Donald Sharp sharpd@cumulusnetworks.com

@LabN-CI
Copy link
Collaborator

LabN-CI commented Aug 30, 2019

💚 Basic BGPD CI results: SUCCESS, 0 tests failed

Results table
_ _
Result SUCCESS git merge/4907 275e0ed
Date 08/30/2019
Start 06:10:18
Finish 06:31:59
Run-Time 21:41
Total 1815
Pass 1815
Fail 0
Valgrind-Errors 0
Valgrind-Loss 0
Details vncregress-2019-08-30-06:10:18.txt
Log autoscript-2019-08-30-06:11:11.log.bz2
Memory 397 429 360

For details, please contact louberger

@NetDEF-CI
Copy link
Collaborator

Continuous Integration Result: SUCCESSFUL

Congratulations, this patch passed basic tests

Tested-by: NetDEF / OpenSourceRouting.org CI System

CI System Testrun URL: https://ci1.netdef.org/browse/FRR-FRRPULLREQ-8745/

This is a comment from an automated CI system.
For questions and feedback in regards to this CI system, please feel free to email
Martin Winter - mwinter (at) opensourcerouting.org.


CLANG Static Analyzer Summary

  • Github Pull Request 4907, comparing to Git base SHA f4574b4

No Changes in Static Analysis warnings compared to base

1 Static Analyzer issues remaining.

See details at
https://ci1.netdef.org/browse/FRR-FRRPULLREQ-8745/artifact/shared/static_analysis/index.html

I am rarely seeing this crash:

r2: ospfd crashed. Core file found - Backtrace follows:
[New LWP 32748]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/aarch64-linux-gnu/libthread_db.so.1".
Core was generated by `/usr/lib/frr/ospfd'.
Program terminated with signal SIGABRT, Aborted.
2019-08-29 15:59:36,149 ERROR: assert failed at "test_ospf_sr_topo1/test_memory_leak":

Which translates to this code:

	node = listhead(ospf->oi_write_q);
	assert(node);
	oi = listgetdata(node);
	assert(oi);

So if we get into ospf_write without anything on the oi_write_q
we are stopping the program.

This is happening because in ospf_ls_upd_queue_send we are calling
ospf_write.  Imagine that we have a interface already on the on_write_q
and then ospf_write handles the packet send for all functions.  We
are not clearing the t_write thread and we are popping and causing
a crash.

Additionally modify OSPF_ISM_WRITE_ON(O) to not just blindly
turn on the t_write thread.  Only do so if we have data.

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>

ospfd: Remove redundant asserts

assert(oi) is impossible all listgetdata(node) directly proceeding
it already asserts here, besides a node cannot be created
with a null pointer!

If list_isempty is called directly before the listhead call
it is impossilbe that we do not have a valid pointer here.

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
This looks like a finish up of the partial cleanup that
ocurred at some point in time in the past.  When we
alloc oi also always alloc the oi->obuf.  When we delete
oi always delete the oi->obuf right before.

This cleans up a bunch of code to be simpler and hopefully
easier to follow.

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
@LabN-CI
Copy link
Collaborator

LabN-CI commented Aug 30, 2019

💚 Basic BGPD CI results: SUCCESS, 0 tests failed

Results table
_ _
Result SUCCESS git merge/4907 36a106e
Date 08/30/2019
Start 16:55:24
Finish 17:17:06
Run-Time 21:42
Total 1815
Pass 1815
Fail 0
Valgrind-Errors 0
Valgrind-Loss 0
Details vncregress-2019-08-30-16:55:24.txt
Log autoscript-2019-08-30-16:56:17.log.bz2
Memory 426 435 360

For details, please contact louberger

@NetDEF-CI
Copy link
Collaborator

Continuous Integration Result: SUCCESSFUL

Congratulations, this patch passed basic tests

Tested-by: NetDEF / OpenSourceRouting.org CI System

CI System Testrun URL: https://ci1.netdef.org/browse/FRR-FRRPULLREQ-8750/

This is a comment from an automated CI system.
For questions and feedback in regards to this CI system, please feel free to email
Martin Winter - mwinter (at) opensourcerouting.org.


CLANG Static Analyzer Summary

  • Github Pull Request 4907, comparing to Git base SHA da43609

No Changes in Static Analysis warnings compared to base

1 Static Analyzer issues remaining.

See details at
https://ci1.netdef.org/browse/FRR-FRRPULLREQ-8750/artifact/shared/static_analysis/index.html

Copy link
Member

@riw777 riw777 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

these look fine to me

@riw777 riw777 merged commit a3aea4e into FRRouting:master Sep 3, 2019
@donaldsharp donaldsharp deleted the ospf_write_q branch December 9, 2019 16:42
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants