crashes in pthread_mutex_unlock #1199

Open
jan-cerny opened this issue Oct 1, 2018 · 5 comments

@jan-cerny
Member

Description of Problem:

oscap intermittently crashes with an abort raised inside pthread_mutex_unlock.
It happens in several of the XCCDF "unit tests" that run the command oscap xccdf eval --remediate.

OpenSCAP Version:

current master, as of 1st October 2018

Operating System & Version:

Fedora 28 Server

Steps to Reproduce:

  1. cmake -DCMAKE_BUILD_TYPE=Debug .. && make
  2. ctest --verbose -j4 -R API/XCCDF/unittests/all.sh
  3. repeat step 2 until oscap crashes
  4. coredumpctl

Actual Results:

crash with a coredump

           PID: 30187 (oscap)
           UID: 0 (root)
           GID: 0 (root)
        Signal: 6 (ABRT)
     Timestamp: Mon 2018-10-01 12:47:57 CEST (10min ago)
  Command Line: /home/user/openscap/build/utils/oscap xccdf eval --remediate --results /tmp/test_remediation_bad_fix.out.CJvy44 /home/user/openscap/tests/API/XCCDF/unittests/test_remediation_bad_fix.xccdf.xml
    Executable: /home/user/openscap/build/utils/oscap
 Control Group: /user.slice/user-0.slice/session-3.scope
          Unit: session-3.scope
         Slice: user-0.slice
       Session: 3
     Owner UID: 0 (root)
       Boot ID: 72e2da705e284977bbdd33f6741adbe3
    Machine ID: a4187eabc7a340a1853e60d5ba212501
      Hostname: localhost.localdomain
       Storage: /var/lib/systemd/coredump/core.oscap.0.72e2da705e284977bbdd33f6741adbe3.30187.1538390877000000.lz4
       Message: Process 30187 (oscap) of user 0 dumped core.
                
                Stack trace of thread 30197:
                #0  0x00007f7414528eab raise (libc.so.6)
                #1  0x00007f74145135b9 abort (libc.so.6)
                #2  0x00007f7414513491 __assert_fail_base.cold.0 (libc.so.6)
                #3  0x00007f7414521612 __assert_fail (libc.so.6)
                #4  0x00007f741630acbe __pthread_tpp_change_priority (libpthread.so.0)
                #5  0x00007f74172be278 n/a (/home/user/openscap/build/src/libopenscap.so.24.0.0)
                #6  0x00007f74172c0537 n/a (/home/user/openscap/build/src/libopenscap.so.24.0.0)
                #7  0x00007f74172c16fb n/a (/home/user/openscap/build/src/libopenscap.so.24.0.0)
                #8  0x00007f74172c1962 n/a (/home/user/openscap/build/src/libopenscap.so.24.0.0)
                #9  0x00007f74172b5451 n/a (/home/user/openscap/build/src/libopenscap.so.24.0.0)
                #10 0x00007f74162ff594 start_thread (libpthread.so.0)
                #11 0x00007f74145ebe6f __clone (libc.so.6)
                
                Stack trace of thread 30187:
                #0  0x00007f7416ea1dc4 nodePop (libxml2.so.2)
                #1  0x00007f7416f630ee xmlSAX2EndElementNs (libxml2.so.2)
                #2  0x00007f7416eaddd6 xmlParseEndTag2 (libxml2.so.2)
                #3  0x00007f7416eb291c xmlParseElement (libxml2.so.2)
                #4  0x00007f7416eb2006 xmlParseContent (libxml2.so.2)
                #5  0x00007f7416eb28b9 xmlParseElement (libxml2.so.2)
                #6  0x00007f7416eb2006 xmlParseContent (libxml2.so.2)
                #7  0x00007f7416eb28b9 xmlParseElement (libxml2.so.2)
                #8  0x00007f7416eb2006 xmlParseContent (libxml2.so.2)
                #9  0x00007f7416eb28b9 xmlParseElement (libxml2.so.2)
                #10 0x00007f7416eb2006 xmlParseContent (libxml2.so.2)
                #11 0x00007f7416eb28b9 xmlParseElement (libxml2.so.2)
                #12 0x00007f7416eb2006 xmlParseContent (libxml2.so.2)
                #13 0x00007f7416eb28b9 xmlParseElement (libxml2.so.2)
                #14 0x00007f7416eb2006 xmlParseContent (libxml2.so.2)
                #15 0x00007f7416eb28b9 xmlParseElement (libxml2.so.2)
                #16 0x00007f7416eb2006 xmlParseContent (libxml2.so.2)
                #17 0x00007f7416eb28b9 xmlParseElement (libxml2.so.2)
                #18 0x00007f7416eb2fba xmlParseDocument (libxml2.so.2)

Backtrace from gdb:

#0  0x00007f7414528eab in raise () at /lib64/libc.so.6
#1  0x00007f74145135b9 in abort () at /lib64/libc.so.6
#2  0x00007f7414513491 in _nl_load_domain.cold.0 () at /lib64/libc.so.6
#3  0x00007f7414521612 in  () at /lib64/libc.so.6
#4  0x00007f741630acbe in  () at /lib64/libpthread.so.0
#5  0x00007f74172be278 in SEAP_desc_unlock (m=0x7f73f80025a8) at /home/user/openscap/src/OVAL/probes/SEAP/seap-descriptor.h:118
#6  0x00007f74172c0537 in SEAP_packet_send (ctx=0x7f73f8000b50, sd=32, packet=0x7f73f4002da0) at /home/user/openscap/src/OVAL/probes/SEAP/seap-packet.c:878
#7  0x00007f74172c16fb in SEAP_sendmsg (ctx=0x7f73f8000b50, sd=32, seap_msg=0x7f73f4002ce0) at /home/user/openscap/src/OVAL/probes/SEAP/seap.c:419
#8  0x00007f74172c1962 in SEAP_reply (ctx=0x7f73f8000b50, sd=32, rep_msg=0x7f73f4002ce0, req_msg=0x1fce8b0) at /home/user/openscap/src/OVAL/probes/SEAP/seap.c:477
#9  0x00007f74172b5451 in probe_worker_runfn (arg=0x1fd6ad0) at /home/user/openscap/src/OVAL/probes/probe/worker.c:126
#10 0x00007f74162ff594 in start_thread () at /lib64/libpthread.so.0
#11 0x00007f74145ebe6f in clone () at /lib64/libc.so.6

Expected Results:

no crash

Additional Information / Debugging Steps:

@matejak suggested using https://divine.fi.muni.cz/ to investigate

@jan-cerny jan-cerny added the bug label Oct 1, 2018
@jan-cerny jan-cerny added this to the 1.3.0_alpha3 milestone Oct 1, 2018
This was referenced Oct 1, 2018
@jan-cerny
Member Author

I ran git bisect and it pointed to 16995cb

@jan-cerny
Member Author

The problem is less likely to happen if you don't parallelize ctest, i.e. if you run it on a single core (without the -j option).

jan-cerny added a commit to jan-cerny/openscap that referenced this issue Oct 3, 2018
This will inform us when any of the locks fails.
It can help us with debugging OpenSCAP#1199.
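
The kind of check that commit message describes could look roughly like the sketch below. This is only an illustration, not the code from the referenced commit: the helper names checked_lock, checked_unlock and init_errorcheck_mutex are hypothetical, and using a PTHREAD_MUTEX_ERRORCHECK mutex is an assumption made here so that misuse comes back as an error code instead of undefined behavior.

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: create a mutex of type PTHREAD_MUTEX_ERRORCHECK,
 * so that relocking or unlocking by a non-owner returns an error code. */
int init_errorcheck_mutex(pthread_mutex_t *m)
{
    pthread_mutexattr_t attr;
    int rc;

    assert(m != NULL);
    rc = pthread_mutexattr_init(&attr);
    if (rc != 0)
        return rc;
    rc = pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ERRORCHECK);
    if (rc == 0)
        rc = pthread_mutex_init(m, &attr);
    pthread_mutexattr_destroy(&attr);
    return rc;
}

/* Hypothetical helper: lock and log the error code if locking fails. */
int checked_lock(pthread_mutex_t *m, const char *where)
{
    int rc;

    assert(m != NULL);
    rc = pthread_mutex_lock(m);
    if (rc != 0)
        fprintf(stderr, "%s: pthread_mutex_lock failed: %s\n",
                where, strerror(rc));
    return rc;
}

/* Hypothetical helper: unlock and log the error code if unlocking fails. */
int checked_unlock(pthread_mutex_t *m, const char *where)
{
    int rc;

    assert(m != NULL);
    rc = pthread_mutex_unlock(m);
    if (rc != 0)
        fprintf(stderr, "%s: pthread_mutex_unlock failed: %s\n",
                where, strerror(rc));
    return rc;
}
```

With an error-checking mutex, calling checked_unlock on a mutex the calling thread does not hold returns EPERM, and locking it a second time from the same thread returns EDEADLK, so both mistakes show up in the log. Checks like these cannot detect a mutex whose memory has already been freed or overwritten.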
@jan-cerny
Member Author

I think that the commit found by git bisect is not the right commit.

@jan-cerny jan-cerny modified the milestones: 1.3.0_alpha3, 1.3.0, 1.3.1 Oct 8, 2018
@matejak
Contributor

matejak commented Oct 12, 2018

@jan-cerny jan-cerny modified the milestones: 1.3.1, 1.3.2 Jun 13, 2019
@evgenyz evgenyz modified the milestones: 1.3.2, 1.3.3 Jan 14, 2020
@evgenyz evgenyz modified the milestones: 1.3.3, 1.3.4 Apr 29, 2020
@evgenyz
Contributor

evgenyz commented Jul 10, 2020

@jan-cerny I can't reproduce the bug in the latest maint-1.3, can you please give it another try? Maybe something is different in my environment.

@evgenyz evgenyz modified the milestones: 1.3.4, 1.3.5 Oct 1, 2020
@evgenyz evgenyz modified the milestones: 1.3.5, 1.3.6 Apr 23, 2021
@evgenyz evgenyz removed this from the 1.3.6 milestone Jan 19, 2022
@evgenyz evgenyz added this to the 1.3.7 milestone Jan 19, 2022
@jan-cerny jan-cerny modified the milestones: 1.3.7, 1.3.8 Jan 26, 2023