Stall in opal_fifo test #3450

@nmorey

Thank you for taking the time to submit an issue!

Background information

I'm packaging Open MPI 2.1 for SUSE and am hitting a bug (probably GCC's fault) in the test suite.

What version of Open MPI are you using? (e.g., v1.10.3, v2.1.0, git branch name and hash, etc.)

Open MPI 2.1.0

Describe how Open MPI was installed (e.g., from a source/distribution tarball, from a git clone, from an operating system distribution package, etc.)

Built using tarball from the website

Please describe the system on which you are running

  • Operating system/version:
    openSUSE_Tumbleweed
    gcc (SUSE Linux) 6.3.1 20170202 [gcc-6-branch revision 245119]

  • Computer hardware:
    i586

  • Network type:


Details of the problem

Running the opal_fifo test (through make check) stalls.
After some debugging, it appears we end up stuck in an infinite loop here (opal_fifo.h:262):

        if (!opal_atomic_cmpset_ptr (&fifo->opal_fifo_tail.data.item, item, &fifo->opal_fifo_ghost)) {
            while (&fifo->opal_fifo_ghost == item->opal_list_next) {
                opal_atomic_rmb ();
            }

Looking at the generated assembly, the loop looks something like this:

=> 0x08049675 <+357>:	cmp    %edi,%eax
   0x08049677 <+359>:	je     0x8049675 <thread_test+357>

which, unless I'm mistaken, means that GCC has cached both values in registers and no longer reloads item->opal_list_next from memory, so the loop can never exit.
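For comparison, here is a minimal standalone sketch of a wait loop that forces a fresh load on every iteration. It is not the Open MPI code: `struct item`, `wait_for_successor` and the `max_spins` bound are hypothetical names introduced only for illustration, and it swaps in the GCC/Clang builtin `__atomic_load_n` instead of relying on the fence alone. The spin bound exists only so the function can be exercised from a single thread.

```c
#include <stddef.h>

/* Hypothetical stand-in for a list item; only the field the loop reads. */
struct item {
    struct item *next;
};

/*
 * Spin until it->next no longer equals `ghost`, or until max_spins
 * iterations have elapsed. Returns the observed successor, or NULL if
 * the bound was reached first.
 *
 * __atomic_load_n performs an actual load on each iteration, so the
 * compiler may not keep the value in a register across iterations.
 */
static struct item *wait_for_successor(struct item *it, struct item *ghost,
                                       unsigned long max_spins)
{
    for (unsigned long i = 0; i < max_spins; ++i) {
        struct item *next = __atomic_load_n(&it->next, __ATOMIC_ACQUIRE);
        if (next != ghost) {
            return next;
        }
    }
    return NULL;
}
```

With this shape the comparison in the generated code should be preceded by a load from memory on each pass, rather than comparing two registers that never change.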

The opal_atomic_rmb used here comes from the GCC builtins implementation.
Simply adding a compiler barrier to it:

diff --git a/opal/include/opal/sys/gcc_builtin/atomic.h b/opal/include/opal/sys/gcc_builtin/atomic.h
index 82b75f47d8..eea743503c 100644
--- a/opal/include/opal/sys/gcc_builtin/atomic.h
+++ b/opal/include/opal/sys/gcc_builtin/atomic.h
@@ -51,6 +51,9 @@ static inline void opal_atomic_mb(void)
 
 static inline void opal_atomic_rmb(void)
 {
+#if OPAL_ASSEMBLY_ARCH == OPAL_IA32
+    __asm__ __volatile__("": : :"memory");
+#endif
     __atomic_thread_fence (__ATOMIC_ACQUIRE);
 }

fixes the issue.

This really seems like a GCC bug, but I figured it might be worth notifying you.
