Implementation of opal atomic barriers in x86_64 #8532

@gkatev

Hi,
I'm wondering about the implementation of memory barriers (opal_atomic_) on x86_64.
With the gcc builtins, I see that opal_atomic_rmb() is implemented as:

static inline void opal_atomic_rmb(void)
{
#if OPAL_ASSEMBLY_ARCH == OPAL_X86_64
    /* work around a bug in older gcc versions where ACQUIRE seems to get
     * treated as a no-op instead of being equivalent to
     * __asm__ __volatile__("": : :"memory") */
    __atomic_thread_fence (__ATOMIC_SEQ_CST);
#else
    __atomic_thread_fence (__ATOMIC_ACQUIRE);
#endif
}

Related: #6014

On an x86_64 VM with GCC 10.2.0 and Open MPI 4.1.0 (git, debug on), I can see that the generated assembly contains an mfence (i.e. a full memory fence):

$ (gdb) x/10i opal_atomic_rmb
   0x4262 <opal_atomic_rmb>:	push   %rbp
   0x4263 <opal_atomic_rmb+1>:	mov    %rsp,%rbp
   0x4266 <opal_atomic_rmb+4>:	mfence 
   0x4269 <opal_atomic_rmb+7>:	nop
   0x426a <opal_atomic_rmb+8>:	pop    %rbp
   0x426b <opal_atomic_rmb+9>:	retq   

Relevant configure output:

checking for __atomic builtin atomics... yes

(I also observed inline mfences on another system with an older GCC (4.8.5, I think) and Open MPI 4.1.0 (non-debug).)

It is my understanding that, by default, x86_64 does not reorder load-load or load-store (the two cases that opal_atomic_rmb() is intended to cover?), so is an mfence really required here, or would a compiler barrier alone be enough?
(I haven't checked for performance differences with/without mfence.)
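
To make the question concrete, here is a minimal sketch of what a compiler-barrier-only read barrier could look like on x86_64. The names sketch_compiler_barrier(), sketch_rmb() and sketch_try_consume() are hypothetical and not part of OPAL; the sketch assumes GCC-style inline assembly and the __atomic builtins, and relies on my understanding above that the hardware already keeps load-load and load-store in order.

```c
/* Hypothetical helper, not OPAL code: a compiler-only barrier.
 * It emits no instruction; it only prevents the compiler from moving
 * memory accesses across this point. */
static inline void sketch_compiler_barrier(void)
{
    __asm__ __volatile__("" : : : "memory");
}

/* Hypothetical read barrier. On x86_64 it assumes the hardware does not
 * reorder load-load or load-store, so only the compiler is restrained.
 * Other architectures fall back to an acquire fence. */
static inline void sketch_rmb(void)
{
#if defined(__x86_64__)
    sketch_compiler_barrier();
#else
    __atomic_thread_fence(__ATOMIC_ACQUIRE);
#endif
}

/* Consumer side of a flag/payload handoff: read the flag first, then the
 * payload, with the read barrier keeping the two loads in program order.
 * Returns 1 and fills *out if the flag was set, 0 otherwise. */
static inline int sketch_try_consume(const volatile int *flag,
                                     const int *payload, int *out)
{
    if (*flag == 0) {
        return 0;
    }
    sketch_rmb();
    *out = *payload;
    return 1;
}
```

If my understanding is right, sketch_rmb() should compile to no instruction at all on x86_64, whereas the current opal_atomic_rmb() compiles to an mfence.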

Furthermore, in the non-gcc-builtin implementation of opal_atomic_mb() for x86_64, only a compiler barrier is used:

#define MB() __asm__ __volatile__("": : :"memory")
static inline void opal_atomic_mb(void)
{
    MB();
}

Considering that opal_atomic_mb() is intended as a full barrier(?), is this enough? Or is an mfence required here to protect against store-load reordering, which AFAIK x86 can and will do?
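
For comparison, this is roughly what I would expect a full barrier to look like if store-load ordering has to be enforced. It is only a sketch: sketch_full_mb() and sketch_store_then_load() are hypothetical names, not OPAL functions, and the code assumes GCC-style inline assembly.

```c
/* Hypothetical full barrier, not OPAL code. On x86_64 the mfence
 * instruction orders earlier stores before later loads at the hardware
 * level, and the "memory" clobber also restrains the compiler. */
static inline void sketch_full_mb(void)
{
#if defined(__x86_64__)
    __asm__ __volatile__("mfence" : : : "memory");
#else
    __atomic_thread_fence(__ATOMIC_SEQ_CST);
#endif
}

/* One side of the store-buffering (Dekker-style) pattern: publish our own
 * flag, then read the other thread's flag. With only a compiler barrier
 * in place of sketch_full_mb(), x86 may satisfy the load before the store
 * becomes globally visible, so two threads running this against each
 * other could both observe 0. */
static inline int sketch_store_then_load(volatile int *mine,
                                         const volatile int *theirs)
{
    *mine = 1;
    sketch_full_mb();
    return *theirs;
}
```

With MB() defined as an empty asm statement, the equivalent function would contain no fence instruction between the store and the load, which is the case I am worried about.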

Some of my sources:
https://preshing.com/20120710/memory-barriers-are-like-source-control-operations/
https://preshing.com/20120515/memory-reordering-caught-in-the-act/
