nontemporal stores behave incorrectly in their interaction with concurrency primitives #64521

Open
@RalfJung

The LLVM LangRef doesn't document how !nontemporal stores are intended to interact with concurrency primitives. The current interactions are extremely surprising, basically making !nontemporal stores even less ordered than "non-atomic" stores:

Thread A:
  store i32 %v, ptr %p, !nontemporal !13
  fence release
  // set some global flag (relaxed write)

Thread B:
  // wait till global flag is set (relaxed read)
  fence acquire
  %_0 = load i32, ptr %p, !noundef !11

According to all the usual concurrency rules, that last load must see the store. However, the way LLVM compiles this program on x86 introduces a data race: the fences become NOPs and the relaxed accesses become regular MOVs, so thread A ends up executing MOVNT; MOV, which the CPU is allowed to reorder (see e.g. this long and detailed post on MOVNT). As a result, thread B might see the flag write but then fail to see the data store!

In other words, MOVNT violates TSO, but the compilation scheme LLVM (and everyone else) uses for release/acquire synchronization relies on TSO. Together this leads to rather unpredictable semantics. Are nontemporal stores meant to completely bypass normal memory model rules (in which case they are super dangerous to use anywhere), or are they meant to follow the usual rules (in which case LLVM needs to ensure there is an sfence between each nontemporal store and later release operations)?
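For illustration, here is a minimal Rust sketch of the second option, where an sfence sits between the nontemporal store and the release operation. The function names (`producer`, `consumer`, `publish_and_read`) are hypothetical and not part of LLVM or any library, and placing the sfence by hand is an assumption about what a fixed lowering could look like, not a description of what LLVM does today. On non-x86_64 targets the sketch falls back to a plain relaxed store.

```rust
use std::sync::atomic::{fence, AtomicBool, AtomicI32, Ordering};
use std::thread;

/// Thread A: nontemporal store, then publish with a release fence and a relaxed flag write.
fn producer(data: &AtomicI32, flag: &AtomicBool, value: i32) {
    #[cfg(target_arch = "x86_64")]
    unsafe {
        use std::arch::x86_64::{_mm_sfence, _mm_stream_si32};
        // MOVNTI: a weakly ordered store that later plain stores may overtake.
        _mm_stream_si32(data.as_ptr(), value);
        // SFENCE: orders the nontemporal store before all later stores.
        // Without it, the release fence below compiles to nothing on x86.
        _mm_sfence();
    }
    // Fallback for other targets (assumption: no nontemporal store available here).
    #[cfg(not(target_arch = "x86_64"))]
    data.store(value, Ordering::Relaxed);

    fence(Ordering::Release);
    flag.store(true, Ordering::Relaxed);
}

/// Thread B: wait for the flag (relaxed), acquire fence, then read the data.
fn consumer(data: &AtomicI32, flag: &AtomicBool) -> i32 {
    while !flag.load(Ordering::Relaxed) {
        std::hint::spin_loop();
    }
    fence(Ordering::Acquire);
    data.load(Ordering::Relaxed)
}

/// Runs both threads and returns the value thread B observed.
pub fn publish_and_read(value: i32) -> i32 {
    let data = AtomicI32::new(0);
    let flag = AtomicBool::new(false);
    thread::scope(|s| {
        s.spawn(|| producer(&data, &flag, value));
        s.spawn(|| consumer(&data, &flag)).join().unwrap()
    })
}
```

With the `_mm_sfence()` call in place, the value returned by `publish_and_read(v)` should always be `v`. Removing that call reproduces the MOVNT; MOV sequence described above, although the reordering may be hard to observe in practice.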
