Skip to content

Add singlethreaded (compiler-only) fences. #888

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
Apr 6, 2015
Merged
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
65 changes: 65 additions & 0 deletions text/0000-compiler-fence-intrinsics.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,65 @@
- Feature Name: compiler_fence_intrinsics
- Start Date: 2015-02-19
- RFC PR: (leave this empty)
- Rust Issue: (leave this empty)

# Summary

Add intrinsics for single-threaded memory fences.

# Motivation

Rust currently supports memory barriers through a set of intrinsics,
`atomic_fence` and its variants, which generate machine instructions and are
suitable as cross-processor fences. However, there is currently no compiler
support for single-threaded fences which do not emit machine instructions.

Certain use cases require that the compiler not reorder loads or stores across a
given barrier but do not require a corresponding hardware guarantee, such as
when a thread interacts with a signal handler which will run on the same thread.
By omitting a fence instruction, relatively costly machine operations can be
avoided.

The C++ equivalent of this feature is `std::atomic_signal_fence`.

# Detailed design

Add four language intrinsics for single-threaded fences:

* `atomic_compilerfence`
* `atomic_compilerfence_acq`
* `atomic_compilerfence_rel`
* `atomic_compilerfence_acqrel`

These have the same semantics as the existing `atomic_fence` intrinsics but only
constrain memory reordering by the compiler, not by hardware.

The existing fence intrinsics are exported in libstd with safe wrappers, but
this design does not export safe wrappers for the new intrinsics. The existing
fence functions will still perform correctly if used where a single-threaded
fence is called for, but with a slight reduction in efficiency. Not exposing
these new intrinsics through a safe wrapper reduces the possibility for
confusion on which fences are appropriate in a given situation, while still
providing the capability for users to opt in to a single-threaded fence when
appropriate.

# Alternatives

* Do nothing. The existing fence intrinsics support all use cases, but with a
negative impact on performance in some situations where a compiler-only fence
is appropriate.

* Recommend inline assembly to get a similar effect, such as `asm!("" :::
"memory" : "volatile")`. LLVM provides an IR item specifically for this case
(`fence singlethread`), so I believe taking advantage of that feature in LLVM is
most appropriate, since its semantics are more rigorously defined and less
likely to yield unexpected (but not necessarily wrong) behavior.

# Unresolved questions

These intrinsics may be better represented with a different name, such as
`atomic_signal_fence` or `atomic_singlethread_fence`. The existing
implementation of atomic intrinsics in the compiler precludes the use of
underscores in their names and I believe it is clearer to refer to this
construct as a "compiler fence" rather than a "signal fence" because not all use
cases necessarily involve signal handlers, hence the current choice of name.