Skip to content

[LangRef] Relax semantics of writeonly / memory(write) #95238

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 3 commits into from
Jun 19, 2024

Conversation

nikic
Copy link
Contributor

@nikic nikic commented Jun 12, 2024

Instead of making writes immediate undefined behavior, consider these attributes in terms of their externally observable effects. We don't care if a location is read within the function, as long as it has no impact on observed behavior. In particular, allow:

  • Reading a location after writing it.
  • Reading a location before writing it (within the function) returns a poison value.

The latter could be further relaxed to also allow things like "reading the value and then writing it back", but I'm not sure how one would specify that operationally (so that proof checkers can verify it).

While here, also explicitly mention the fact that reads and writes to allocas and read from constant globals are memory(none).

Fixes #95152.

Instead of making writes immediate undefined behavior, consider
these attributes in terms of their externally observable effects.
We don't care if a location is read within the function, as long
as it has no impact on observed behavior. In particular, allow:

 * Reading a location after writing it.
 * Reading a location before writing it (within the function)
   returns a poison value.

The latter could be further relaxed to also allow things like
"reading the value and then writing it back", but I'm not sure
how one would specify that operationally (so that proof checkers
can verify it).

Fixes llvm#95152.
@llvmbot
Copy link
Member

llvmbot commented Jun 12, 2024

@llvm/pr-subscribers-llvm-transforms

@llvm/pr-subscribers-llvm-ir

Author: Nikita Popov (nikic)

Changes

Instead of making writes immediate undefined behavior, consider these attributes in terms of their externally observable effects. We don't care if a location is read within the function, as long as it has no impact on observed behavior. In particular, allow:

  • Reading a location after writing it.
  • Reading a location before writing it (within the function) returns a poison value.

The latter could be further relaxed to also allow things like "reading the value and then writing it back", but I'm not sure how one would specify that operationally (so that proof checkers can verify it).

While here, also explicitly mention the fact that reads and writes to allocas and read from constant globals are memory(none).

Fixes #95152.


Full diff: https://github.com/llvm/llvm-project/pull/95238.diff

1 Files Affected:

  • (modified) llvm/docs/LangRef.rst (+19-2)
diff --git a/llvm/docs/LangRef.rst b/llvm/docs/LangRef.rst
index f39b8dc6c90d4..315baad5c6e81 100644
--- a/llvm/docs/LangRef.rst
+++ b/llvm/docs/LangRef.rst
@@ -1598,8 +1598,10 @@ Currently, only the following parameter attributes are defined:
     through this pointer argument (even though it may read from the memory that
     the pointer points to).
 
-    If a function reads from a writeonly pointer argument, the behavior is
-    undefined.
+    This attribute is understood in the same way as the ``memory(write)``
+    attribute. That is, the pointer may still be read as long as the read is
+    not observable outside the function. See the ``memory`` documentation for
+    precise semantics.
 
 ``writable``
     This attribute is only meaningful in conjunction with ``dereferenceable(N)``
@@ -1973,6 +1975,21 @@ example:
     - ``memory(readwrite, argmem: none)``: May access any memory apart from
       argument memory.
 
+    The supported access kinds are:
+
+    - ``readwrite``: Any kind of access to the location is allowed.
+    - ``read``: The location is only read. Writing the location is immediate
+      undefined behavior. This includes the case where the location is read and
+      then the same value is written back.
+    - ``write``: Only writes to the location are observable outside the function
+      call. However, the function may still internally read the location after
+      writing it, as this is not observable. Reading the location prior to
+      writing it results in a poison value.
+    - ``none``: No reads or writes to the location are observed outside the
+      function. It is always valid read and write allocas, and read global
+      constants, even if ``memory(none)`` is used, as these effects are not
+      externally observable.
+
     The supported memory location kinds are:
 
     - ``argmem``: This refers to accesses that are based on pointer arguments

@nunoplopes
Copy link
Member

I understand the motivation, and the text looks good.

But this going to be painful to implement in Alive2 😅 The semantics amounts to quantifying the input memory to a function and ensuring it returns the same value/memory for every input memory. It's UB if not. Well, I guess the read(none) is similar.

Copy link
Contributor

@aeubanks aeubanks left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks, this makes sense

@nikic nikic force-pushed the langref-writeonly branch from 10c7548 to 484ab8b Compare June 17, 2024 09:15
@nikic nikic merged commit 9cbedd9 into llvm:main Jun 19, 2024
8 checks passed
@nikic nikic deleted the langref-writeonly branch June 19, 2024 07:45
AlexisPerry pushed a commit to llvm-project-tlp/llvm-project that referenced this pull request Jul 9, 2024
Instead of making writes immediate undefined behavior, consider these
attributes in terms of their externally observable effects. We don't
care if a location is read within the function, as long as it has no
impact on observed behavior. In particular, allow:

 * Reading a location after writing it.
 * Reading a location before writing it (within the function) returns a
poison value.

The latter could be further relaxed to also allow things like "reading
the value and then writing it back", but I'm not sure how one would
specify that operationally (so that proof checkers can verify it).

While here, also explicitly mention the fact that reads and writes to
allocas and read from constant globals are `memory(none)`.

Fixes llvm#95152.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[MemCpyOpt] Call slot optimization doesn't respect writeonly
5 participants