[LangRef] Mention allocation elision#177592
Conversation
|
@llvm/pr-subscribers-llvm-ir Author: Nikita Popov (nikic) Changesallockind / alloc-family enable allocation elision, but this was not previously mentioned by LangRef. Related discussion: https://discourse.llvm.org/t/rfc-clarifying-semantic-assumptions-for-custom-allocators/89469 cc @RalfJung I've documented this in terms of optimization, but if desired I could define this more operationally with non-determinism. Full diff: https://github.com/llvm/llvm-project/pull/177592.diff 1 Files Affected:
diff --git a/llvm/docs/LangRef.rst b/llvm/docs/LangRef.rst
index 103058d161f86..e77aab481c660 100644
--- a/llvm/docs/LangRef.rst
+++ b/llvm/docs/LangRef.rst
@@ -2077,6 +2077,15 @@ For example:
The first three options are mutually exclusive, and the remaining options
describe more details of how the function behaves. The remaining options
are invalid for "free"-type functions.
+
+ Calls to functions annotated with ``allockind`` are subject to allocation
+ elision: Calls to allocator functions can be removed, and the allocation
+ served from a virtual allocator instead. Notably, this is allowed even if
+ the allocator calls have side-effects.
+
+ If multiple allocation functions operate on the same allocation (for
+ example, an "alloc" followed by "free"), allocation elision is only allowed
+ if all involved functions have the same ``"alloc-family"``.
``"alloc-variant-zeroed"="FUNCTION"``
This attribute indicates that another function is equivalent to an allocator function,
but returns zeroed memory. The function must have "zeroed" allocation behavior,
|
It will not surprise you that I would prefer a more operational specification. ;) |
|
|
||
| If multiple allocation functions operate on the same allocation (for | ||
| example, an "alloc" followed by "free"), allocation elision is only allowed | ||
| if all involved functions have the same ``"alloc-family"``. |
There was a problem hiding this comment.
I also presume that elision will then replace either all or none of the operations?
There was a problem hiding this comment.
So I initially added wording for that, but then realized that this is somewhat misleading. It's not really all or nothing, but rather "pairwise". For example, if we have an alloc + realloc + free, we can convert that into alloc + free (https://c.godbolt.org/z/Ghc6e9de8). I think we should also be allowed to convert alloc + realloc into alloc (though I'm not sure LLVM does that one right now).
So it's basically that we can elide either a leaked alloc, or an alloc + free pair where realloc is considered a combined alloc+free. For the sake of clarity, I ended up just writing out the possible cases.
Does that sound reasonable?
There was a problem hiding this comment.
For example, if we have an alloc + realloc + free, we can convert that into alloc + free
That remaining alloc call then has the size of the realloc call, I presume? (Assuming that realloc is used to grow the allocation.)
That means the final actual call made to the allocator does not fully correspond to a call in the source program. We could try to come up with an operational approach that precisely captures this, but I wonder if it's not easier or even more accurate to say that LLVM is allowed to synthesize new calls to the allocator following the usual allocator contract?
There was a problem hiding this comment.
This makes sense, with one caveat: is it correct that we can only turn alloc + realloc + free into alloc + free when the result of realloc is not used and its effects are not visible?
There was a problem hiding this comment.
That remaining alloc call then has the size of the realloc call, I presume? (Assuming that realloc is used to grow the allocation.)
Heh, that depends on which pair we elide :) Either the alloc + realloc -> alloc, in which case the alloc has the size of the realloc, or realloc + free -> free, in which case it has the original size.
Of course, a separate sensible optimization in this space would be to shrink the allocation size if it's unused, though as far as I know LLVM doesn't do this.
That means the final actual call made to the allocator does not fully correspond to a call in the source program. We could try to come up with an operational approach that precisely captures this, but I wonder if it's not easier or even more accurate to say that LLVM is allowed to synthesize new calls to the allocator following the usual allocator contract?
Hm, I think that allowing creation of allocator calls out of thin air would be problematic for various reasons. For example, C malloc is not async signal safe, so synthesizing such a call inside a signal handler would be illegal.
Generally this seems a bit tough to specify operationally, due to the pairwise requirement. At the time of the allocator call, we don't yet know whether there is going to be a free call with matching alloc-family in the future or not.
There was a problem hiding this comment.
So the case you have in mind is something like this?
define void @test() {
%a = call ptr @alloc() allockind(alloc) "alloc-family"="foo"
%c = load i1, ptr @was.allocated
br i1 %c, label %if, label %else
if:
call ptr @dealloc(ptr allocptr %a) allockind(free) "alloc-family"="foo"
ret void
else:
call ptr @inlined_dealloc(ptr allocptr %a) allockind(free) "alloc-family"="bar"
ret void
}I don't think there needs to be a logical contradiction here. That depends on how you model it.
For example, if at the @alloc() call, we fork into two executions where:
- In one, the allocator is called,
@was.allocatedis true, we go into theifbranch, and call@dealloc(), everything is fine. - In the other, we replace with a virtual allocation,
@was.allocatedis false, we go into theelsebranch and call@inlined_dealloc(). As we replaced the original allocation in"alloc-family"="foo"and we now perform the dealloc call with a mismatching"alloc-family"="bar", we discard this execution as invalid.
The remaining execution is 1, so in this case we have one possible behavior, which is to not elide.
your proposal
FWIW, I'm not really proposing anything here, I'm just documenting existing behavior. I think it is better for us to document something, even if it is non-operational, than to just ignore it entirely.
I think the current wording here strikes the right balance between describing how this actually works, and how you would model this operationally at higher levels (like Miri) that do not have to deal with our peculiar partial inlining constraints.
There was a problem hiding this comment.
One more thing that's probably worth mentioning here is that the modeling described in my previous comment, while angelic, is "less" angelic than the inttoptr exposed provenance synthesis case. In particular I think it does not have the problematic interaction with demonic choice (though correct me if I'm wrong on that).
If we take this variant of the test case:
define void @test() {
%c = freeze i1 poison
%a = call ptr @alloc() allockind(alloc) "alloc-family"="foo"
br i1 %c, label %if, label %else
if:
call ptr @dealloc(ptr allocptr %a) allockind(free) "alloc-family"="foo"
ret void
else:
call ptr @inlined_dealloc(ptr allocptr %a) allockind(free) "alloc-family"="bar"
ret void
}Then moving the demonic freeze i1 poison below the allocation does not change the set of legal executions, unlike in the inttoptr case. That's because here there is a specific criterion for which executions are legal, rather than the criterion being "one of the executions that doesn't cause UB".
Edit: I think this specific example doesn't quite illustrate the point I'm trying to make...
There was a problem hiding this comment.
we discard this execution as invalid.
This is known as "no-behavior" (NB) and it is comparabe to angelic choice. In particular, demonic choice + no-behavior can model prophecy variables, i.e., predicting what will happen in future parts of the program (we could have a variable %b where in all "valid" executions, the value of %b is equal to something that we read from stdin later during execution: take a guess, and later if the guess was wrong, trigger NB). NB makes the semantics non-executable. NB makes reordering difficult: "NB; UB" cannot be reordered to "UB; NB" (so, potentially-NB operations have to be treated like potentially-diverging operations).
NB would basically mean that the only way to know for sure that a trace is part of the possible set of behaviors of a program is to run the program all the way to completion -- if the program never terminates (which is expected for things like servers), you can never tell if any behavior you see is actually "real". Consider that there could have been side-effects between the alloc and the free and those can differ between the elided and non-elided case; we'd basically have to "take back" side-effects that already happened if the program later reaches NB.
Another odd point is that the else branch is dead code in the sense that no "valid" execution ever goes there, and yet it cannot be replaced by unreachable.
while angelic, is "less" angelic than the inttoptr exposed provenance synthesis case
Oh yeah, the angelic inttoptr model does not actually "work" as an executable model. I view it more as a guideline to compare the actual, to-be-developed model with. OTOH the inttoptr model at least only affects abstract state the program cannot directly observe (provenance) where the NB model for allocation could end up with a program that prints to stdout only to then go "jk that execution never happened".
FWIW, I'm not really proposing anything here, I'm just documenting existing behavior. I think it is better for us to document something, even if it is non-operational, than to just ignore it entirely.
I can agree with that. :) Seems fine to document something based on transformations for now (a strict improvement over the status quo), and open an issue to track that we don't really know what this means operationally and whether it is formally consistent with everything else.
There was a problem hiding this comment.
Thanks, those are all good points. Do you have any references on the "no-behavior" concept? I'm pretty sure I've seen this before (maybe in the context of move elim?) but it's hard to find any references for NB and now it interacts with other semantics.
There was a problem hiding this comment.
It has come up in a few papers, e.g. https://sf.snu.ac.kr/publications/ccr.pdf and https://iris-project.org/pdfs/2023-popl-dimsum.pdf. But I don't know a canonical citation.
NB is basically the natural interpretation of what it means to do demonic non-deterministic choice over the empty set of possible options. (This is dual to how UB is angelic choice over the empty set.) In the study of non-determinism in general (without giving it a demonic/angelic interpretation), the "empty set of choices" has existed since ~forever; for example, it is the typical way to model doomed branches of a backtracking search that is encoded via non-deterministic exploration.
|
LGTM. |
|
LGTM. @nikic |
| * An "alloc" and "free" pair can be elided. | ||
| * A "realloc" and "free" pair can be converted into a "free" of the original | ||
| allocation. | ||
| * An "alloc" and "realloc" pair can be converted into an "alloc". |
There was a problem hiding this comment.
It's not always clear here where the conversion happens. For instance, when alloc+realloc are turned into alloc, is that put in the place of the original alloc (with the max of the sizes), or the original realloc (with some stack memory being used for the time between the original alloc and realloc)?
|
I'm not sure what specification we should use in Alive2 though. Allocations functions can have side-effects and change errno. To allow deletion, we need to model all side-effects as being non-deterministic. @RalfJung does that sound right? |
|
Yeah I would model a call to |
|
sounds, thank you! 🙏 |
allockind / alloc-family enable allocation elision, but this was not previously mentioned by LangRef. Related discussion: https://discourse.llvm.org/t/rfc-clarifying-semantic-assumptions-for-custom-allocators/89469
abceaf2 to
5a305f7
Compare
|
I've added some explicit wording on non-determinism. |
| served from a virtual allocator instead. Notably, this is allowed even if | ||
| the allocator calls have side-effects. In other words, for each allocation | ||
| there there is a non-deterministic choice between calling the allocator as | ||
| usual, or using a virtual, side-effect-free allocator instead. |
There was a problem hiding this comment.
Sorry, possibly missing some context, just wondering, am I understanding correctly that this, slightly rephrased, may imply that we substantially do not care whether there are side-effects or not for the purposes of above? That either a side-effect and a side-effect-free allocation call is fine?
There was a problem hiding this comment.
Yes, in a specific sense. It is okay to elide all of the allocator side effects. But if the allocation is not elided, we also can't ignore the side effects.
|
|
||
| Calls to functions annotated with ``allockind`` are subject to allocation | ||
| elision: Calls to allocator functions can be removed, and the allocation | ||
| served from a virtual allocator instead. Notably, this is allowed even if |
There was a problem hiding this comment.
| served from a virtual allocator instead. Notably, this is allowed even if | |
| served from a "virtual"/"built-in" allocator instead. Notably, this is allowed even if |
I think scare quotes are appropriate here to indicate that this is an abstract concept, not a real allocator.
RalfJung
left a comment
There was a problem hiding this comment.
As mentioned in a comment already:
Seems fine to document something based on transformations for now (a strict improvement over the status quo), and open an issue to track that we don't really know what this means operationally and whether it is formally consistent with everything else.
|
I've filed #184102 to track this. |
allockind / alloc-family enable allocation elision, but this was not previously mentioned by LangRef. Related discussion: https://discourse.llvm.org/t/rfc-clarifying-semantic-assumptions-for-custom-allocators/89469 The semantics here are specified in terms of allowed transforms. Making the semantics operational is tracked in llvm#184102.
allockind / alloc-family enable allocation elision, but this was not previously mentioned by LangRef. Related discussion: https://discourse.llvm.org/t/rfc-clarifying-semantic-assumptions-for-custom-allocators/89469 The semantics here are specified in terms of allowed transforms. Making the semantics operational is tracked in llvm#184102.
allockind / alloc-family enable allocation elision, but this was not previously mentioned by LangRef.
Related discussion: https://discourse.llvm.org/t/rfc-clarifying-semantic-assumptions-for-custom-allocators/89469
cc @RalfJung
I've documented this in terms of optimization, but if desired I could define this more operationally with non-determinism.