What exactly is going on with return places and the LHS of assignments?

This question is somewhat tied up in at least
- https://github.com/rust-lang/rust/issues/71117
- https://github.com/rust-lang/rust/issues/68364

but those issues also talk about `move` a lot, which I think is mostly orthogonal (and is tracked in https://github.com/rust-lang/unsafe-code-guidelines/issues/416).

For function calls `ret = call(args)`, in Miri we currently do something special:
The return place designated by the caller is retagged with a fresh unique (and protected) tag upon function entry.
A fresh place is allocated for the callee and used for the `_0` MIR local during function execution.
When the function is done, the value is copied from `_0` to the caller-provided return place.
The retag means that any access to the caller-provided return place is UB for the duration of the call.
This is useful to explain compilation strategies where actually no fresh memory is allocated for the callee, and `_0` directly points to the caller-provided return place.
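
As a rough illustration of the steps above, here is a toy model of the call protocol. Everything in it (`Place`, `Access`, `call`, the `protected` flag) is made up for this sketch and is not how Miri is actually implemented; in particular, a real `_0` starts out uninitialized rather than zeroed.

```rust
// Toy model only: all names here are hypothetical, not Miri's API.

#[derive(Debug, PartialEq)]
enum Access {
    Ok(i32),
    Ub, // the access would be undefined behavior in the model
}

struct Place {
    value: i32,
    // Stands in for "retagged with a fresh unique (and protected) tag".
    protected: bool,
}

impl Place {
    // A read through a pointer that existed before the call started.
    fn read_via_old_pointer(&self) -> Access {
        if self.protected {
            Access::Ub
        } else {
            Access::Ok(self.value)
        }
    }
}

// Models `ret = call(args)`. The callee body gets its own `_0`, plus a view
// of the caller's place so that the example can probe it during the call.
fn call(ret: &mut Place, body: impl FnOnce(&mut i32, &Place)) {
    ret.protected = true; // 1. retag and protect the caller-provided place
    let mut local0 = 0; // 2. fresh allocation backing `_0` (zeroed only in this toy)
    body(&mut local0, ret); // 3. run the callee
    ret.value = local0; // 4. copy `_0` into the caller-provided place
    ret.protected = false; // 5. the protection ends with the call
}
```

Calling `call` with a body that writes `42` to its `_0` and probes the caller's place via `read_via_old_pointer` yields `Access::Ub` for the probe, and `Access::Ok(42)` for a read after the call has returned.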

For MIR assignments, a [similar semantics](https://github.com/rust-lang/rust/issues/68364#issuecomment-614862820) is possible, and is intended to explain uses of `memcpy` that require the LHS and RHS to not overlap. (Miri currently does explain this, but in a somewhat subtle way: the "fallback" path for copying values that are not Scalar/ScalarPair uses a `mem_copy` that raises UB on overlaps.)
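
To make the non-overlap requirement concrete, here is a minimal sketch of an assignment that treats overlapping LHS and RHS as an error. The functions `ranges_overlap` and `assign` are hypothetical helpers for this illustration, operating on offsets into a single buffer; the `Err` stands in for UB.

```rust
// Returns true if the byte ranges [a, a+len) and [b, b+len) share any byte.
fn ranges_overlap(a: usize, b: usize, len: usize) -> bool {
    len != 0 && a < b + len && b < a + len
}

// Hypothetical model of assigning `len` bytes from offset `src` to offset
// `dst` within one buffer. Overlap is reported as an error, standing in for
// the UB that a non-overlapping `memcpy` would have.
fn assign(buf: &mut [u8], dst: usize, src: usize, len: usize) -> Result<(), &'static str> {
    if ranges_overlap(dst, src, len) {
        return Err("UB: LHS and RHS of the assignment overlap");
    }
    // The ranges are disjoint here, so this behaves like a plain memcpy.
    buf.copy_within(src..src + len, dst);
    Ok(())
}
```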

However, for function calls with custom MIR these semantics are actually insufficient: as usual with aliasing model tricks, this fails to account for pointer comparisons! If someone were to compare the address of the return place in the callee with the address of the caller-provided return place, they would always come out unequal in the model but could be equal in real codegen.
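
The following probe approximates that comparison in surface Rust. It is only an approximation: custom MIR can name the return place directly, while here the callee merely records the address of a local that it then returns, so whether that local coincides with `_0` is up to the compiler. The names `make` and `probe` are made up for this sketch, and whether the two addresses come out equal may depend on the compiler version and optimization level, so no particular result should be relied on.

```rust
// Records the address of the local that is about to be returned.
// `#[inline(never)]` keeps the call from being inlined away; it does not
// pin down where the local lives.
#[inline(never)]
fn make(callee_addr: &mut usize) -> [u8; 64] {
    let ret = [7u8; 64];
    *callee_addr = ret.as_ptr() as usize;
    ret
}

// Returns (address seen in the callee, address of the caller's destination,
// first byte of the returned value).
fn probe() -> (usize, usize, u8) {
    let mut callee_addr = 0usize;
    let dest = make(&mut callee_addr);
    let caller_addr = dest.as_ptr() as usize;
    (callee_addr, caller_addr, dest[0])
}
```

Comparing the first two components of `probe()` shows whether this particular build reused the caller's destination for the callee's local.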

We should at least come up with a spec explaining today's codegen (and implement it in Miri). I am not sure if a tighter spec is required for planned future changes.

So do we need something completely different? For function calls, if the return place is just a local, one could imagine something more extreme: that local is actually deallocated first, then a callee return place is allocated, then the function runs, then we load the return value, deallocate the callee return place, allocate the caller return place again, and store the return value in there. This is very symmetric with [some of the `move` semantics proposals](https://github.com/rust-lang/unsafe-code-guidelines/issues/416). But it only works when the caller return place can be deallocated, and it means all pointers to the caller return place become invalid, so that seems too extreme...

Another rather crude option would be to just pick non-deterministically whether the callee gets a fresh return place or a direct pointer to the caller-provided return place. This would be fine to explain what codegen does. We have to be careful with optimizations exploiting this though; once an optimization has assumed that one choice or the other was made, we have to ensure codegen is consistent with that! We probably need the aliasing tricks (to explain that in the callee, the return place is treated like a `noalias` argument) *and* this kind of non-determinism?
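
A tiny sketch of what the non-deterministic option means for pointer comparisons: both outcomes become possible behaviors, so neither codegen strategy contradicts the spec. The enum, the constants, and the function names are all hypothetical, and the addresses are plain numbers rather than real memory.

```rust
// The two choices the abstract machine could make non-deterministically.
#[derive(Clone, Copy, Debug, PartialEq)]
enum RetPlaceChoice {
    Fresh,   // the callee gets a freshly allocated return place
    InPlace, // the callee's `_0` is the caller-provided return place
}

// Made-up addresses for illustration; no real memory is involved.
const CALLER_PLACE: usize = 0x1000;
const FRESH_PLACE: usize = 0x2000;

// Address of `_0` as seen inside the callee under the given choice.
fn callee_ret_addr(choice: RetPlaceChoice) -> usize {
    match choice {
        RetPlaceChoice::Fresh => FRESH_PLACE,
        RetPlaceChoice::InPlace => CALLER_PLACE,
    }
}

// All results of "is `_0` at the caller-provided address?" that the
// non-deterministic spec would allow, one per choice.
fn possible_comparisons() -> Vec<bool> {
    [RetPlaceChoice::Fresh, RetPlaceChoice::InPlace]
        .iter()
        .map(|&choice| callee_ret_addr(choice) == CALLER_PLACE)
        .collect()
}
```

An optimization that relies on the comparison being `false` is only sound if codegen then really uses a fresh place for that call, which is the consistency concern raised above.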

For assignments, I don't think the pointer equality concern applies, so the aliasing might be enough?
