Skip to content

Clarify that RFC1520 does not permit the compiler to replace calls to Clone::clone with a memcpy #3197

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

eggyal
Copy link

@eggyal eggyal commented Nov 19, 2021

Following discussion on Zulip referencing a thread on the Rust subreddit and a previous discussion on IRLO, this PR clarifies that RFC1520 only affects the assumptions that users may make about implementations of Clone for Copy types and does not affect the assumptions that the compiler will make when compiling calls to Clone::clone. In particular, the complier is not permitted by this RFC to replace calls to Clone::clone with a memcpy.

I believe that this was always the intention of the RFC from the outset, but the wording left some doubt.

@the8472 the8472 added the T-libs-api Relevant to the library API team, which will review and decide on the RFC. label Nov 20, 2021
@petrochenkov
Copy link
Contributor

Are built-in derives count as a compiler or as a (proc macro) library here?
IIRC, they do Copy-based optimizations.

@eggyal eggyal force-pushed the clarify-rfc1520-is-library-only branch from 9f26c96 to dde2b71 Compare November 20, 2021 09:01
@eggyal eggyal changed the title Clarify that RFC1520 only relates to libraries Clarify that RFC1520 does not permit the compiler to replace calls to Clone::clone with a memcpy Nov 20, 2021
@eggyal
Copy link
Author

eggyal commented Nov 20, 2021

Are built-in derives count as a compiler or as a (proc macro) library here?
IIRC, they do Copy-based optimizations.

Are you referring to the implementation of #[derive(Clone)]? Is that affected by this RFC? Implementors of Clone for a Copy type have always had the freedom to use memcpy, and indeed that has always been encouraged: this RFC just affects whether users can assume that to be the implementation decision.

I have nevertheless reworded to refer to users instead of libraries, which is probably more correct anyway; and I think that should also cover (any other) built-in derive macros that rely on some Copy type's Clone implementation?

@eggyal eggyal force-pushed the clarify-rfc1520-is-library-only branch from dde2b71 to 363213c Compare November 20, 2021 10:56
@workingjubilee
Copy link
Member

With respect, I do not understand how Ralf's claim can be simultaneously true and not also affect whether it is valid for a user to use memcpy on a Copy data structure.

To overload this to reductio ad absurdum: If unsafe { } is never allowed to reason based on a safe impl, then it is not allowed to reason based on the exposed API surface of, say, integers, which is also safe code, whereas a lot of unsafe code in practice probably relies on basic arithmetic working correctly.

@Gankra
Copy link
Contributor

Gankra commented Nov 23, 2021

This is the subject of the very first page of the Rustonomicon. Excerpting:


No matter what, Safe Rust can't cause Undefined Behavior.

The design of the safe/unsafe split means that there is an asymmetric trust relationship between Safe and Unsafe Rust. Safe Rust inherently has to trust that any Unsafe Rust it touches has been written correctly. On the other hand, Unsafe Rust cannot trust Safe Rust without care.

As an example, Rust has the PartialOrd and Ord traits to differentiate between types which can "just" be compared, and those that provide a "total" ordering (which basically means that comparison behaves reasonably).

BTreeMap doesn't really make sense for partially-ordered types, and so it requires that its keys implement Ord. However, BTreeMap has Unsafe Rust code inside of its implementation. Because it would be unacceptable for a sloppy Ord implementation (which is Safe to write) to cause Undefined Behavior, the Unsafe code in BTreeMap must be written to be robust against Ord implementations which aren't actually total — even though that's the whole point of requiring Ord.

The Unsafe Rust code just can't trust the Safe Rust code to be written correctly. That said, BTreeMap will still behave completely erratically if you feed in values that don't have a total ordering. It just won't ever cause Undefined Behavior.

One may wonder, if BTreeMap cannot trust Ord because it's Safe, why can it trust any Safe code? For instance BTreeMap relies on integers and slices to be implemented correctly. Those are safe too, right?

The difference is one of scope. When BTreeMap relies on integers and slices, it's relying on one very specific implementation. This is a measured risk that can be weighed against the benefit. In this case there's basically zero risk; if integers and slices are broken, everyone is broken. Also, they're maintained by the same people who maintain BTreeMap, so it's easy to keep tabs on them.

On the other hand, BTreeMap's key type is generic. Trusting its Ord implementation means trusting every Ord implementation in the past, present, and future. Here the risk is high: someone somewhere is going to make a mistake and mess up their Ord implementation, or even just straight up lie about providing a total ordering because "it seems to work". When that happens, BTreeMap needs to be prepared.

The same logic applies to trusting a closure that's passed to you to behave correctly.

This problem of unbounded generic trust is the problem that unsafe traits exist to resolve.

.....

The decision of whether to mark a trait unsafe is an API design choice. A safe trait is easier to implement, but any unsafe code that relies on it must defend against incorrect behavior. Marking a trait unsafe shifts this responsibility to the implementor. Rust has traditionally avoided marking traits unsafe because it makes Unsafe Rust pervasive, which isn't desirable.

Send and Sync are marked unsafe because thread safety is a fundamental property that unsafe code can't possibly hope to defend against in the way it could defend against a buggy Ord implementation. Similarly, GlobalAllocator is keeping accounts of all the memory in the program and other things like Box or Vec build on top of it. If it does something weird (giving the same chunk of memory to another request when it is still in use), there's no chance to detect that and do anything about it.

The decision of whether to mark your own traits unsafe depends on the same sort of consideration. If unsafe code can't reasonably expect to defend against a broken implementation of the trait, then marking the trait unsafe is a reasonable choice.

As an aside, while Send and Sync are unsafe traits, they are also automatically implemented for types when such derivations are provably safe to do. Send is automatically derived for all types composed only of values whose types also implement Send. Sync is automatically derived for all types composed only of values whose types also implement Sync. This minimizes the pervasive unsafety of making these two traits unsafe. And not many people are going to implement memory allocators (or use them directly, for that matter).

This is the balance between Safe and Unsafe Rust. The separation is designed to make using Safe Rust as ergonomic as possible, but requires extra effort and care when writing Unsafe Rust. The rest of this book is largely a discussion of the sort of care that must be taken, and what contracts Unsafe Rust must uphold.

@Gankra
Copy link
Contributor

Gankra commented Nov 23, 2021

Now, Copy is a genuinely interesting trait in this regard. In some sense, you could argue that it should be an unsafe trait, for much the same reason Send and Sync should be: it is making a hard guarantee, and an incorrect implementation obviously could cause Undefined Behaviour. For instance you could mark Vec as Copy and that would be unsound.

But I would argue that the situation is a bit different. Send and Sync are marked unsafe and automatically derived when possible. When the compiler detects that your type cannot be automatically Send (e.g. because you contain a non-Send type) you have the option to override that determination and manually unsafe impl the trait!

Copy on the other hand must be manually derived, and the compiler will refuse to allow you to implement Copy when it detects a dubious situation:

error[E0184]: the trait `Copy` may not be implemented for this type; the type has a destructor
 --> src/main.rs:9:1
  |
9 | impl Copy for Foo {}
  | ^^^^^^^^^^^^^^^^^^^^ Copy not allowed on types with destructors

So you can't compose a bunch of safe code and toss in an impl Copy and get something unsound. And for most safe-abstractions-wrapping-unsafe-code, the compiler will hard reject an implementation of Copy on the grounds that you wrote a destructor for this type (so you cannot for instance mark Vec as Copy).

So what is an unsound impl of Copy? It would be a type that doesn't have a destructor but which otherwise has properties that are trusted by unsafe code, and which would be violated by duplicating without any additional book-keeping. For instance, a phantom token to validate a sequence of events:

fn step1() -> Step1Token;
fn step2(Step1Token);

If step2 contains unsafe code that requires step1 to have been run beforehand, Copying Step1Token would be unsound...

But so would Cloneing it. So would a Default implementation. So would pub fn new() -> Step1Token.

This is the exact same situation as unsafe code in Vec needing to rely on field privacy for correctness: An unsafe code author exposing a safe interface must wholistically design that interface for correctness. If you make the fields of Vec pub, it's not a flaw in pub that I can set vec.capacity=99999 -- it's a flaw in your design.

Similarly, if you make a key type in your design Copy when it shouldn't be, that's not a flaw in Copy, that's a flaw in your unsafe design. And if you're not writing unsafe I see no way for an incorrect Copy implementation to cause memory safety problems.

@workingjubilee
Copy link
Member

Mmm, I understand that but I am not as confident about that specifically applying here because:

  • Copy functionally "makes a statement" that Clone is equivalent to a bitwise copy,
  • and this RFC specifically doubled down and asserted that users can reason about it thusly.
  • So my question is: how is the compiler not "simply" an "exceedingly angelic user"?

That may sound deranged, but given we're functionally asking the compiler to look at our code and rewrite it to a better implementation, we need to ask that question... especially if we (or, say, LLVM, GCC, or Cranelift, or any other codegen backend) are also implementing optimizations based on what reasoning we can normally apply to Rust.

"We have met the compiler, and it is us."

@CryZe
Copy link

CryZe commented Nov 23, 2021

The reason probably is because it has been allowed thus far to have custom Clone impls, so it would be surprising if the compiler suddenly wouldn't start calling them anymore. But is this really something that needs to be kept? Maybe in Rust 2024 or so we could just have Copy prevent a manual Clone impl and always allow the compiler to optimize the call into a memcpy. Unless there's a good reason to allow manual Clone impls, but I currently don't see it.

@Gankra
Copy link
Contributor

Gankra commented Nov 23, 2021

So I agree that if you write a type with a non-trivial Clone that is still Copy, you are doing something Very Dubious and Probably Unsound. But you could certainly contrive situations where it's "basically fine". For instance you could add logging statements to your Clone impl to try to get debug something, while keeping the Clone otherwise trivial.

In this case it may be impossible to remove the Copy implementation since random pieces of code are relying on it, but you "know" it will be Actually Clone'd in some place you're interested in. Now it would be frustating but understandable if e.g. Vec::Clone used specialization to skip your logging Clones -- you told it that it could. But it would be a whole other level of cursed if it was literally impossible to execute your logging clones because the compiler kept removing them.

So I think this is less of a soundness thing and more of a "this will be really cursed, we refuse to do this" thing.

(You could contrive some example where Copying is fine but you're relying on a non-trivial Clone Definitely Happening in a specific place, but, that's playing with a lot of Fire and even I'm not willing to endorse that level of evil.)

@CryZe
Copy link

CryZe commented Nov 23, 2021

For instance you could add logging statements to your Clone impl to try to get debug something, while keeping the Clone otherwise trivial.

Yeah I was actually about to bring up logging too... but yeah it's not even guaranteed that anyone even explicitly calls .clone() on your type, so you are missing like half the logging anyway, so you are much better off with a non-Copy newtype wrapper that does the logging consistently instead. So not only does this prevent optimizations, but it also is just a big footgun if you actually do implement Clone, that honestly at least clippy should lint against it.

So I'm still under the opinion that there's no good use case for this and at least Rust 2024 should just not allow manual Clone impls at all (and always optimize).

@workingjubilee
Copy link
Member

I would hesitate to block manually implementing Clone where Copy is implemented simply because that would interfere with cases which have very nuanced where-clause bounds on the impl.

@CryZe
Copy link

CryZe commented Nov 23, 2021

I mean the compiler could just "auto implement it" and the trait resolver will realize there's 2 conflicting implementations and then fail compiling.

@Diggsey
Copy link
Contributor

Diggsey commented Nov 23, 2021

Also, if your Clone implementation really is equivalent to a memcpy, and its implementation is visible to the current compilation step, then of course the compiler is free to replace it with a memcpy regardless, just via the normal optimization passes.

So the only place this makes a difference is when the Clone implementation actually does something different from Copy, or the compiler can't see that (ie. because it's in a different crate, is not generic or marked inline, and link-time code generation is not being used).

@eggyal
Copy link
Author

eggyal commented Nov 23, 2021

My thinking behind this PR was that it cannot be satisfactory for the compiler to be free to perform this optimisation but make no guarantee over whether it will: surely its behaviour needs to be certain one way or the other?

Yet to guarantee that it will perform this optimisation is a significant change that I imagine requires the approval of at very least the lang team, and possibly also the compiler team? It would also be a breaking change that I presume would not be admitted into stable until a new edition.

My conclusion was therefore that, for the time being, the compiler surely guarantees that it won't perform this optimisation? This PR merely clarifies that status quo in (what I considered to be) an otherwise ambiguous RFC.

I guess the question of whether a future edition should handle things differently can be deferred (and indeed I see that Niko historically blogged various thoughts around how specialisation could enable libcore to provide a blanket impl<T> Clone for T where T: Copy, so that might ultimately render the problem moot anyway).

@RalfJung
Copy link
Member

RalfJung commented Nov 25, 2021

I do not understand how Ralf's claim can be simultaneously true and not also affect whether it is valid for a user to use memcpy on a Copy data structure.

Copy just means that if you do a bytewise copy of a value of this type, then afterwards both the original value and the newly created copy satisfy the safety invariant of this type. (Safety invariants can claim "ownership", so they are not inherently "duplicable": Just because the invariant holds for a sequence of bytes, doesn't mean it holds separately for that same sequence of bytes located in two different places.)

This is a core safety contract, akin to the contract of Send or Sync. Copy is a safe trait for other reasons, as @Gankra elaborated.

There is absolutely no logical connection between a type being Copy, and anything that has to do with the Clone trait. In particular, the above property of being Copy is totally compatible with Clone::clone printing something to stdout, and if that is the case and the user calls clone, then this program must print to stdout. At least I can see no way to logically derive from "type T is Copy" the fact that "it"s okay to assume T::clone will never print" (and this is the kind of derivation it would take to give the compiler license to do such a replacement).

Put differently: the idea that Clone::clone is equivalent to a memcpy is exactly the same kind of idea which says that PartialEq must be symmetric, so eq(x, y) is equivalent eq(y, x). Users are allowed to assume that equivalence, but not in a way that could introduce unsafety. The compiler doing a replacement falls in the same category as "relying on this property in a way that could introduce unsafety". The same is true for clone.

IOW, the following function is not sound (and AFAIK there is widespread consensus for that):

fn test_eq<T: PartialEq>(x: &T, y: &T) {
  if (x == y) != (y == x) {
    unsafe { unreachable_unchecked(); }
  }
}

For the same reason, it is not sound for the compiler to replace x == y by y == x. After all, the following function is sound:

fn test_eq<T: PartialEq>(x: &T, y: &T) {
  let eq = (x == y);
  if eq != eq {
    unsafe { unreachable_unchecked(); }
  }
}

If the compiler could assume that PartialEq::eq is pure and symmetric, it could transform the latter program into the former, which would introduce UB into a UB-free program. The situation is the same with replacing clone by memcpy.

@danielhenrymantilla
Copy link

danielhenrymantilla commented Nov 25, 2021

So my question is: how is the compiler not "simply" an "exceedingly angelic user"?

I think the answer is that it should always be possible for somebody to unambiguously call an impl they've just written: any explicit call of <YourType as Clone>::clone() (or, equivalently, <GenericTyThatHappensToBeInstanceWithYourType as Clone>>clone()) ought to call function defined in impl Clone for YourType. There is no instance in all of Rust that breaks this property.

Now, you may consider that this property, at least for the specific case of Clone::clone, may not be that important, compared to the missed optimizations of not applying k#clone x optimizations (I'm using this syntax for a hypothetical built-in that would, on the contrary, instruct the compiler that it would be allowed to perform a Copy rather than a Clone when applicable):

  • So, one the one hand, the position that if somebody writes impl<…> Clone for MyType<…> { … }, it should be possible for people to call the logic in that impl, even when MyType<…> happens to be Copy.

  • On the other hand, the position that it should be possible to write <T : Clone>-generic code that happens to be Copy-optimized when applicable.

What way to go?

  • the caveats of going with the former, is that specialization is required for the latter, which means having to wait if on stable Rust, or fall back to poor man's specialization (aside: with poor man's specialization it is already possible for user code to provide a #[derive(Clone)] proc-macro implementation which is optimized for the Copy case; cc @petrochenkov).

  • the caveats of favoring the latter would be to have broken that keystone in Rust of "calling a trait's method for some type will necessarily call the function written in the impl of that type for that trait".

Those are the facts.

Imho, the caveats of "having to use specialization" to elide the Clone (as a lib, thus) seem to be smaller than the others. Hence the stance made officially public by this very PR, I'd say.

@ijackson
Copy link

If only we had decided a long time ago that the compiler was entitled to remove clone/drop pairs.

@RalfJung
Copy link
Member

If only we had decided a long time ago that the compiler was entitled to remove clone/drop pairs.

I think that would have been a mistake -- I think the basic property that when you call a function, that function actually gets called, is too important to give up on. (This would be in a sense 'worse' than C++ copy elision, where -- to my knowledge -- there is a guarantee that the function [the copy constructor] will not be called in certain syntactic forms. In a sense that is just "weird syntax". But here we'd be talking about the compiler deciding to call the function or not, depending on arbitrary factors that can change with every rebuild.)

@comex
Copy link

comex commented Nov 29, 2021

(This would be in a sense 'worse' than C++ copy elision, where -- to my knowledge -- there is a guarantee that the function [the copy constructor] will not be called in certain syntactic forms. In a sense that is just "weird syntax". But here we'd be talking about the compiler deciding to call the function or not, depending on arbitrary factors that can change with every rebuild.)

Your expectations of C++ are too high ;)

It actually has both mandatory and non-mandatory copy elision in different situations. (Don’t be fooled by that page describing the latter as an “optimization”; it’s not just talking about standard compiler optimizations that might happen to remove a copy if it has no observable side effects. It really is a rule that the compiler decides whether to call the function or not.)

@dlight
Copy link

dlight commented Dec 2, 2021

If only we had decided a long time ago that the compiler was entitled to remove clone/drop pairs.

I think that would have been a mistake -- I think the basic property that when you call a function, that function actually gets called, is too important to give up on.

Then maybe the mistake was allowing a manual Clone impl on Copy types? The way the docs are worded make me think that the only valid manual impl is essentially equivalent to the automatically derived impl anyway.

Types that are Copy should have a trivial implementation of Clone. More formally: if T: Copy, x: T, and y: &T, then let x = y.clone(); is equivalent to let x = *y;. Manual implementations should be careful to uphold this invariant; however, unsafe code must not rely on it to ensure memory safety.

If, back then, manual Clone impls were forbidden for Copy types, then unsafe code could have relied on this property.

@Diggsey
Copy link
Contributor

Diggsey commented Dec 2, 2021

@dlight I don't think there's actually a problem here - this issue is about the compiler being able to elide explicit calls to Clone.

As for unsafe code: unsafe code is allowed to do anything safe code can do (obviously) so if unsafe code knows that a type is Copy, then it can do a memcpy, regardless of if there is a different Clone implementation - the only time this would be questionable is if the interface to the unsafe code did not expose the fact that it relies on a Copy bound or if it makes some explicit claim about calling Clone which is not upheld.

@eggyal eggyal force-pushed the clarify-rfc1520-is-library-only branch 2 times, most recently from 8e76b0e to 50571f8 Compare December 2, 2021 13:33
@eggyal eggyal force-pushed the clarify-rfc1520-is-library-only branch from 50571f8 to 9aeaf42 Compare December 2, 2021 13:38
@workingjubilee
Copy link
Member

I think my questions have been satisfied. My understanding is now:

  • That I still think the "common sensical" inclination of Rust programmers (whether or not it truly aligns with the abstract machine and the burdens the compiler are under) is to think of Copy as "superseding" Clone, and that programmers would prefer that the compiler elide Clone::clone more often, not less.
  • However, the compiler, being in fact exceedingly angelic, also has the power to gaze inside an impl of Clone::clone in order to verify rewriting the function call into a simple bit-copy is reasonable, so the alternative of eliding a clone without doing so is actually, when considering the full knowledge of the compiler, equivalent to avoiding executing an otherwise correct program to its conclusion, and thus would turn angelic powers to demonic purposes. The compiled code might go faster, but it risks violating any more-carefully considered relationship of effects.
  • So, as programmers are likely to find this unintuitive either way, we might as well choose the version that does not compromise correctness, and in practice doesn't actually hinder the compiler in the usual case of #[derive(Clone,Copy)], nor does it hinder programmers actually writing the optimization themselves ever.
  • Nonetheless, anyone who does explicitly impl Clone for CopyableType in a way that doesn't just copy the data type should consider their efforts to be futile in the "anyone using this type can always bypass Clone::clone, so why are you bothering?" sense... but it is useful for some niche debugging purposes, and we can't truly stop anyone from writing silly code.
  • ...For now, anyways.

@Aaron1011
Copy link
Member

Aaron1011 commented Jan 28, 2022

Related: A while back, I experimented with a transformation that replaced a 'transitive Clone impl' with a bitwise Copy impl (using CloneShim): rust-lang/rust#72409. At the time, it didn't result in any significant performance improvements on our benchmark suite.

Nonetheless, anyone who does explicitly impl Clone for CopyableType in a way that doesn't just copy the data type should consider their efforts to be futile in the "anyone using this type can always bypass Clone::clone, so why are you bothering?" sense... but it is useful for some niche debugging purposes, and we can't truly stop anyone from writing silly code.

I think it would be useful to have a warn-by-default lint against such impls, to catch cases where the user may not have intended that combination (e.g. a type starts out with a custom Clone impl, and then later gets a Copy impl added).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
T-libs-api Relevant to the library API team, which will review and decide on the RFC.
Projects
None yet
Development

Successfully merging this pull request may close these issues.