Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Lint against &T to &mut T and &T to &UnsafeCell<T> transmutes #128351

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

ChayimFriedman2
Copy link
Contributor

@ChayimFriedman2 ChayimFriedman2 commented Jul 29, 2024

Conversion from & to &mut are and always were immediate UB, and we already lint against them, but until now the lint did not catch the case were the reference was in a field.

Conversion from & to &UnsafeCell is more nuanced: Stacked Borrows makes it immediate UB, but in Tree Borrows it is sound.

However, even in Tree Borrows it is UB to write into that reference (if the original value was Freeze). In all cases crater found where the lint triggered, the reference was written into.

Lints (mutable_transmutes existed before):

The mutable_transmutes lint catches transmuting from &T to &mut T because it is undefined behavior.

Example

unsafe {
    let y = std::mem::transmute::<&i32, &mut i32>(&5);
}

Produces:

error: transmuting &T to &mut T is undefined behavior, even if the reference is unused, consider instead using an UnsafeCell
 --> .\example.rs:5:17
  |
5 |         let y = std::mem::transmute::<&i32, &mut i32>(&5);
  |                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  |
  = note: transmute from `&i32` to `&mut i32`
  = note: `#[deny(mutable_transmutes)]` on by default

Explanation

Certain assumptions are made about aliasing of data, and this transmute
violates those assumptions. Consider using UnsafeCell instead.


The unsafe_cell_transmutes lint catches transmuting or casting from &T to &UnsafeCell<T>
because it dangerous and might be undefined behavior.

Example

use std::cell::Cell;

unsafe {
    let x = 5_i32;
    let y = std::mem::transmute::<&i32, &Cell<i32>>(&x);
    y.set(6);

    let z = &*(&x as *const i32 as *const Cell<i32>);
    z.set(7);
}

Produces:

error: transmuting &T to &UnsafeCell<T> is error-prone, rarely intentional and may cause undefined behavior
 --> .\example.rs:6:17
  |
6 |         let y = std::mem::transmute::<&i32, &Cell<i32>>(&x);
  |                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  |
  = note: transmute from `*&i32` to `(*&Cell<i32>).value`
  = note: `#[deny(unsafe_cell_transmutes)]` on by default

error: transmuting &T to &UnsafeCell<T> is error-prone, rarely intentional and may cause undefined behavior
 --> .\example.rs:9:17
  |
9 |         let z = &*(&x as *const i32 as *const Cell<i32>);
  |                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  |
  = note: transmute from `i32` to `Cell<i32>.value`

Explanation

Conversion from &T to &UnsafeCell<T> might be immediate undefined behavior, depending on
unspecified details of the aliasing model.

Even if it is not, writing to it will be undefined behavior if there was no UnsafeCell in
the original T, and even if there was, it might be undefined behavior (again, depending
on unspecified details of the aliasing model).

It is also highly dangerous and error-prone, and unlikely to be useful.

Crater summary is below.

cc #111229

r? compiler

@rustbot rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels Jul 29, 2024
@rustbot
Copy link
Collaborator

rustbot commented Jul 29, 2024

The Miri subtree was changed

cc @rust-lang/miri

@oli-obk

This comment was marked as resolved.

@ChayimFriedman2 ChayimFriedman2 force-pushed the lint-transmute-unsafe-cell branch from f27dba0 to d20e89c Compare July 29, 2024 15:58
@oli-obk
Copy link
Contributor

oli-obk commented Jul 29, 2024

@bors try

@ChayimFriedman2 ChayimFriedman2 force-pushed the lint-transmute-unsafe-cell branch from d20e89c to 5c1d66e Compare July 29, 2024 16:01
bors added a commit to rust-lang-ci/rust that referenced this pull request Jul 29, 2024
@bors
Copy link
Contributor

bors commented Jul 29, 2024

⌛ Testing commit 5c1d66e with merge 94971c2...

@rust-log-analyzer

This comment has been minimized.

@rust-log-analyzer

This comment has been minimized.

@bors
Copy link
Contributor

bors commented Jul 29, 2024

💔 Test failed - checks-actions

@ChayimFriedman2 ChayimFriedman2 force-pushed the lint-transmute-unsafe-cell branch from 5c1d66e to b62dc4c Compare July 29, 2024 21:06
@rust-log-analyzer

This comment has been minimized.

@ChayimFriedman2 ChayimFriedman2 force-pushed the lint-transmute-unsafe-cell branch from b62dc4c to 59c22c3 Compare July 29, 2024 21:44
@oli-obk
Copy link
Contributor

oli-obk commented Jul 29, 2024

@bors try

bors added a commit to rust-lang-ci/rust that referenced this pull request Jul 29, 2024
…e-cell, r=<try>

[crater] Lint against &T to &mut T and &T to &UnsafeCell<T> transmutes

Needs a (check-only) crater run as per https://rust-lang.zulipchat.com/#narrow/stream/213817-t-lang/topic/Lint.20against.20.60.26.60-.3E.60.26UnsafeCell.60.20transmutes/near/454868964.

r? ghost
@bors
Copy link
Contributor

bors commented Jul 29, 2024

⌛ Trying commit 59c22c3 with merge be92fb7...

@rust-log-analyzer

This comment has been minimized.

@ChayimFriedman2 ChayimFriedman2 force-pushed the lint-transmute-unsafe-cell branch from 59c22c3 to cc72a92 Compare July 29, 2024 23:10
@ChayimFriedman2

This comment was marked as resolved.

@tgross35
Copy link
Contributor

@bors try

bors added a commit to rust-lang-ci/rust that referenced this pull request Jul 30, 2024
…e-cell, r=<try>

[crater] Lint against &T to &mut T and &T to &UnsafeCell<T> transmutes

Needs a (check-only) crater run as per https://rust-lang.zulipchat.com/#narrow/stream/213817-t-lang/topic/Lint.20against.20.60.26.60-.3E.60.26UnsafeCell.60.20transmutes/near/454868964.

r? ghost
@bors
Copy link
Contributor

bors commented Jul 30, 2024

⌛ Trying commit cc72a92 with merge 0914651...

@bors
Copy link
Contributor

bors commented Jul 30, 2024

☀️ Try build successful - checks-actions
Build commit: 0914651 (09146510cdb33d15474a35fde290187a88b1cf35)

@oli-obk
Copy link
Contributor

oli-obk commented Jul 30, 2024

@craterbot check

@craterbot
Copy link
Collaborator

👌 Experiment pr-128351 created and queued.
🤖 Automatically detected try build 0914651
🔍 You can check out the queue and this experiment's details.

ℹ️ Crater is a tool to run experiments across parts of the Rust ecosystem. Learn more

@craterbot craterbot added S-waiting-on-crater Status: Waiting on a crater run to be completed. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Jul 30, 2024
@craterbot
Copy link
Collaborator

🚧 Experiment pr-128351 is now running

ℹ️ Crater is a tool to run experiments across parts of the Rust ecosystem. Learn more

@rust-log-analyzer

This comment has been minimized.

@ChayimFriedman2 ChayimFriedman2 force-pushed the lint-transmute-unsafe-cell branch from 573ee3f to 2ea0578 Compare December 7, 2024 21:17
@nnethercote
Copy link
Contributor

@oli-obk: would you be willing to review? This is covering lots of territory I don't know well, and I imagine there are subtleties that I could easily overlook.

@oli-obk
Copy link
Contributor

oli-obk commented Dec 9, 2024

😭 that's a lot of new code to review

@oli-obk oli-obk self-assigned this Dec 9, 2024
Copy link
Contributor

@nnethercote nnethercote left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This feels like it's heading in the right direction.

My main observation is about terminology. To me:

  • "casting" is a specific thing, done via as or cast
  • "transmuting" is a different specific thing, done only via mem::transmute
  • "converting" is an imprecise term that possibly covers both "casting" and "transmuting".

Is that correct? It aligns with the Type Conversions chapter of the Rustonomicon. But this PR tends to use "cast" and "transmute" imprecisely and somewhat interchangeably.

  • I think mutable_transmutes only triggers on mem::transmute, but unsafe_cell_transmutes triggers on both mem::transmute and casting.
  • The crater result mentions unsafe_cell_reference_casting. Is that an old name for unsafe_cell_transmutes?
  • The error message only mention transmuting.
  • The lint descriptions (in doc comments) mention both transmuting and casting in a way that matches my understanding.
  • is_type_cast seems to cover both casting and transmuting.

I feel like this needs to be clarified, and the whole commit needs a once-over to ensure these terms are being used precisely and consistently. Should unsafe_cell_transmutes should be renamed unsafe_cell_conversions?


Beyond that, mutable_transmutes.rs has most of the new code. I reviewed it and it seems reasonable, but that kind of traversal is not my forte so it's possible I missed stuff. All the breadcrumb code is quite complicated, and I do I wonder if something like that already exists elsewhere in the compiler for some other kind of error message.

tests/ui/transmute/transmute-imut-to-mut.rs Show resolved Hide resolved
@@ -0,0 +1,163 @@
use std::cell::UnsafeCell;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The tests/ui/lint/mutable_transmutes directory name is slightly misleading because it contains tests that cover both mutable_transmutes and unsafe_cell_transmutes. I'm not sure how to fix it, though, and whether it's worth fixing.

#[derive(LintDiagnostic)]
#[diag(lint_builtin_mutable_transmutes)]
pub(crate) struct BuiltinMutablesTransmutes;
#[note]
pub(crate) struct BuiltinMutablesTransmutes<'a> {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why does this one have a Builtin prefix but UnsafeCellTransmutes does not?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure what the Builtin prefix means (some lints include it while some don't), so I just randomly chose not to include it for the new lint, but not change the existing one.

compiler/rustc_lint/src/lints.rs Outdated Show resolved Hide resolved
compiler/rustc_lint/src/mutable_transmutes.rs Outdated Show resolved Hide resolved
compiler/rustc_lint/src/mutable_transmutes.rs Show resolved Hide resolved
compiler/rustc_lint/src/mutable_transmutes.rs Outdated Show resolved Hide resolved
compiler/rustc_lint/src/mutable_transmutes.rs Outdated Show resolved Hide resolved
compiler/rustc_lint/src/mutable_transmutes.rs Show resolved Hide resolved
compiler/rustc_lint/src/mutable_transmutes.rs Outdated Show resolved Hide resolved
@nnethercote nnethercote added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Dec 10, 2024
@traviscross
Copy link
Contributor

My main observation is about terminology. To me:

  • "casting" is a specific thing, done via as or cast
  • "transmuting" is a different specific thing, done only via mem::transmute
  • "converting" is an imprecise term that possibly covers both "casting" and "transmuting".

Is that correct?

I'd take a somewhat more broad view of transmuting. Probably I think of that as essentially synonymous with reinterpreting some sequence of bytes as some other type without the presence of any automatic mechanism to reject invalid reinterpretations or to adjust the bytes as needed.

So whether we do that with transmute or with union or by casting pointers and then dereferencing them, it's still a transmutation.

@nnethercote
Copy link
Contributor

I'd take a somewhat more broad view of transmuting

I find it bizarre to say that transmuting covers anything other than mem::transmute (such as casts), and the Reference and the Rustonomicon appear to disagree with your definition. Having an umbrella term ("conversions") and then specific kinds of conversions (coercions, casts, transmute) is more precise and less likely to cause confusion.

But even if we accept your definition, this PR lacks internal consistency in its terminology. Sometimes it talks about casts, sometimes about transmuting, sometimes it mentions both. @ChayimFriedman2, can you talk about your understanding of these words and how you have used them?

@traviscross
Copy link
Contributor

traviscross commented Dec 11, 2024

But even if we accept your definition, this PR lacks internal consistency in its terminology...

Agreed definitely that this PR should be more consistent on this point -- I should have mentioned that. So +1 for catching that in review.

I find it bizarre to say that transmuting covers anything other than mem::transmute (such as casts), and the Reference and the Rustonomicon appear to disagree with your definition.

I don't know. I reviewed the Reference text, and I think my definition aligns with it. E.g., it says:

Unions have no notion of an "active field". Instead, every union access transmutes parts of the content of the union to the type of the accessed field. Since transmutes can cause unexpected or undefined behaviour, unsafe is
required to read from a union field.

That's directly in line with my remark that:

So whether we do that with transmute or with union or by casting pointers and then dereferencing them, it's still a transmutation.

Elsewhere, it says:

Despite pointers and references being similar to usizes in the machine code emitted on most platforms, the semantics of transmuting a reference or pointer type to a non-pointer type is currently undecided. Thus, it may not be valid to transmute a pointer or reference type, P, to a [u8; size_of::<P>()].

If we were to narrowly define transmute as "it's not a transmute unless you call mem::transmute", then this paragraph would clearly need to be broadened.

Similarly, in the Nomicon, at the bottom of the page about transmutes, it says:

Also of course you can get all of the functionality of these functions using raw pointer casts or unions, but without any of the lints or other basic sanity checks. Raw pointer casts and unions do not magically avoid the above rules.

If we were to say that it's only a "transmute" if you call mem::transmute, that would seem to render the word largely useless, as we'd nearly always need to qualify it with "or semantically equivalent operations using mem::transmute_copy, union, or pointer casts".

This adds the lint against `&T`->`&UnsafeCell<T>` transmutes, and also check in struct fields, and reference casts (`&*(&a as *const u8 as *const UnsafeCell<u8>)`).

The code is quite complex; I've tried my best to simplify and comment it.

This is missing one parts: array transmutes. When transmuting an array, this only consider the first element. The reason for that is that the code is already quite complex, and I didn't want to complicate it more.
This catches the most common pattern of transmuting an array into an array of the same length with type of the same size; more complex cases are likely not properly handled.
We could take a bigger sample, for example the first and last elements to increase the chance that the lint will catch mistakes, but then the runtime complexity becomes exponential with the nesting of the arrays (`[[[[[T; 2]; 2]; 2]; 2]; 2]` has complexity of O(2**5), for instance).
@ChayimFriedman2 ChayimFriedman2 force-pushed the lint-transmute-unsafe-cell branch from 2ea0578 to ff6a3db Compare December 18, 2024 00:44
@ChayimFriedman2
Copy link
Contributor Author

To me "transmute" is any conversion that changes structures but leaves the bits as-is, and "cast" is specific to as (similar to @traviscross). I think the PR used this terminology in this meaning correctly (with only "transmute being used to describe std::mem::transmute() and to describe casts we use both interchangeably), but I fixed it to use it as @nnethercote said.

@rust-log-analyzer
Copy link
Collaborator

The job x86_64-gnu-tools failed! Check out the build log: (web) (plain)

Click to see the possible cause of the failure (guessed by this bot)
tests/pass/float_nan.rs ... ok
tests/pass/0weak_memory_consistency.rs ... ok

FAILED TEST: tests/pass/atomic-readonly-load.rs
command: MIRI_ENV_VAR_TEST="0" MIRI_TEMP="/tmp/miri-uitest-tRwvH7" RUST_BACKTRACE="1" "/checkout/obj/build/x86_64-unknown-linux-gnu/stage1/bin/miri" "--error-format=json" "--sysroot=/checkout/obj/build/x86_64-unknown-linux-gnu/miri-sysroot" "-Dwarnings" "-Dunused" "-Ainternal_features" "-Zui-testing" "--out-dir" "/checkout/obj/build/x86_64-unknown-linux-gnu/stage1-tools/miri_ui/tests/pass" "tests/pass/atomic-readonly-load.rs" "-Zmiri-tree-borrows" "--edition" "2021"
error: test got exit status: 1, but expected 0
 = note: compilation failed, but was expected to succeed

error: actual output differed from expected
---
+   |
+   = note: `-D unknown-lints` implied by `-D warnings`
+   = help: to override `-D warnings` add `#[allow(unknown_lints)]`
+
+error: converting &T to &UnsafeCell<T> is error-prone, rarely intentional and may cause undefined behavior
+   |
+   |
+LL |     let x = &X as *const i32 as *const AtomicI32;
+   |             ------------------------------------ conversion happend here
+LL |     let x = unsafe { &*x };
+   |
+   = note: conversion from `i32` to `std::sync::atomic::AtomicI32.v`
+   = note: `#[deny(unsafe_cell_conversions)]` on by default
+
---



FAILED TEST: tests/pass/tree_borrows/transmute-unsafecell.rs
command: MIRI_ENV_VAR_TEST="0" MIRI_TEMP="/tmp/miri-uitest-tRwvH7" RUST_BACKTRACE="1" "/checkout/obj/build/x86_64-unknown-linux-gnu/stage1/bin/miri" "--error-format=json" "--sysroot=/checkout/obj/build/x86_64-unknown-linux-gnu/miri-sysroot" "-Dwarnings" "-Dunused" "-Ainternal_features" "-Zui-testing" "--out-dir" "/checkout/obj/build/x86_64-unknown-linux-gnu/stage1-tools/miri_ui/tests/pass/tree_borrows" "tests/pass/tree_borrows/transmute-unsafecell.rs" "-Zmiri-tree-borrows" "--edition" "2021"
error: test got exit status: 1, but expected 0
 = note: compilation failed, but was expected to succeed

error: actual output differed from expected
---
+   |
+   = note: `-D unknown-lints` implied by `-D warnings`
+   = help: to override `-D warnings` add `#[allow(unknown_lints)]`
+
+error: converting &T to &UnsafeCell<T> is error-prone, rarely intentional and may cause undefined behavior
+   |
+   |
+LL |     let cell_x: &UnsafeCell<i32> = mem::transmute(x);
+   |
+   |
+   = note: conversion from `*&i32` to `*&std::cell::UnsafeCell<i32>`
+
+error: miri cannot be run on programs that fail compilation
+
+error: aborting due to 3 previous errors
---

Location:
   /cargo/registry/src/index.crates.io-6f17d22bba15001f/ui_test-0.26.5/src/lib.rs:357

Backtrace omitted. Run with RUST_BACKTRACE=1 environment variable to display it.
Run with RUST_BACKTRACE=full to include source snippets.
error: test failed, to rerun pass `--test ui`
Caused by:
Caused by:
  process didn't exit successfully: `/checkout/obj/build/x86_64-unknown-linux-gnu/stage1-tools/x86_64-unknown-linux-gnu/release/deps/ui-9d31558353352b68 --quiet` (exit status: 1)
Command has failed. Rerun with -v to see more details.
  local time: Wed Dec 18 01:13:28 UTC 2024
  network time: Wed, 18 Dec 2024 01:13:28 GMT
##[error]Process completed with exit code 1.
Post job cleanup.

@traviscross
Copy link
Contributor

@rustbot labels +I-lang-nominated

The lang FCPed included approval of the name of the lint, so that can't be changed without lang assent. Nominating to discuss.

Also, overall, I feel like we've made this PR worse by switching everything to use the word "conversion" when that word is so much more broad.

@rustbot rustbot added the I-lang-nominated Nominated for discussion during a lang team meeting. label Dec 18, 2024
@traviscross traviscross added the S-waiting-on-team Status: Awaiting decision from the relevant subteam (see the T-<team> label). label Dec 18, 2024
@asquared31415
Copy link
Contributor

Personally I have never seen the word "transmute" used to mean "specifically the function mem::transmute" without having that exact qualification. It's not particularly meaningful to talk about the function mem::transmute specifically unless considering its specific compile time guarantees about size checks.

"transmuting" is a different specific thing, done only via mem::transmute

the mem::transmute docs themselves say:

Transmuting pointers to integers in a const context is undefined behavior, unless the pointer was originally created from an integer. (That includes this function specifically, integer-to-pointer casts, and helpers like dangling, but also semantically-equivalent conversions such as punning through repr(C) union fields.)

and then uses the word "transmuting" to mean the general concept of a bitwise equivalent value with a different type in future documentation (of which there is a lot)

additionally the very start of the docs says

transmute is semantically equivalent to a bitwise move of one type into another

(where this transmute is the function itself)

So I would find it odd that you have to use a different phrase to include all semantically equivalent options, even though the namesake of "transmuting" says its equivalent.

@RalfJung
Copy link
Member

RalfJung commented Dec 18, 2024

In @rust-lang/opsem we also often use "transmute" to refer to any form of type punning / byte-level reinterpretation of the underlying representation, be it via the transmute function, via union fields, or via raw ptr type casts. Given that we intend these all to be semantically equivalent, I think it's good to use this term consistently for all of them.

@joshtriplett
Copy link
Member

"It's only a transmute if it's from the transmute region of std; otherwise it's just sparkling unsafety."

In all seriousness: I think we should defer to the framing of @rust-lang/opsem here and let "transmute" refer to anything equivalent to a transmute.

@scottmcm
Copy link
Member

I agree; if it's equivalent to a transmute it's fine to call it a transmute. Especially now that https://doc.rust-lang.org/nightly/nightly-rustc/rustc_middle/mir/syntax/enum.CastKind.html#variant.Transmute exists in MIR as well, so post-optimization it's possible for things to end up as "literally transmute" even if nobody wrote transmute in the input source code.

@traviscross
Copy link
Contributor

@rustbot labels -I-lang-nominated

We discussed this in the lang call today. We agreed that a "transmute" represents the language-level concept, however expressed. We see the library function, mem::transmute, as being named after this language-level concept, not defining what it is.

For this PR to move forward, the name of the lint should be changed back to that which was FCPed, unsafe_cell_transmutes.

@rustbot rustbot removed the I-lang-nominated Nominated for discussion during a lang team meeting. label Dec 19, 2024
@traviscross traviscross removed the S-waiting-on-team Status: Awaiting decision from the relevant subteam (see the T-<team> label). label Dec 19, 2024
@apiraino
Copy link
Contributor

apiraino commented Dec 19, 2024

😭 that's a lot of new code to review

@ChayimFriedman2 I am looking at the diff. Is there a way to refactor this work to make it easier to review? Can this patch be split into smaller, easier to review bits? See our review guidelines. Thanks.

@nnethercote
Copy link
Contributor

nnethercote commented Dec 20, 2024

Much of the discussion can boil down the the following two examples.

    /// unsafe {
    ///     let x = 5_i32;
    ///     let y = std::mem::transmute::<&i32, &Cell<i32>>(&x);
    ///     y.set(6);
    ///
    ///     let z = &*(&x as *const i32 as *const Cell<i32>);
    ///     z.set(7);
    /// }

The first one is clearly a transmute. AIUI, everyone is saying the second one is also a transmute, albeit one that involves casts (of a pointer, and subsequent dereferencing of that pointer). I guess it makes sense, but I still would find an error message that describes the second one as transmuting a bit surprising.

I still think the is_type_cast function is misnamed, because it detects casts and transmutes.

At this point I will bow out of review duties because I'm about to go on vacation for five weeks. I'm definitely not a T-opsem person, as this whole discussion indicates. My earlier attempt to find an alternative reviewer was met with a crying emoji so I'll let somebody else decide who should review this. Apologies for all the confusion.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
disposition-merge This issue / PR is in PFCP or FCP with a disposition to merge it. finished-final-comment-period The final comment period is finished for this PR / Issue. S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. T-lang Relevant to the language team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging this pull request may close these issues.