Insert checks for enum discriminants when debug assertions are enabled #141759
Some changes occurred to MIR optimizations
cc @rust-lang/wg-mir-opt

This PR changes MIR
cc @oli-obk, @RalfJung, @JakobDegen, @davidtwco, @vakaras

Some changes occurred in compiler/rustc_codegen_ssa

Some changes occurred in compiler/rustc_codegen_cranelift
cc @bjorn3

Some changes occurred to the CTFE machinery

rust-analyzer is developed in its own repository. If possible, consider making this change to rust-lang/rust-analyzer instead.
cc @rust-lang/rust-analyzer
Similar to the existing null-pointer and alignment checks, this checks for valid enum discriminants on creation of enums through unsafe transmutes. Essentially this sanitizes patterns like the following:

```rust
let val: MyEnum = unsafe { std::mem::transmute::<u32, MyEnum>(42) };
```

An extension of this check will be done in a follow-up that explicitly sanitizes for extern enum values that come into Rust from e.g. C/C++.

This check is similar to Miri's capabilities of checking for valid construction of enum values.

This PR is inspired by saethlin@'s PR rust-lang#104862. Thank you so much for keeping this code up and the detailed comments!

I also pair-programmed large parts of this together with vabr-g@.
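To make the checked condition concrete, here is a minimal sketch in safe Rust of the test that such a debug assertion conceptually performs before a value is reinterpreted as an enum. The enum definition, its variants, and the helper names `is_valid_discriminant` and `my_enum_from_u32` are hypothetical and chosen only for illustration; the PR emits its check as MIR, not as library code.

```rust
/// Hypothetical enum standing in for the `MyEnum` of the description.
#[repr(u32)]
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum MyEnum {
    A = 0,
    B = 1,
    C = 2,
}

/// Returns true if `raw` is the discriminant of some `MyEnum` variant.
/// This is the condition a discriminant check would assert on.
pub fn is_valid_discriminant(raw: u32) -> bool {
    matches!(raw, 0 | 1 | 2)
}

/// Safe counterpart of the transmute: invalid values yield `None`
/// instead of undefined behavior.
pub fn my_enum_from_u32(raw: u32) -> Option<MyEnum> {
    match raw {
        0 => Some(MyEnum::A),
        1 => Some(MyEnum::B),
        2 => Some(MyEnum::C),
        _ => None,
    }
}
```

With these definitions, `my_enum_from_u32(42)` returns `None`, which corresponds to the case where the inserted assertion would fire for the transmute of `42`.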
@bors2 try
@rust-timer queue
This patch is finally ready for review! Let's see what the perf impact is; I wouldn't expect it to be large, as this only emits checks for transmutes to enums.
Finished benchmarking commit (7488b2b): comparison URL.

Overall result: no relevant changes - no action needed

Benchmarking this pull request means it may be perf-sensitive, so we'll automatically label it not fit for rolling up. You can override this, but we strongly advise not to, due to possible changes in compiler perf.

@bors rollup=never

Instruction count: This benchmark run did not return any relevant results for this metric.

Max RSS (memory usage): Results (secondary 3.9%). A less reliable metric. May be of interest, but not used to determine the overall result above.

Cycles: Results (secondary -0.4%). A less reliable metric. May be of interest, but not used to determine the overall result above.

Binary size: This benchmark run did not return any relevant results for this metric.

Bootstrap: 754.321s -> 757.215s (0.38%)
/// In some cases the enum discriminant is stored in a tag that is represented by
/// primitive. This method returns the actual discriminant type and size for that
/// tag.
fn tag_type_and_size_for_primitive(&self, primitive: Primitive) -> (Ty<'tcx>, Size) {
nit: this feels like something that we've got to have somewhere already.
In a quick search I only found the codegen-side versions, though, like
rust/compiler/rustc_codegen_ssa/src/traits/type_.rs
Lines 55 to 64 in 1434630
fn type_from_integer(&self, i: Integer) -> Self::Type {
    use Integer::*;
    match i {
        I8 => self.type_i8(),
        I16 => self.type_i16(),
        I32 => self.type_i32(),
        I64 => self.type_i64(),
        I128 => self.type_i128(),
    }
}
Yeah, I felt the same when I was writing this code but also only found this in codegen. I can look again, but would it make any sense to have this as a member of `Primitive`?
Aha, here it is for `Primitive -> Ty`: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_middle/ty/layout/trait.PrimitiveExt.html
And for getting a `Size`, there's already https://doc.rust-lang.org/nightly/nightly-rustc/rustc_abi/enum.Primitive.html#method.size
source_op: Operand<'tcx>,
discr_ty: Ty<'tcx>,
discr_size: Size,
op_size: Size,
suggestion: maybe you can pass the `Ty` or `Primitive` or `Integer` or something instead of the `Size`? The `Primitive` at least is definitely available from the layout info.
(That might help avoid some `match`es.)
So I introduced a `TyAndSize` type that removes a lot of duplication. Other than that, I didn't really see a big benefit in passing down the `Primitive`, because in order to get anything useful out of it I'd need more matches... Are you fine with how the code looks now? :)
    });
}

// Branch based on the computed equality.
suggestion: If you want to branch based on a set of values, how about inserting a https://doc.rust-lang.org/nightly/nightly-rustc/rustc_middle/mir/enum.TerminatorKind.html#variant.SwitchInt instead? That can branch to the "check failed" block from the `otherwise`, and continue successfully from all the valid values.
Ohh good point! I would do this if we decide to stick with this comparison.
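As a rough source-level analogy for the two shapes under discussion, the sketch below contrasts an OR-ed chain of equality tests with a single multi-way branch that has a catch-all arm. The function names and the discriminant set `{0, 1, 5}` are hypothetical; an integer `match` like the second function is the kind of construct that typically lowers to a `SwitchInt` terminator, with the `_` arm playing the role of `otherwise`.

```rust
/// Chain of equality tests OR-ed together, mirroring the
/// `Eq` / `BitOr` statement pairs visible in the MIR diff.
pub fn valid_by_or_chain(value: u128, valid: &[u128]) -> bool {
    let mut ok = false;
    for &discr in valid {
        // One comparison and one OR per valid discriminant.
        ok = (value == discr) | ok;
    }
    ok
}

/// Single multi-way branch: every valid value continues successfully,
/// and the catch-all arm is the "check failed" path.
pub fn valid_by_switch(value: u128) -> bool {
    match value {
        0 | 1 | 5 => true,
        _ => false,
    }
}
```

Both functions agree on every input; the difference is that the first grows by two operations per variant, while the second is one branch with one target per value.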
    ))),
});

// Loop over the list of the discriminants and insert checks for equality.
Hmm, this can be a very large amount of additional MIR. I worry about things like the transmute from `usize` to `ptr::Alignment`, for example -- adding another, what, at least 128 MIR statements every time that happens?
Yeah, I was already thinking about this... my perception was that such enums are not super prevalent, but `ptr::Alignment` does look relevant.
One idea I've had was that most enums have a contiguous range of discriminants (`ptr::Alignment` is a bad example :/), so we could transform this list into a list of `WrappingRange`s and then compare against those. As I said, that wouldn't work for `ptr::Alignment` though...
We could also think about excluding everything with more than e.g. 10 variants and have an option to opt in to all checks?
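The range idea can be sketched in plain Rust as follows. This is only an illustration under stated assumptions: `to_contiguous_ranges` is a hypothetical helper, it uses `std::ops::RangeInclusive` rather than the compiler's `WrappingRange`, and it does not model ranges that wrap around the end of the integer type.

```rust
use std::ops::RangeInclusive;

/// Collapses a list of discriminant values into maximal contiguous ranges.
/// Hypothetical helper, not part of the compiler.
pub fn to_contiguous_ranges(discrs: &[u128]) -> Vec<RangeInclusive<u128>> {
    let mut sorted = discrs.to_vec();
    sorted.sort_unstable();
    sorted.dedup();

    let mut ranges: Vec<RangeInclusive<u128>> = Vec::new();
    for d in sorted {
        match ranges.last_mut() {
            // Extend the current range when `d` directly follows its end.
            Some(last) if last.end().checked_add(1) == Some(d) => {
                *last = *last.start()..=d;
            }
            // Otherwise start a new single-element range.
            _ => ranges.push(d..=d),
        }
    }
    ranges
}
```

For an enum with discriminants `0, 1, 2, 3` this yields the single range `0..=3`, so one range test replaces four equality tests. For power-of-two discriminants such as `1, 2, 4, 8, 16` it yields four ranges, which illustrates why an enum shaped like `ptr::Alignment` gains almost nothing from this transformation.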
> every time that happens?

I'm not sure it happens very often. Such transmutes are usually done in a helper function, so they should not be inserted very many times in a debug build.
And the perf results suggest that it does not happen very often. At least currently.
use rustc_session::Session;
use tracing::debug;

/// This pass inserts checks at places where enums are constructed and checks
Which places? (Also you should probably say "locations" here, since that is closer to the names of the types.)
If you don't want to be too specific about where checks are inserted, maybe explain why the pass currently inserts the checks it does. "We insert checks where they are most likely to find UB, because checking everywhere like Miri would generate too much MIR". Or something like that.
// An empty enum that tries to be constructed from an inhabited value, this
// is never correct.
Variants::Empty => {
    // The enum layout is uninhabited but we construct it from sth inhabited.
Isn't this case already detected statically?
fn main() {
    // CHECK-LABEL: fn main(
    // CHECK assert(copy .*, "trying to construct an enum from an invalid value {}", .*)
Why are there mir-opt tests in here? I can't figure out what they are testing for that the UI tests are not.
Also, I think this CHECK is just checking for any enum check after the start of `main`. Writing the FileCheck annotations well seems like a lot of work here. So they should be testing for something that's worth it.
I suggested them because I wanted to read the MIR in the .diff instead of puzzling it out from the code.
Turning off file-check for them would be completely fine, though, if the `CHECK`s aren't that useful.
If the tests stay in, they need a clear comment in the test that explains what the expected diff is, so that subsequent contributors and reviewers are able to discern if the test "passed" or not when the diff changes.
+ _7 = Eq(copy _4, const 1_u128);
+ _5 = BitOr(copy _7, copy _5);
+ _8 = copy _4 as usize (IntToInt);
+ assert(copy _5, "trying to construct an enum from an invalid value {}", copy _8) -> [success: bb1, unwind unreachable];
Pondering: why is a `usize` passed to this `assert`? What happens if you use `repr(i128)` on the enum and make a variant with `i128::MIN` as the discriminant?
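To illustrate the concern at the language level, the sketch below shows what a narrowing integer cast to `usize` does to such a value. It assumes the reported value goes through a plain integer-to-integer cast, as the `IntToInt` cast in the diff suggests; `discriminant_as_usize` is a hypothetical helper and not code from the PR.

```rust
/// Reinterprets a signed 128-bit discriminant as its raw bits and then
/// narrows it to `usize`, keeping only the low bits.
pub fn discriminant_as_usize(discr: i128) -> usize {
    (discr as u128) as usize
}
```

`i128::MIN` has only its top bit set, so on targets where `usize` is narrower than 128 bits the narrowed value is `0`, and a message built from it would not identify the offending discriminant. Small non-negative values such as `1` survive the cast unchanged.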
r? @saethlin