The tail_expr_drop_order
lint today gives a lot of false positives #130836
Description
cc @traviscross
TL;DR
We have implemented a HIR-based lint against changes in drop order due to #![feature(shorter_tail_lifetimes)]
.
Through a crater run we learnt that this approach produces a lot of false positives. We need to account for control flow and track the use of places and partial moves. How can we achieve this?
Current status
The current approach to identify temporaries in tail expression whose drop order may change is to identify local variable declarations as well as (sub)expressions of the tail expression that are of types with significant drop implementation. The lint will then simply assume that all of them will observe a change in drop order and report as such.
For example, the following code is linted against.
#![deny(tail_expr_drop_order)]
#![feature(shorter_tail_lifetimes)]
struct LoudDropper;
impl Drop for LoudDropper {
fn drop(&mut self) {
// This destructor should be considered significant because it is a custom destructor
// and we will assume that the destructor can generate side effects arbitrarily so that
// a change in drop order is visible.
println!("loud drop");
}
}
impl LoudDropper {
fn get(&self) -> i32 {
0
}
}
fn should_lint() -> i32 {
let x = LoudDropper;
// ^ this has significant drop
x.get() + LoudDropper.get()
//~^ ERROR: this value of type `LoudDropper` has significant drop implementation that will have a different drop order from that of Edition 2021
//~| WARN: this changes meaning in Rust 2024
}
MIR shows that the temporary value _6
is indeed dropped later that _1
aka. x
.
MIR
fn should_lint() -> i32 {
let mut _0: i32;
let _1: LoudDropper;
let mut _2: i32;
let mut _3: &LoudDropper;
let mut _4: i32;
let mut _5: &LoudDropper;
let _6: LoudDropper;
let mut _7: (i32, bool);
scope 1 {
debug x => const LoudDropper;
}
bb0: {
_3 = &_1;
_2 = LoudDropper::get(move _3) -> [return: bb1, unwind: bb8];
}
bb1: {
_5 = &_6;
_4 = LoudDropper::get(move _5) -> [return: bb2, unwind: bb6];
}
bb2: {
_7 = AddWithOverflow(_2, _4);
assert(!move (_7.1: bool), "attempt to compute `{} + {}`, which would overflow", move _2, move _4) -> [success: bb3, unwind: bb6];
}
bb3: {
_0 = move (_7.0: i32);
drop(_1) -> [return: bb4, unwind: bb7];
}
bb4: {
drop(_6) -> [return: bb5, unwind continue];
}
bb5: {
return;
}
bb6 (cleanup): {
drop(_1) -> [return: bb7, unwind terminate(cleanup)];
}
bb7 (cleanup): {
drop(_6) -> [return: bb9, unwind terminate(cleanup)];
}
bb8 (cleanup): {
drop(_1) -> [return: bb9, unwind terminate(cleanup)];
}
bb9 (cleanup): {
resume;
}
}
Drawbacks
This lint is now too naive. For instance, the following example clearly will not have the drop order changed. The analysis does not consider the fact that temporary values can be moved and the drop on the locals or places are effectively no-ops.
fn should_not_lint_when_consumed() -> (LoudDropper, i32) {
let x = LoudDropper;
// Should not lint
(LoudDropper, x.get())
// ^~~~~~~~~~~ this is moved
}
This is a fairly common case. From the last crater run we learnt it from the fact that serde::Deserialize
macro keeps the partially deserialized data as locals and later moves them into the return value. The lint was triggered even though those values are not dropped at the function return point.
New attempts
Therefore, we need to improve the lint so that the analysis is more precise by factoring in the control flow and the kind of use of various MIR places, especially partial moves. Solving this issue perfectly requires one to inspect MIR instead, in conjunction with MaybeUninitializedPlaces
data flow. Identifying which statements in MIR is today accurate enough, but we can't say the same for drop
terminators. In the earliest attempts of #129864, span information in the terminators turned out to be pointing outside the tail expression span even though they properly fulfill the drop schedules for the tail expression. This gave us even more false positives and negatives.
As an example, we saw that the drop happened at the last closing bracket of this code from the MIR source_info
on the drop terminators.
fn should_lint() -> i32 {
let x = LoudDropper;
// v~~~~~~~~~~~~~~~~~~~~~~~~~~ the tail expression span is this
x.get() + LoudDropper.get()
} // <-- both `x` and `LoudDropper` are dropped here
// ... even though they belong to different scopes
// or they are dropped for different "reasons"
One possible way to address this incorrect information is to extend MIR syntax so that drop
terminators carry information on which exactly the HIR region scope the drops are scheduled for. With it we can precisely differentiate drops for the proper tail expression, the drops for the locals in the body and, in particular, Edition 2021 temporary values that dropped not as part of the tail expression drop schedule. Those are the targets this lint is exactly looking for.
Drawbacks
Due to copying and retention of this scope information in MIRs, there is now regression on compiler performance.
Questions
- Are there better ways to track the scope that a drop in a built MIR is scheduled for without so much overhead?
Activity