Skip to content

The tail_expr_drop_order lint today gives a lot of false positives #130836

Open
@dingxiangfei2009

Description

cc @traviscross

TL;DR

We have implemented a HIR-based lint against changes in drop order due to #![feature(shorter_tail_lifetimes)].
Through a crater run we learnt that this approach produces a lot of false positives. We need to account for control flow and track the use of places and partial moves. How can we achieve this?

Current status

The current approach to identify temporaries in tail expression whose drop order may change is to identify local variable declarations as well as (sub)expressions of the tail expression that are of types with significant drop implementation. The lint will then simply assume that all of them will observe a change in drop order and report as such.

For example, the following code is linted against.

#![deny(tail_expr_drop_order)]
#![feature(shorter_tail_lifetimes)]

struct LoudDropper;
impl Drop for LoudDropper {
    fn drop(&mut self) {
        // This destructor should be considered significant because it is a custom destructor
        // and we will assume that the destructor can generate side effects arbitrarily so that
        // a change in drop order is visible.
        println!("loud drop");
    }
}
impl LoudDropper {
    fn get(&self) -> i32 {
        0
    }
}

fn should_lint() -> i32 {
    let x = LoudDropper;
    //  ^ this has significant drop
    x.get() + LoudDropper.get()
    //~^ ERROR: this value of type `LoudDropper` has significant drop implementation that will have a different drop order from that of Edition 2021
    //~| WARN: this changes meaning in Rust 2024
}

MIR shows that the temporary value _6 is indeed dropped later that _1 aka. x.

MIR
fn should_lint() -> i32 {
    let mut _0: i32;
    let _1: LoudDropper;
    let mut _2: i32;
    let mut _3: &LoudDropper;
    let mut _4: i32;
    let mut _5: &LoudDropper;
    let _6: LoudDropper;
    let mut _7: (i32, bool);
    scope 1 {
        debug x => const LoudDropper;
    }

    bb0: {
        _3 = &_1;
        _2 = LoudDropper::get(move _3) -> [return: bb1, unwind: bb8];
    }

    bb1: {
        _5 = &_6;
        _4 = LoudDropper::get(move _5) -> [return: bb2, unwind: bb6];
    }

    bb2: {
        _7 = AddWithOverflow(_2, _4);
        assert(!move (_7.1: bool), "attempt to compute `{} + {}`, which would overflow", move _2, move _4) -> [success: bb3, unwind: bb6];
    }

    bb3: {
        _0 = move (_7.0: i32);
        drop(_1) -> [return: bb4, unwind: bb7];
    }

    bb4: {
        drop(_6) -> [return: bb5, unwind continue];
    }

    bb5: {
        return;
    }

    bb6 (cleanup): {
        drop(_1) -> [return: bb7, unwind terminate(cleanup)];
    }

    bb7 (cleanup): {
        drop(_6) -> [return: bb9, unwind terminate(cleanup)];
    }

    bb8 (cleanup): {
        drop(_1) -> [return: bb9, unwind terminate(cleanup)];
    }

    bb9 (cleanup): {
        resume;
    }
}

Drawbacks

This lint is now too naive. For instance, the following example clearly will not have the drop order changed. The analysis does not consider the fact that temporary values can be moved and the drop on the locals or places are effectively no-ops.

fn should_not_lint_when_consumed() -> (LoudDropper, i32) {
    let x = LoudDropper;
    // Should not lint
    (LoudDropper, x.get())
//   ^~~~~~~~~~~ this is moved
}

This is a fairly common case. From the last crater run we learnt it from the fact that serde::Deserialize macro keeps the partially deserialized data as locals and later moves them into the return value. The lint was triggered even though those values are not dropped at the function return point.

New attempts

Therefore, we need to improve the lint so that the analysis is more precise by factoring in the control flow and the kind of use of various MIR places, especially partial moves. Solving this issue perfectly requires one to inspect MIR instead, in conjunction with MaybeUninitializedPlaces data flow. Identifying which statements in MIR is today accurate enough, but we can't say the same for drop terminators. In the earliest attempts of #129864, span information in the terminators turned out to be pointing outside the tail expression span even though they properly fulfill the drop schedules for the tail expression. This gave us even more false positives and negatives.

As an example, we saw that the drop happened at the last closing bracket of this code from the MIR source_info on the drop terminators.

fn should_lint() -> i32 {
    let x = LoudDropper;
//  v~~~~~~~~~~~~~~~~~~~~~~~~~~ the tail expression span is this
    x.get() + LoudDropper.get()
} // <-- both `x` and `LoudDropper` are dropped here
// ... even though they belong to different scopes
// or they are dropped for different "reasons"

One possible way to address this incorrect information is to extend MIR syntax so that drop terminators carry information on which exactly the HIR region scope the drops are scheduled for. With it we can precisely differentiate drops for the proper tail expression, the drops for the locals in the body and, in particular, Edition 2021 temporary values that dropped not as part of the tail expression drop schedule. Those are the targets this lint is exactly looking for.

Drawbacks

Due to copying and retention of this scope information in MIRs, there is now regression on compiler performance.

Questions

  • Are there better ways to track the scope that a drop in a built MIR is scheduled for without so much overhead?

Activity

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Metadata

Assignees

No one assigned

    Labels

    C-discussionCategory: Discussion or questions that doesn't represent real issues.T-compilerRelevant to the compiler team, which will review and decide on the PR/issue.T-langRelevant to the language team, which will review and decide on the PR/issue.

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions