Skip to content

Missing caching for HRTB projection equality bounds (for<'x> T: Trait<'x, Assoc = ...>). #99188

Open

Description

Originally reduced from a production application using the tower crate - see @fasterthanlime's reduced repro repo for more background (though note that it exhibits several other failure modes as well)

trait Trait<'a> {
    type A;
    type B;
    type C;
    type D;

    fn method() {}
}

impl<T> Trait<'_> for &'_ T
where
    for<'x> T: Trait<'x, A = (), B = (), C = (), D = ()>,
{
    type A = ();
    type B = ();
    type C = ();
    type D = ();
}

impl Trait<'_> for () {
    type A = ();
    type B = ();
    type C = ();
    type D = ();
}

pub fn main() {
    #[cfg(depth = "7")]
    <&&&&&&&() as Trait>::method();
    #[cfg(depth = "8")]
    <&&&&&&&&() as Trait>::method();
    #[cfg(depth = "9")]
    <&&&&&&&&&() as Trait>::method();
    #[cfg(depth = "10")]
    <&&&&&&&&&&() as Trait>::method();
}

The above example currently takes an exponential amount of time to compile, based on the type depth:

$ curl -O https://gist.githubusercontent.com/eddyb/cd4221f14fff265280d135ddce5c9712/raw/a17cbf00af1894756d2b1bdd2e838cdd4db2bbe2/proj-exp.rs
$ command time -f 'took %Us' rustc proj-exp.rs --emit=metadata --cfg 'depth = "7"'
took 0.74s
$ command time -f 'took %Us' rustc proj-exp.rs --emit=metadata --cfg 'depth = "8"'
took 2.98s
$ command time -f 'took %Us' rustc proj-exp.rs --emit=metadata --cfg 'depth = "9"'
took 11.92s
$ command time -f 'took %Us' rustc proj-exp.rs --emit=metadata --cfg 'depth = "10"'
took 51.10s

With every extra type layer, the time increases by ~4x, and that aligns well with there being 4 associated types.

While this example is a bit silly, it doesn't take more than two associated types (both constrained at once) to cause the issue (although at higher depth or with a larger constant factor from having additional bounds).

And I'm guessing it might be one of the main causes for applications built with the tower crate to experience long compile times (its main trait, Service, has Response and Error as two associated types), in large part because that's the kind of codebase this was originally reduced from (as per the note at the top of this issue).


The reason I suspect caching is a combination of factors:

  • instrumenting evaluate_predicate_recursively found an exponential ramp in terms of duplicates (i.e. the number of times each unique obligation shows up), and many of them were ProjectionPredicates
    • this was done via -Z self-profile, though RUSTC_LOG might also work (and not require compiler changes)
  • many of those ProjectionPredicates' had a bound lifetime listed in their Binder
  • attempting to create a ProjectionCacheKey returns None iff there are bound variables
    • the comment is a bit worrying, given that it e.g. talks about "escaping regions" but checks for bounds ones
    • (as an aside, ProjectionCacheKey not holding a ParamEnv is probably risky long-term)
    • I've talked to @nikomatsakis and we're unsure if this is a historical artifact - my best guess is that it might have something to do with the fact that normalizing under binders didn't use to work (maybe a micro-opt to not even bother trying to cache those cases? were they that common?)
  • replacing the for<'x> T: Trait<'x, ... with T: Trait<'static, ... removes the exponential slowdown

So the next step was to to try always caching (warning: this is actually an unsound quick hack, ProjectionCacheKey should be modified to carry a Binder<ProjectionTy> instead of a ProjectionTy):

diff --git a/compiler/rustc_trait_selection/src/traits/project.rs b/compiler/rustc_trait_selection/src/traits/project.rs
index b3e7fbb3578..5d5f2f67842 100644
--- a/compiler/rustc_trait_selection/src/traits/project.rs
+++ b/compiler/rustc_trait_selection/src/traits/project.rs
@@ -2123,7 +2123,7 @@ fn from_poly_projection_predicate(
         let infcx = selcx.infcx();
         // We don't do cross-snapshot caching of obligations with escaping regions,
         // so there's no cache key to use
-        predicate.no_bound_vars().map(|predicate| {
+        Some(predicate.skip_binder()).map(|predicate| {
             ProjectionCacheKey::new(
                 // We don't attempt to match up with a specific type-variable state
                 // from a specific call to `opt_normalize_projection_type` - if

However, the above hack does not appear to help and we're unsure why - the main other kind of Predicate that evaluate_predicate_recursively appears to process is TraitPredicate, which IIUC is almost always locally cached?

(EDIT: turns out that the caching is more complex than initially assumed and requires further changes - see #99188 (comment) for a success)

Then again, there are other reductions that don't involve ProjectionPredicate and still cause their own exponential curves, so there might be several different caching issues.


I primarily wanted to open this issue on its own because there's definitely something weird with ProjectionCacheKey (even if a proper fix might be way more involved than just it), and there's a clear correlation between the number of associated types (being constrained) and the exponential curve.

cc @rust-lang/types

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Metadata

Assignees

No one assigned

    Labels

    A-associated-itemsArea: Associated items (types, constants & functions)A-higher-rankedArea: Higher-ranked things (e.g., lifetimes, types, trait bounds aka HRTBs)A-traitsArea: Trait systemI-compiletimeIssue: Problems and improvements with respect to compile times.T-compilerRelevant to the compiler team, which will review and decide on the PR/issue.T-typesRelevant to the types team, which will review and decide on the PR/issue.

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions