Description
The source provided here, which also reproduces the previously filed #73593, takes an unexpectedly long time to compile and consumes substantial memory. Both grow exponentially as seemingly trivial complexity is added to the source. For example, uncommenting this line results in rustc being killed by the OOM killer on my system.
I apologize for the lack of an MCVE, but my attempts to investigate this have been stymied by my lack of understanding of the relevant portions of the compiler. This zulip thread contains the discussion of the investigation so far, but the concise overview is that through use of massif and perf I've narrowed the performance issues (both compile time and compile memory) down to the obligations Vec used here in rustc and, more specifically, the subsequent retain call for deduplication. Over half the CPU time for a full build of this crate is spent computing the Hash of ObligationCauseCode in those insert calls.
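For readers unfamiliar with the pattern, here is a minimal sketch of a retain-based deduplication over a Vec, where the retain closure inserts each element into a hash set. The types `CauseCode` and `Obligation` and the helpers `nested` and `dedup` are hypothetical stand-ins for illustration, not rustc's actual definitions. The point is only that every insert hashes the whole element, including any nested cause chain, so the cost scales with both the number of elements and the size of each one.

```rust
use std::collections::HashSet;
use std::rc::Rc;

// Hypothetical stand-in for a cause code that can nest arbitrarily deep.
#[derive(Clone, PartialEq, Eq, Hash, Debug)]
enum CauseCode {
    Misc,
    Derived(Rc<CauseCode>),
}

// Hypothetical stand-in for an obligation: a predicate id plus its cause.
#[derive(Clone, PartialEq, Eq, Hash, Debug)]
struct Obligation {
    predicate: u32,
    cause: CauseCode,
}

/// Builds a cause chain `depth` levels deep.
fn nested(depth: usize) -> CauseCode {
    let mut code = CauseCode::Misc;
    for _ in 0..depth {
        code = CauseCode::Derived(Rc::new(code));
    }
    code
}

/// Removes duplicates while preserving first-seen order.
/// `HashSet::insert` returns `true` only for values not yet in the set,
/// and each call hashes the full element, nested cause included.
fn dedup(obligations: &mut Vec<Obligation>) {
    let mut seen = HashSet::new();
    obligations.retain(|o| seen.insert(o.clone()));
}
```

Calling `dedup` on a Vec that holds two obligations with the same predicate and cause keeps only the first of them. With deep cause chains and many elements, hashing dominates the work, which is consistent with the profile described above, though whether rustc's cost has the same shape is an assumption of this sketch.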
Identifying this hot path as the problem is informative but insufficient to allow me to minimize the reproduction. The most granular causative information I have is from the output of -Zself-profile, which suggests that the time is spent in evaluate_obligation (as is clear from the above information as well) for queries such as:
Canonical { max_universe: U0, variables: [CanonicalVarInfo { kind: Region(U0) }], value: ParamEnvAnd { param_env: ParamEnv { caller_bounds: [], reveal: UserFacing, def_id: None }, value: Binder(TraitPredicate(<Item as __DERIVE_PROTOCOL_Id::__protocol::Unravel<protocol_mve_transport::Transport<std::boxed::Box<dyn CloneSpawn>, std::convert::Infallible, std::convert::Infallible, Item>>>)) } }
The unifying theme among the queries that consume the vast majority of the total CPU time is that they're all root-level evaluations of predicates on the Unravel trait. There doesn't appear to be a way to achieve further granularity with -Zself-profile, so I can't establish which resultant predicates are being produced that in turn make up the voluminous contents of that obligations Vec. I'm not sure if the issue here is a failure or missed opportunity in deduplication, an accumulation of a huge number of predicates before deduplication such that a single iteration consumes all this time (perf records using sampling, so I don't know whether many small calls or fewer, more time-consuming calls are being made), or something else entirely.
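One way to settle the "many small calls versus a few large ones" question that sampling leaves open is to count calls and input sizes directly. The following is only a sketch of that idea: `CallStats` and `measure` are hypothetical names, nothing like this exists in rustc as far as I know, and wiring it into the compiler would require a local patch around the deduplication step.

```rust
use std::time::{Duration, Instant};

/// Hypothetical accumulator for per-call statistics; not part of rustc.
#[derive(Debug, Default)]
struct CallStats {
    calls: u64,           // how many times the measured step ran
    total_items: u64,     // sum of input lengths over all calls
    max_items: usize,     // largest single input seen
    total_time: Duration, // wall-clock time spent inside the step
}

impl CallStats {
    /// Runs `f` on `items`, recording one call, the input length
    /// before the call, and the elapsed wall-clock time.
    fn measure<T, R>(
        &mut self,
        items: &mut Vec<T>,
        f: impl FnOnce(&mut Vec<T>) -> R,
    ) -> R {
        let len = items.len();
        let start = Instant::now();
        let result = f(items);
        self.total_time += start.elapsed();
        self.calls += 1;
        self.total_items += len as u64;
        self.max_items = self.max_items.max(len);
        result
    }
}
```

A high `calls` count with a small `max_items` would point to many cheap invocations, while a low `calls` count with a very large `max_items` would point to a few huge accumulations before deduplication.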
For profiling I'm using a locally built stage 1 rustc on x86_64-unknown-linux-gnu from commit 8aa18cbdc5d4bc33bd61e2d9a4b643d87f5d21de, though this issue occurs on current stable etc. as well.