Description
One of the easiest ways to make CI faster is to parallelize the work and simply use the hardware we already have available. Unfortunately, we don't have much data about how parallel our build actually is. Are there steps we think are parallel but actually aren't? Are we pegged to one core for long durations when there's other work we could be doing?
The general idea here is that we'd spin up a daemon at the very start of the build which would sample CPU utilization at a regular interval. This daemon would then update a file that's either displayed or uploaded at the end of the build.
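To make the idea concrete, here's a minimal sketch of what such a sampler could look like. It assumes a Linux builder, where aggregate CPU counters are available on the first line of `/proc/stat`. The function names, the CSV layout, the 5 second interval, and the `cpu-usage.csv` default are all made up for illustration and aren't an existing tool in the repo. Other platforms would need a different data source.

```python
import sys
import time
from datetime import datetime, timezone


def parse_cpu_line(line):
    """Return (idle, total) jiffies from the aggregate 'cpu' line of /proc/stat."""
    # Field order: user nice system idle iowait irq softirq steal guest guest_nice
    fields = [int(x) for x in line.split()[1:]]
    # Count iowait as idle time when the kernel reports it.
    idle = fields[3] + (fields[4] if len(fields) > 4 else 0)
    # guest and guest_nice are already included in user and nice, so sum
    # only the first eight fields to avoid double counting.
    total = sum(fields[:8])
    return idle, total


def busy_percent(prev, cur):
    """Percentage of non-idle CPU time between two (idle, total) samples."""
    idle_delta = cur[0] - prev[0]
    total_delta = cur[1] - prev[1]
    if total_delta <= 0:
        return 0.0
    return 100.0 * (total_delta - idle_delta) / total_delta


def read_sample(path="/proc/stat"):
    # The first line of /proc/stat aggregates all cores.
    with open(path) as f:
        return parse_cpu_line(f.readline())


def run_sampler(log_path, interval=5.0):
    """Append 'timestamp,busy%' rows to log_path until the process is killed."""
    prev = read_sample()
    with open(log_path, "a") as log:
        while True:
            time.sleep(interval)
            cur = read_sample()
            stamp = datetime.now(timezone.utc).isoformat(timespec="seconds")
            log.write(f"{stamp},{busy_percent(prev, cur):.1f}\n")
            log.flush()  # keep the file usable even if the build is cancelled
            prev = cur


# Only start sampling when a log path is given on the command line.
if __name__ == "__main__" and len(sys.argv) > 1:
    run_sampler(sys.argv[1])
```

The script would be started in the background at the top of the build, for example `python3 sampler.py cpu-usage.csv &` (with `sampler.py` being whatever the file ends up being called), and the CSV would be printed or uploaded at the end. On an N-core builder, a long stretch hovering around 100/N percent would hint that we're stuck on a single core. The per-core `cpu0`, `cpu1`, ... lines in `/proc/stat` could be parsed the same way if we want a finer breakdown.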
Hopefully we could then use these logs to get a better view into how the builders behave during the build, diagnose non-parallel portions of the build, and implement fixes to use all the CPUs we've got.
cc @rust-lang/infra