Description
Overview
This issue is about optimizing codegen scheduling. To profile codegen and see where and what we need to optimize, we need metrics for individual codegen units (CGUs). CGUs can be uniquely identified by name, so a mechanism for tracking individual CGUs already exists.
CGU Size
Tracking CGU size by itself should not be too hard: at some point during compilation we determine the CGU's size and map it to a CGU ID for later lookup. @Mark-Simulacrum mentioned that we should be able to tell whether a CGU is big on its own or because it absorbed other CGUs, and that this should be displayed on perf.r-l.o. My guess is that we can do this with a structure similar to what's already in place in SelfProfiler:
    pub struct SelfProfiler {
        profiler: Profiler,
        event_filter_mask: EventFilter,
        string_cache: RwLock<FxHashMap<String, StringId>>,
        query_event_kind: StringId,
        generic_activity_event_kind: StringId,
        incremental_load_result_event_kind: StringId,
        incremental_result_hashing_event_kind: StringId,
        query_blocked_event_kind: StringId,
        query_cache_hit_event_kind: StringId,
        /*** NEW ***/
        codegen_unit_merge_event_kind: StringId,
    }
However, as I understand it, this approach does not track which CGUs have been merged together; it only tells us when and where a merge occurred. Shouldn't we also know what got merged? Or does using a CodegenUnitId solve this?
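
To make the question concrete, here is a rough sketch of what recording "what got merged" could look like next to the size lookup. Everything in it is hypothetical: `CguTracker`, `CguStats`, `MergeRecord`, and their methods are names made up for illustration and are not part of rustc or SelfProfiler. CGUs are keyed by name, "when" is approximated by a merge counter, and sizes are plain integers in whatever unit the size metric ends up using.

```rust
use std::collections::HashMap;

/// Hypothetical record of one merge: `absorbed` was folded into `target`.
#[derive(Debug, Clone, PartialEq)]
struct MergeRecord {
    absorbed: String,
    target: String,
    /// Size of `absorbed` before it had absorbed anything itself.
    absorbed_size: usize,
    /// Position in the overall sequence of merges (a stand-in for "when").
    order: usize,
}

/// Hypothetical per-CGU statistics.
#[derive(Debug, Default)]
struct CguStats {
    /// Size when the CGU was first recorded, before any merging.
    original_size: usize,
    /// Every merge that contributed to this CGU, oldest first.
    merges: Vec<MergeRecord>,
}

impl CguStats {
    /// Current size: the original size plus everything absorbed.
    fn total_size(&self) -> usize {
        self.original_size + self.merges.iter().map(|m| m.absorbed_size).sum::<usize>()
    }

    /// True if part of this CGU's size comes from absorbed CGUs.
    fn grew_by_merging(&self) -> bool {
        !self.merges.is_empty()
    }
}

/// Hypothetical tracker mapping CGU names to their statistics.
#[derive(Debug, Default)]
struct CguTracker {
    stats: HashMap<String, CguStats>,
    merge_count: usize,
}

impl CguTracker {
    /// Records (or overwrites) the pre-merge size of a CGU.
    fn record_size(&mut self, name: &str, size: usize) {
        self.stats.entry(name.to_string()).or_default().original_size = size;
    }

    /// Folds `absorbed` into `target`, keeping the history of both.
    /// Returns false if either CGU is unknown or they are the same CGU.
    fn record_merge(&mut self, target: &str, absorbed: &str) -> bool {
        if target == absorbed || !self.stats.contains_key(target) {
            return false;
        }
        let gone = match self.stats.remove(absorbed) {
            Some(stats) => stats,
            None => return false,
        };
        let order = self.merge_count;
        self.merge_count += 1;

        let entry = self.stats.get_mut(target).expect("target presence checked above");
        // Carry over merges that `absorbed` had accumulated earlier.
        entry.merges.extend(gone.merges);
        entry.merges.push(MergeRecord {
            absorbed: absorbed.to_string(),
            target: target.to_string(),
            absorbed_size: gone.original_size,
            order,
        });
        true
    }
}
```

With something like this, a CGU whose `grew_by_merging()` is true can be reported with both its `original_size` and its `total_size()`, which is the distinction asked for above. Whether this belongs in the profiler's event stream or in a separate side table is an open question.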
Lastly, it has been mentioned that the current size metric is not accurate, and that the number of LLVM instructions might not provide enough insight either. Until I know more, I plan on using the CGU's instruction count as the size metric.
To Do
- Identify individual CGUs
- Change the size metric to reflect the instruction counts of CGUs
- Add support for tracking which CGUs have been merged, and where and when it happened