Description
With dynamic PGO we generally expect that the schema the jit produces when instrumenting will exactly match the one it expects to see when optimizing. I happened across a method in the asp.net collection where it doesn't.
;; 98458,System.DateTime:get_Kind,"int32 get_Kind()", DEBUG_INFO FROZEN_ALLOC_ALLOWED SKIP_VERIFICATION BBOPT TIER1 HAS_PGO HAS_EDGE_PROFILE HAS_DYNAMIC_PROFILE, 0
*************** Inline @[000001] Starting PHASE Profile incorporation
Have Dynamic PGO: 3 schema records (schema at 000001B79CAA5458, data at 000001B79CA7C178)
Profile summary: 1 runs, 0 block probes, 3 edge probes, 0 class profiles, 0 method profiles, 0 other records
Reconstructing block counts from sparse edge instrumentation
... adding known edge BB04 -> BB07: weight 0
... adding known edge BB06 -> BB07: weight 0
... adding known edge BB07 -> BB01: weight 237472
New BlockSet epoch 1, # of blocks (including unused BB00): 8, bitset array size: 1 (short)
... unknown edge BB01 -> BB04
... unknown edge BB01 -> BB02
... unknown edge BB02 -> BB05
... unknown edge BB02 -> BB03
... unknown edge BB03 -> BB06
Did not expect tree edge BB06 -> BB07 to be present in the schema (key 00000020, 00000022)
... pseudo edge BB07 -> BB01
Schema is missing non-tree edge BB05 -> BB07, will presume zero
... known edge BB05 -> BB07
... known edge BB04 -> BB07
... not solving because of the mismatch
... discarding profile count data: PGO data available, but IL did not match
Computing inlinee profile scale:
... no callee profile data, will use non-pgo weight to scale
call site count 100 callee entry count 100 scale 1
Scaling inlinee blocks
Writing out flow graph after phase Profile incorporation
Here the jit is surprised to see that the schema edges (non-tree edges) don't match its own non-tree edges, and so it throws away all the profile data.
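To make the mismatch concrete, here is a minimal sketch of the comparison the dump is reporting. It is not the jit's code: `Edge`, `MatchResult`, `CompareSchema` and `CompareEdgesFromDump` are hypothetical names, and edges are keyed by block number for readability, whereas the real schema uses keys like the `(00000020, 00000022)` pair shown in the dump.

```cpp
#include <cassert>
#include <set>
#include <utility>
#include <vector>

// Hypothetical edge key: (source block number, target block number).
using Edge = std::pair<int, int>;

struct MatchResult
{
    std::vector<Edge> unexpected; // in the schema, but a tree edge for this compile
    std::vector<Edge> missing;    // non-tree edge for this compile, absent from the schema

    bool ok() const
    {
        return unexpected.empty() && missing.empty();
    }
};

// Compare the edges recorded in a schema against the non-tree edges
// the current compile derived from its own spanning tree.
MatchResult CompareSchema(const std::set<Edge>& schemaEdges, const std::set<Edge>& nonTreeEdges)
{
    MatchResult result;
    for (const Edge& e : schemaEdges)
    {
        if (nonTreeEdges.count(e) == 0)
        {
            result.unexpected.push_back(e);
        }
    }
    for (const Edge& e : nonTreeEdges)
    {
        if (schemaEdges.count(e) == 0)
        {
            result.missing.push_back(e);
        }
    }
    return result;
}

// The two edge sets as read off the TIER1 dump above.
MatchResult CompareEdgesFromDump()
{
    const std::set<Edge> schema  = {{4, 7}, {6, 7}, {7, 1}};
    const std::set<Edge> nonTree = {{4, 7}, {5, 7}, {7, 1}};
    return CompareSchema(schema, nonTree);
}
```

With the edges from the dump, the sketch flags BB06 -> BB07 as unexpected and BB05 -> BB07 as missing, which lines up with the two messages printed just before "not solving because of the mismatch".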
Digging back in the SPMI collection I found the compilation that produced that schema. It read in an existing static schema and then created a divergent dynamic schema:
;;; 96562,System.DateTime:get_Kind,"int32 get_Kind()", DEBUG_INFO FROZEN_ALLOC_ALLOWED SKIP_VERIFICATION BBINSTR BBOPT TIER1 HAS_PGO HAS_EDGE_PROFILE HAS_STATIC_PROFILE, 0
*************** Starting PHASE Profile incorporation
Have Static PGO: 3 schema records (schema at 000002C500F66578, data at 000002C500F3E360)
Profile summary: 1 runs, 0 block probes, 3 edge probes, 0 class profiles, 0 method profiles, 0 other records
Reconstructing block counts from sparse edge instrumentation
... adding known edge BB04 -> BB07: weight 8872
... adding known edge BB05 -> BB07: weight 8880
... adding known edge BB07 -> BB01: weight 17456
*************** Starting PHASE Profile instrumentation prep
Using edge profiling
EfficientEdgeCountInstrumentor: preparing for instrumentation
[0] New probe for BB07 -> BB01 [source]
[1] New probe for BB06 -> BB07 [source]
[2] New probe for BB04 -> BB07 [source]
7 blocks, 3 probes (0 on critical edges)
Writing out flow graph after phase Profile instrumentation prep
The dynamic schema takes priority and any further jitting of this method then fails to incorporate the data.
The issue seems to be that the spanning tree formation is impacted by the profile data that gets incorporated, in particular this bit of code:
runtime/src/coreclr/jit/fgprofile.cpp
Lines 1138 to 1141 in f924d6b
    if (block->isRunRarely() || !target->isRunRarely())
    {
        continue;
    }
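This check is reacting to the fact that the static profile has marked certain blocks as run rarely, so the spanning tree, and hence the set of instrumented non-tree edges, can come out differently than it does in a compile that hasn't seen that profile.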
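A toy model shows how a rare-block check can change which edges end up instrumented. This is only a sketch under stated assumptions, not the logic in fgprofile.cpp: `kSuccs` and `NonTreeEdges` are hypothetical names, the successor lists are read off the dumps above, the policy of pushing a warm block's rare successors first (so they are walked last) is a guess at the effect of the check, and treating BB03 and BB06 as the rarely run blocks is an assumption, since the dumps don't list block weights.

```cpp
#include <cassert>
#include <set>
#include <utility>
#include <vector>

using Edge = std::pair<int, int>;

// Successor lists for the get_Kind flow graph as it appears in the dumps
// (index = block number; BB07 -> BB01 stands in for the pseudo edge).
static const std::vector<std::vector<int>> kSuccs = {{}, {4, 2}, {5, 3}, {6}, {7}, {7}, {7}, {1}};

// Toy stack-based spanning tree walk starting at BB01. Returns the non-tree
// edges (the ones that would get probes) in the order they are found.
// `rare` holds the block numbers treated as rarely run.
std::vector<Edge> NonTreeEdges(const std::set<int>& rare)
{
    std::vector<bool> marked(kSuccs.size(), false);
    std::vector<int>  stack = {1};
    std::vector<Edge> probes;
    marked[1] = true;

    while (!stack.empty())
    {
        const int block = stack.back();
        stack.pop_back();
        assert(block > 0 && block < static_cast<int>(kSuccs.size()));

        // Hypothetical ordering policy: a warm block handles its rare
        // successors first, so they sit deeper in the stack and are walked last.
        const bool       blockRare = rare.count(block) != 0;
        std::vector<int> order;
        for (int pass = 0; pass < 2; pass++)
        {
            for (int target : kSuccs[block])
            {
                const bool deferred = !blockRare && (rare.count(target) != 0);
                if (deferred == (pass == 0))
                {
                    order.push_back(target);
                }
            }
        }

        for (int target : order)
        {
            if (marked[target])
            {
                probes.push_back({block, target}); // already reached: non-tree edge
            }
            else
            {
                marked[target] = true; // first time reached: tree edge
                stack.push_back(target);
            }
        }
    }
    return probes;
}
```

Calling `NonTreeEdges({})` yields BB07 -> BB01, BB05 -> BB07, BB04 -> BB07, the edges the optimizing compile expects, while `NonTreeEdges({3, 6})` yields BB07 -> BB01, BB06 -> BB07, BB04 -> BB07, the probes the instrumenting compile created. The same flow graph gives two different schemas purely because of the rare flags.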