Description
The jit's emitter works by combining instructions into instruction groups (hereafter IGs). The group boundaries can be formed naturally by labels or induced by the emitter when an internal buffer fills up.
The buffer capacity varies between CHK and REL builds, and the size of the buffer entries varies depending on the instructions being emitted. So the location of the induced IG breaks is not consistent between CHK and REL. This difference in behavior can lead to differences in codegen between CHK and REL.
In the one case I've seen, the differences come about as follows. During IG formation, the jit estimates instruction sizes and hence the size and offset of each IG. When actually emitting the code, the jit may discover that the instructions in an IG are smaller than predicted. This leads to cascading updates of the offsets of subsequent IGs. If a single IG contains first a mis-estimated instruction M and then an instruction L that refers to an instruction in a subsequent IG, the offset computation for L will not take into account the correction from the mis-estimated M. If instead M and L are in different groups, L's offset will have the correction.
Now because CHK and REL form IGs differently, it is possible for M and L to be together in one IG in REL but not in CHK, hence the CHK offset and REL offset differ.
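To make the mechanism concrete, here is a toy model of the behavior described above. It is only a sketch, not the emitter's actual code: the `Instr` type, the `emit` function, the instruction names, and the sizes are all made up for illustration, and the only thing it assumes is what the description says, namely that a forward reference picks up corrections from earlier IGs but not from earlier instructions in its own IG.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Instr:
    name: str
    est_size: int                  # size predicted during IG formation
    actual_size: int               # size discovered when the bytes are emitted
    target: Optional[str] = None   # name of the instruction that starts the referenced IG


def emit(groups):
    """Return (computed, true): the target offset each referencing instruction
    computes in this toy model, and the true final offset of that target."""
    # Estimated and true start offset of each IG, keyed by its first instruction.
    est_start, true_start = {}, {}
    est = act = 0
    for group in groups:
        est_start[group[0].name] = est
        true_start[group[0].name] = act
        est += sum(i.est_size for i in group)
        act += sum(i.actual_size for i in group)

    computed, true = {}, {}
    adjustment = 0  # shrinkage from IGs that have been completely emitted
    for group in groups:
        shrink_in_group = 0
        for ins in group:
            if ins.target is not None:
                # Only corrections from earlier IGs are applied here; shrinkage
                # found earlier in the same IG is deliberately left out.
                computed[ins.name] = est_start[ins.target] - adjustment
                true[ins.name] = true_start[ins.target]
            shrink_in_group += ins.est_size - ins.actual_size
        adjustment += shrink_in_group
    return computed, true


# Hypothetical instruction stream: M is over-estimated by one byte,
# L refers forward to T, which starts a later IG.
m = Instr("M", est_size=7, actual_size=6)
l = Instr("L", est_size=7, actual_size=7, target="T")
x = Instr("X", est_size=4, actual_size=4)
t = Instr("T", est_size=4, actual_size=4)

together = emit([[m, l, x], [t]])   # M and L share an IG
apart = emit([[m], [l, x], [t]])    # an induced break separates M and L

print("M and L in one IG:      ", together)
print("M and L in separate IGs:", apart)
```

With the same instructions and the same mis-estimate, the offset computed for L is off by one byte when M and L share an IG and correct when an induced break separates them, so where the break falls decides the result.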
I don't have a repro in CoreCLR for this yet, but here's an example bit of disassembly from a desktop test that shows the difference:
;;; CHK
000001A13F620B52 48 8D 05 17 00 00 00 lea rax,[000001A13F620B70h] ;; offset B70
000001A13F620B59 48 89 45 40 mov qword ptr [rbp+40h],rax
000001A13F620B5D 48 8B 45 68 mov rax,qword ptr [rbp+68h]
000001A13F620B61 C6 40 0C 00 mov byte ptr [rax+0Ch],0
000001A13F620B65 48 8B 85 10 01 00 00 mov rax,qword ptr [rbp+0000000000000110h]
000001A13F620B6C FF D0 call rax
000001A13F620B6E 48 8B 55 68 mov rdx,qword ptr [rbp+68h]
;;; REL
000001A13F590B52 48 8D 05 16 00 00 00 lea rax,[000001A13F590B6Fh] ;; offset B6F
000001A13F590B59 48 89 45 40 mov qword ptr [rbp+40h],rax
000001A13F590B5D 48 8B 45 68 mov rax,qword ptr [rbp+68h]
000001A13F590B61 C6 40 0C 00 mov byte ptr [rax+0Ch],0
000001A13F590B65 48 8B 85 10 01 00 00 mov rax,qword ptr [rbp+0000000000000110h]
000001A13F590B6C FF D0 call rax
000001A13F590B6E 48 8B 55 68 mov rdx,qword ptr [rbp+68h]
This code sequence comes from an inline pinvoke; the LEA is supposed to be storing the return address from the call. CHK and REL are inconsistent here.
It turns out both are actually wrong, and the offset should be B6E, but that is a separate issue (#13398 #8747).
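The offsets can be checked by hand from the listing. The sketch below just redoes that arithmetic, using the standard x64 rule that a RIP-relative `lea` resolves to the address of the next instruction plus the 32-bit displacement; the helper name `lea_target` is made up, and the addresses, lengths, and displacements are read off the encoded bytes above.

```python
def lea_target(instr_addr, instr_len, disp32):
    # RIP-relative addressing: target = address of the next instruction + disp32
    return instr_addr + instr_len + disp32


LEA_LEN = 7    # 48 8D 05 followed by a 4-byte displacement
CALL_LEN = 2   # FF D0

# Address the LEA actually produces in each build.
chk = lea_target(0x000001A13F620B52, LEA_LEN, 0x17)
rel = lea_target(0x000001A13F590B52, LEA_LEN, 0x16)

# Return address of the call: the address right after the 2-byte call.
chk_ret = 0x000001A13F620B6C + CALL_LEN
rel_ret = 0x000001A13F590B6C + CALL_LEN

print(hex(chk & 0xFFF), hex(chk_ret & 0xFFF), chk - chk_ret)
print(hex(rel & 0xFFF), hex(rel_ret & 0xFFF), rel - rel_ret)
```

So CHK points 2 bytes past the return address and REL points 1 byte past it; a displacement of 15h would land on B6E in both builds.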
Since it is unlikely we can ever avoid mis-estimating sizes, the underlying algorithm to cope with mis-estimates likely needs to be rethought.
cc @dotnet/jit-contrib
category:correctness
theme:jit-coding-style
skill-level:intermediate
cost:medium