Description
With the advent of function pointers we will now see an increased number of indirect call cases that the JIT can optimize to direct calls. This requires a transformation similar to the one we do for devirtualization. There may be additional wrinkles: for example, if someone takes the address of an intrinsic method, we would want intrinsic recognition to kick in too. So in the importer this perhaps needs to sit just upstream from impImportCall.
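To make the intrinsic wrinkle concrete, here is a minimal sketch of the kind of source that would hit it. It assumes C# 9 or later with unsafe code enabled; the class and method names are purely illustrative. Math.Sqrt is used as the example because the JIT is expected to treat a direct call to it as an intrinsic, so ideally the call through the pointer would end up as cheap as the direct one.

```csharp
using System;

public static class IntrinsicAddressDemo
{
    // Takes the address of Math.Sqrt and calls it indirectly.
    // Today this is imported as an indirect call; the hope is that the
    // importer could see the known target and apply intrinsic recognition.
    public static unsafe double ViaPointer(double x)
    {
        delegate*<double, double> sqrt = &Math.Sqrt;
        return sqrt(x);
    }

    // Direct call, shown for comparison.
    public static double Direct(double x) => Math.Sqrt(x);

    public static void Main()
    {
        // Both paths compute the same value; only the generated code differs.
        Console.WriteLine(ViaPointer(16.0) == Direct(16.0));
    }
}
```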
Overall this should be structured in a similar way to devirtualization -- opportunistically transforming calls during importation to enable subsequent inlining, and then perhaps retrying later, after inlining, to at least remove the indirect call overhead. It would be nice to be able to do the transformation in the optimizer too, but that may prove challenging as we start to bake in many call details in morph.
In principle we could do something similar with locally created delegates but there are a number of missing pieces that prevent us from seeing through from delegate creation to invocation.
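For reference, the delegate analogue of this pattern might look like the sketch below (names are illustrative, not from any existing test). The delegate is created and invoked in the same method, so in principle the target is known, but the JIT would have to connect the creation to the invocation to exploit that.

```csharp
using System;

public static class LocalDelegateDemo
{
    static int F() => 33;

    public static int Run()
    {
        // Delegate created locally with a known target...
        Func<int> f = new Func<int>(F);
        // ...and invoked right away. Ideally this could become a direct
        // (and then inlined) call to F.
        return f() + 67;
    }

    public static void Main() => Console.WriteLine(Run());
}
```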
using System;

class X
{
    static int F() => 33;

    public unsafe static int Main()
    {
        delegate*<int> f = &X.F;
        return f() + 67;
    }
}
produces
; Assembly listing for method X:Main():int
; Emitting BLENDED_CODE for X64 CPU with AVX - Windows
; optimized code
; rsp based frame
; partially interruptible
; Final local variable assignments
;
; V00 OutArgs [V00 ] ( 1, 1 ) lclBlk (32) [rsp+0x00] "OutgoingArgSpace"
;* V01 tmp1 [V01 ] ( 0, 0 ) long -> zero-ref "impImportIndirectCall"
;
; Lcl frame size = 40
G_M46779_IG01: ;; offset=0000H
4883EC28 sub rsp, 40
;; bbWeight=1 PerfScore 0.25
G_M46779_IG02: ;; offset=0004H
48B830F3FDD6FC7F0000 mov rax, 0x7FFCD6FDF330
FFD0 call rax
83C043 add eax, 67
;; bbWeight=1 PerfScore 3.50
G_M46779_IG03: ;; offset=0013H
4883C428 add rsp, 40
C3 ret
;; bbWeight=1 PerfScore 1.25
Not sure of the priority of this just yet. I am looking at some apps that use calli fairly heavily and need to better understand how many of these calls can be optimized, so marking as future for now.
This optimization will also intersect with guarded devirtualization / PGO, provided we can profile indirect targets and see biased distributions.
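As a rough source-level analogy only (not a description of how the JIT would implement it), a guarded indirect call could be pictured as the hand-written check below. The "likely" target is hard-coded here as an assumption; in the scenario above it would come from profile data. All names are illustrative.

```csharp
using System;

public static class GuardedCallDemo
{
    static int F() => 33;
    static int G() => 1;

    // Hand-written analogue of a guarded indirect call: test the pointer
    // against an assumed likely target, call that target directly when the
    // guard passes, and fall back to the indirect call otherwise.
    public static unsafe int Invoke(delegate*<int> f)
    {
        delegate*<int> likely = &F; // assumed hot target (would come from a profile)
        if ((void*)f == (void*)likely)
        {
            return F() + 67;        // direct call, eligible for inlining
        }
        return f() + 67;            // fallback: ordinary indirect call
    }

    public static unsafe int Run() => Invoke(&F) + Invoke(&G);

    public static void Main() => Console.WriteLine(Run());
}
```

Both branches compute the same result, so the guard affects only which call path runs, not the program's behavior.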
category:cq
theme:devirtualization
skill-level:expert
cost:medium