Description
I am trying to get rid of some pessimizations following first-class-structs design, cc @CarolEidt.
The main performance issue is struct returns that are wrapped as int or long types. In these cases even if we inline these calls we can't promote/enregister structs or their fields.
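To make the cost of retyping concrete, here is a minimal standalone C++ sketch. It is not JIT code: `Prompt`, `packAsLong`, `unpackFromLong` and `returnAsStruct` are hypothetical names, and the 8-byte layout (bool at offset 0, int at offset 4) is assumed from the field offsets shown in the dump for example 1. It only illustrates why an integer-typed return hides the fields from the consumer, while a struct-typed return keeps them visible.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical stand-in for an 8-byte struct: bool at offset 0, int at offset 4.
struct Prompt {
    bool         disableSuccess;
    std::int32_t oldMode;
};
static_assert(sizeof(Prompt) == 8, "sketch assumes an 8-byte layout");

// "Retyped" return: the struct travels as an opaque 64-bit integer.
// Each field is copied to its own offset so padding bytes stay zero.
std::int64_t packAsLong(const Prompt& p) {
    std::int64_t raw = 0;
    auto* bytes = reinterpret_cast<unsigned char*>(&raw);
    std::memcpy(bytes + offsetof(Prompt, disableSuccess), &p.disableSuccess,
                sizeof(p.disableSuccess));
    std::memcpy(bytes + offsetof(Prompt, oldMode), &p.oldMode, sizeof(p.oldMode));
    return raw;
}

// The consumer must store the whole integer over the struct's memory
// before it can look at either field.
Prompt unpackFromLong(std::int64_t raw) {
    Prompt p;
    std::memcpy(&p, &raw, sizeof(p));
    return p;
}

// Struct-typed return: the fields stay visible as separate values.
Prompt returnAsStruct(bool ok, std::int32_t mode) {
    return Prompt{ok, mode};
}

// Both paths carry the same data; only the typed view differs.
void demoRoundTrip() {
    Prompt viaLong = unpackFromLong(packAsLong(returnAsStruct(true, 7)));
    assert(viaLong.disableSuccess);
    assert(viaLong.oldMode == 7);
}
```

In the integer path every use of a field goes through a store to memory and a reload, which is the pattern that blocks promotion in the dumps that follow.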
To stop retyping we need to stop calling impFixupCallStructReturn and fgFixupStructReturn, like this. But then you will get errors in rationalize because it doesn't expect ASG struct trees.
I see two possible solutions:
- teach rationalize how to work with these types;
- start generating STORE_LCL/BLK/OBJ in importer, like in gtNewBlkOpNode in gentree.cpp.
The second option interferes with @mikedn's work on deleting ASG, so this issue is created to discuss how we can make progress without creating merge conflicts. @mikedn, how do you see the replacement of ASG nodes? Are you planning to delete them completely, including in the importation phase?
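The difference between the two tree shapes can be sketched with a toy IR. Everything here is hypothetical and heavily simplified: `Node`, `Oper`, `make`, `importOld`, `importNew` and `rewriteAsg` are illustrative names, not the JIT's GenTree API, and types and flags are ignored entirely. The old importer shape is `ASG(IND(ADDR(LCL_VAR)), value)`, which a later pass has to pattern-match and rewrite, while the second option builds the store node directly.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Hypothetical miniature IR; these types are illustrative, not the JIT's GenTree.
enum class Oper { LCL_VAR, ADDR, IND, ASG, RET_EXPR, STORE_LCL_VAR };

struct Node {
    Oper op = Oper::LCL_VAR;
    int lclNum = -1;            // meaningful for LCL_VAR / STORE_LCL_VAR only
    std::unique_ptr<Node> op1;  // first operand (destination for ASG)
    std::unique_ptr<Node> op2;  // second operand (source for ASG)
};

std::unique_ptr<Node> make(Oper op, int lclNum = -1,
                           std::unique_ptr<Node> a = nullptr,
                           std::unique_ptr<Node> b = nullptr) {
    auto n = std::make_unique<Node>();
    n->op = op;
    n->lclNum = lclNum;
    n->op1 = std::move(a);
    n->op2 = std::move(b);
    return n;
}

// Old shape: ASG(IND(ADDR(LCL_VAR Vn)), value).
std::unique_ptr<Node> importOld(int lclNum, std::unique_ptr<Node> value) {
    auto dst = make(Oper::IND, -1,
                    make(Oper::ADDR, -1, make(Oper::LCL_VAR, lclNum)));
    return make(Oper::ASG, -1, std::move(dst), std::move(value));
}

// New shape: STORE_LCL_VAR Vn(value), built directly at import time.
std::unique_ptr<Node> importNew(int lclNum, std::unique_ptr<Node> value) {
    return make(Oper::STORE_LCL_VAR, lclNum, std::move(value));
}

// Toy rewrite pass: folds the old shape into the store form.
// Trees that do not match the pattern are returned unchanged.
std::unique_ptr<Node> rewriteAsg(std::unique_ptr<Node> n) {
    if (n->op == Oper::ASG && n->op1 && n->op1->op == Oper::IND &&
        n->op1->op1 && n->op1->op1->op == Oper::ADDR &&
        n->op1->op1->op1 && n->op1->op1->op1->op == Oper::LCL_VAR) {
        int lcl = n->op1->op1->op1->lclNum;
        return make(Oper::STORE_LCL_VAR, lcl, std::move(n->op2));
    }
    return n;
}

// After the rewrite, both routes end at the same node kind for the same local.
void demoShapes() {
    auto viaOld = rewriteAsg(importOld(4, make(Oper::RET_EXPR)));
    auto viaNew = importNew(4, make(Oper::RET_EXPR));
    assert(viaOld->op == viaNew->op);
    assert(viaOld->lclNum == viaNew->lclNum);
}
```

In this sketch both routes converge on the same node; in the real dumps the old route ends in a `long`-typed store because of retyping, which the toy model does not capture.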
An example of how we keep the struct type with the second approach:
1. System.IO.FileSystem:FillAttributeInfo(System.String,byref,bool):int (MethodHash=333db0b0) (from VectorAdd_r.dll), for such IL:
[ 0] 10 (0x00a) call 0600018E
[ 1] 15 (0x00f) stloc.1
the old importer generates:
STMT00004 (IL ???... ???)
[000013] -AC--------- * ASG long
[000012] *----------- +--* IND long
[000011] ------------ | \--* ADDR byref
[000010] ------------ | \--* LCL_VAR struct<System.IO.DisableMediaInsertionPrompt, 8> V04 loc1
[000009] --C--------- \--* RET_EXPR long (inl return from call [000008])
then transforms it to:
N003 ( 18, 10) [000013] DACXG------- * STORE_LCL_FLD long V04 loc1 [+0]
* bool V04._disableSuccess (offs=0x00) -> V15 tmp8
* int V04._oldMode (offs=0x04) -> V16 tmp9
and then generates:
mov qword ptr [V04 rbp-290H], rax
the new importer:
[000012] -AC--------- * STORE_LCL_VAR struct<System.IO.DisableMediaInsertionPrompt, 8> V04 loc1
[000009] --C--------- \--* RET_EXPR struct(inl return from call [000008])
then:
N001 ( 14, 5) [000008] --CXG------- t8 = CALL struct System.IO.DisableMediaInsertionPrompt.Create
/--* t8 struct
N003 ( 18, 8) [000012] DA-XG------- * STORE_LCL_VAR struct<System.IO.DisableMediaInsertionPrompt,
then:
mov qword ptr [V04 rbp-29CH], rax
so we can keep the STRUCT type and, with changes in lowering, can enregister them.
Other examples.
2. from VectorAddTest`1[Single][System.Single]:VectorAdd(float,float,float):int (MethodHash=051d27ab) (from VectorAdd_r.dll), for such IL:
[ 0] 1 (0x001) ldloca.s 0
[ 1] 3 (0x003) ldarg.0
[ 2] 4 (0x004) call 0A00002B
the old importer generates:
STMT00002 (IL 0x001... ???)
[000007] -A--G------- * ASG simd32 (copy)
[000006] ----G--N---- +--* BLK simd32<32>
[000003] ------------ | \--* ADDR byref
[000002] ------------ | \--* LCL_VAR simd32<System.Numerics.Vector`1[Single]> V03 loc0
[000005] -------N---- \--* SIMD simd32 float init
[000004] ------------ \--* LCL_VAR float V00 arg0
then rationalize changes it to:
N001 ( 3, 4) [000004] ------------ t4 = LCL_VAR float V00 arg0
/--* t4 float
N002 ( 4, 5) [000005] -------N---- t5 = * SIMD simd32 float init
/--* t5 simd32
N004 ( 8, 8) [000007] DA--G------- * STORE_LCL_VAR simd32<System.Numerics.Vector`1[Single]> V03
and the final result is:
Generating: N030 ( 3, 4) [000004] ------------ t4 = LCL_VAR float V00 arg0 mm0 REG mm0
IN0005: vmovss xmm0, dword ptr [V00 rbp+10H]
/--* t4 float
Generating: N032 ( 4, 5) [000005] -------N---- t5 = * SIMD simd32 float init REG mm0
IN0006: vbroadcastss ymm0, ymm0
/--* t5 simd32
Generating: N034 ( 8, 8) [000007] DA--G------- * STORE_LCL_VAR simd32<System.Numerics.Vector`1[Single]> V03 loc0 NA REG NA
IN0007: vmovupd ymmword ptr[V03 rbp-30H], ymm0
Added IP mapping: 0x0009 STACK_EMPTY (G_M55380_IG04,ins#4,ofs#19)
the new importer generates:
[000006] ------------ * STORE_LCL_VAR simd32<System.Numerics.Vector`1[Single]> V03 loc0
[000005] -------N---- * SIMD simd32 float init
[000004] ------------ \--* LCL_VAR float V00 arg0
and doesn't do any transformations later.
In order to keep STORE_LCL_VAR, the DA flags need to be added.
- System.IO.FileSystem:FillAttributeInfo(System.String,byref,bool):int (MethodHash=333db0b0) (from VectorAdd_r.dll)
for such IL:
[ 0] 98 (0x062) ldloca.s 2
[ 1] 100 (0x064) initobj 0200000C
the old importer:
[000102] IA---------- * ASG struct (init)
[000099] D------N---- +--* LCL_VAR struct<WIN32_FIND_DATA, 592> V05 loc2
[000101] ------------ \--* CNS_INT int 0
then
N001 ( 1, 1) [000101] ------------ t101 = CNS_INT int 0
N002 ( 3, 2) [000099] D------N---- t99 = LCL_VAR_ADDR byref V05 loc2
/--* t99 byref
+--* t101 int
[000410] -A---------- * STORE_BLK struct<592> (init)
the new importer generates:
[000101] -A---------- * STORE_BLK struct<592> (init)
[000098] ------------ +--* LCL_VAR_ADDR byref V05 loc2
[000100] ------------ \--* CNS_INT int 0
- VectorMathTests.Program:Main(System.String[]):int (MethodHash=2c61fab2) (from AbsSqrt_r.dll)
IL:
[ 0] 1 (0x001) ldloca.s 0
[ 1] 3 (0x003) ldc.r4 11.000000000000000
[ 2] 8 (0x008) ldc.r4 13.000000000000000
[ 3] 13 (0x00d) ldc.r4 8.0000000000000000
[ 4] 18 (0x012) ldc.r4 4.0000000000000000
[ 5] 23 (0x017) call 0A000004
old import:
[000014] -A--G------- * ASG simd16 (copy)
[000013] ----G--N---- +--* BLK simd16<16>
[000003] ------------ | \--* ADDR byref
[000002] ------------ | \--* LCL_VAR simd16<System.Numerics.Vector4> V01 loc0
[000012] -------N---- \--* SIMD simd16 float initN
[000011] ------------ \--* LIST float
[000004] ------------ +--* CNS_DBL float 11.000000000000000
[000010] ------------ \--* LIST float
[000005] ------------ +--* CNS_DBL float 13.000000000000000
[000009] ------------ \--* LIST float
[000006] ------------ +--* CNS_DBL float 8.0000000000000000
[000008] ------------ \--* LIST float
[000007] ------------ \--* CNS_DBL float 4.0000000000000000
after rationalize:
N001 ( 3, 4) [000004] ------------ t4 = CNS_DBL float 11.000000000000000
N002 ( 3, 4) [000005] ------------ t5 = CNS_DBL float 13.000000000000000
N003 ( 3, 4) [000006] ------------ t6 = CNS_DBL float 8.0000000000000000
N004 ( 3, 4) [000007] ------------ t7 = CNS_DBL float 4.0000000000000000
/--* t4 float
+--* t5 float
+--* t6 float
+--* t7 float
N009 ( 16, 20) [000012] -------N---- t12 = * SIMD simd16 float initN
/--* t12 simd16
N011 ( 20, 23) [000014] DA--G------- * STORE_LCL_VAR simd16<System.Numerics.Vector4> V01 loc0
new import:
[000013] DA---------- * STORE_LCL_VAR simd16<System.Numerics.Vector4> V01 loc0
[000012] -------N---- * SIMD simd16 float initN
[000011] ------------ \--* LIST float
[000004] ------------ +--* CNS_DBL float 11.000000000000000
[000010] ------------ \--* LIST float
[000005] ------------ +--* CNS_DBL float 13.000000000000000
[000009] ------------ \--* LIST float
[000006] ------------ +--* CNS_DBL float 8.0000000000000000
[000008] ------------ \--* LIST float
[000007] ------------ \--* CNS_DBL float 4.0000000000000000
category:cq
theme:structs
skill-level:expert
cost:large