Closed
Description
Evaluate how feasible it would be to combine the loads of consecutive fields into a single ldp instead of loading each field separately at its point of use. This cannot be done as a peephole optimization: an analysis is needed in an earlier phase to look for consecutive field loads and, where it finds them, combine them into a single load.
class Body { public double x, y, z, vx, vy, vz, mass; }
...
foreach (var b in bodies) {
    b.x += dt * b.vx; b.y += dt * b.vy; b.z += dt * b.vz;
}
The code below is generated for the loop that performs the double multiplication.
#1 can be combined into ldp dX, dY, [x3, #8]
#2 can be combined into ldp dX, dY, [x3, #32]
#3 can be combined into stp dX, dY, [x3, #8]
G_M56457_IG05: ;; offset=0114H
D37D7C43 ubfiz x3, x2, #3, #32
91004063 add x3, x3, #16
F8636803 ldr x3, [x0, x3]
FD400470 ldr d16, [x3,#8] ; <-- #1
FD401071 ldr d17, [x3,#32] ; <-- #2
1E710811 fmul d17, d0, d17
1E712A10 fadd d16, d16, d17
FD000470 str d16, [x3,#8] ; <-- #3
FD400870 ldr d16, [x3,#16] ; <-- #1
FD401471 ldr d17, [x3,#40] ; <-- #2
1E710811 fmul d17, d0, d17
1E712A10 fadd d16, d16, d17
FD000870 str d16, [x3,#16] ; <-- #3
FD400C70 ldr d16, [x3,#24]
FD401871 ldr d17, [x3,#48]
1E710811 fmul d17, d0, d17
1E712A10 fadd d16, d16, d17
FD000C70 str d16, [x3,#24]
11000442 add w2, w2, #1
6B02003F cmp w1, w2
54FFFD8C bgt G_M56457_IG05
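As a rough illustration of the analysis described above, the following Python sketch scans the memory accesses of the loop body and reports same-base accesses at adjacent offsets as ldp/stp candidates. It is not JIT code: MemOp, ldp_offset_encodable, find_pairs and LOOP_BODY are hypothetical names introduced here, and the sketch deliberately ignores dependency and aliasing checks as well as register allocation (in the listing both loads of a pair target d16, so a real transformation would also need two distinct destination registers). The offset check assumes the signed-offset form of LDP/STP, whose immediate is a 7-bit signed value scaled by the access size.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class MemOp:
    kind: str      # "ldr" or "str"
    base: str      # base register, e.g. "x3"
    offset: int    # byte offset from the base register
    size: int = 8  # access size in bytes (8 for a double)


def ldp_offset_encodable(offset: int, size: int = 8) -> bool:
    """Signed-offset LDP/STP: imm7 scaled by the access size."""
    return offset % size == 0 and -64 * size <= offset <= 63 * size


def find_pairs(ops: List[MemOp]) -> List[Tuple[MemOp, MemOp]]:
    """Greedily pair same-kind, same-base accesses at adjacent offsets.

    Assumption: reordering the two accesses next to each other is legal.
    A real pass would have to prove that (no intervening conflicting
    store, no side effects in between).
    """
    pairs = []
    used = set()
    for i, first in enumerate(ops):
        if i in used:
            continue
        for j in range(i + 1, len(ops)):
            second = ops[j]
            if j in used:
                continue
            if (first.kind == second.kind
                    and first.base == second.base
                    and first.size == second.size
                    and second.offset == first.offset + first.size
                    and ldp_offset_encodable(first.offset, first.size)):
                pairs.append((first, second))
                used.update((i, j))
                break
    return pairs


# Memory accesses of G_M56457_IG05 in program order (base x3 only).
LOOP_BODY = [
    MemOp("ldr", "x3", 8), MemOp("ldr", "x3", 32), MemOp("str", "x3", 8),
    MemOp("ldr", "x3", 16), MemOp("ldr", "x3", 40), MemOp("str", "x3", 16),
    MemOp("ldr", "x3", 24), MemOp("ldr", "x3", 48), MemOp("str", "x3", 24),
]

if __name__ == "__main__":
    for first, second in find_pairs(LOOP_BODY):
        pair_op = "ldp" if first.kind == "ldr" else "stp"
        print(f"{pair_op} dX, dY, [{first.base}, #{first.offset}]")
```

On this input the sketch should report the three candidates marked #1, #2 and #3 in the listing. The accesses at offsets 24 and 48 are left unpaired because no neighbouring access at offset 32 or 56 remains for them to pair with.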
Reference: https://godbolt.org/z/9jY5hYnoa
category:implementation
theme:codegen
skill-level:intermediate
cost:medium
impact:medium