Sinking load instructions results in worse performance and increased dynamic instruction counts

SimplifyCFG recently gained the change in https://github.com/llvm/llvm-project/commit/ede27d8d391e3917a5aa25be7903cabde4303a66 by @nikic.  It seems to backfire in some cases:

Compile the attached [bcmp.ll](https://github.com/user-attachments/files/16006666/bcmp.txt) (generated from `llvm-project/libc/src/string/bcmp.cpp`) like so:

```
$ clang -O3 -S bcmp.ll -o bcmp.s
```

Then I get:

Without the patch:

```
# %bb.26:
        movdqu  -16(%rdi,%rdx), %xmm0
        movdqu  -16(%rsi,%rdx), %xmm1
        jmp     .LBB1_34
:
:
:
# %bb.33:
        movdqu  (%rdi,%rdx), %xmm0
        movdqu  (%rsi,%rdx), %xmm1
.LBB1_34:                               # %.loopexit
```

With the patch:

```
# %bb.25:
        addq    %rdx, %rdi
        addq    $-16, %rdi
        addq    %rdx, %rsi
        addq    $-16, %rsi
        jmp     .LBB1_33
:
:
:
# %bb.32:
        addq    %rdx, %rdi
        addq    %rdx, %rsi
.LBB1_33:                               # %.loopexit.sink.split
        movdqu  (%rdi), %xmm0
        movdqu  (%rsi), %xmm1
```

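Notice that the two load instructions are sunk to just below the join point, while the address calculations are left behind in the predecessor blocks.  Without the patch, each address is folded into the load's addressing mode (e.g. `-16(%rdi,%rdx)`).  With the patch, the loads use plain `(%rdi)` and `(%rsi)`, so one predecessor needs four extra `addq` instructions and the other needs two to materialize the pointers first.  This seems to result in worse performance and increased dynamic instruction counts.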
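For readers who prefer source-level shapes, here is a minimal C++ sketch of the two forms. It is not the libc `bcmp` code: `Block16`, `load16`, `load_in_branches`, and `load_after_join` are hypothetical names made up for illustration, and a compiler is free to lower either function differently from the listings above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical 16-byte value; stands in for an %xmm register.
struct Block16 { unsigned char bytes[16]; };

// Hypothetical helper; stands in for an unaligned 16-byte load (movdqu).
static Block16 load16(const unsigned char *p) {
    Block16 b;
    std::memcpy(b.bytes, p, sizeof b.bytes);
    return b;
}

// Shape without the patch: each predecessor performs its own load, so
// the offset can be folded into the load's addressing mode.
Block16 load_in_branches(const unsigned char *p, std::size_t n, bool tail) {
    Block16 b;
    if (tail)
        b = load16(p + n - 16);
    else
        b = load16(p + n);
    return b;
}

// Shape with the patch: each predecessor only computes the address and
// a single shared load runs after the join point.
Block16 load_after_join(const unsigned char *p, std::size_t n, bool tail) {
    const unsigned char *addr;
    if (tail)
        addr = p + n - 16;
    else
        addr = p + n;
    return load16(addr);
}

// Both shapes must return the same bytes; only the placement of the
// load relative to the join point differs.
void check_equivalent(const unsigned char *p, std::size_t n, bool tail) {
    Block16 a = load_in_branches(p, n, tail);
    Block16 b = load_after_join(p, n, tail);
    assert(std::memcmp(a.bytes, b.bytes, sizeof a.bytes) == 0);
}
```

The two functions are semantically identical, which is why the sinking transform is legal. The concern in this report is only about cost: in the second shape the pointer arithmetic has to be materialized as separate instructions on every path into the join.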

