Skip to content

Binary search performance regressed in 1.25 #53823

Closed

Description

The speedup obtained by the optimizations from #45333 which hit stable in 1.23 were mostly lost again from 1.25 onwards, because the compiler started emitting a branch instead of a cmov.

1.24

14  cmp     qword ptr [rsi + 8*rax], rcx
15  cmova   rax, r8

1.25

18  cmp     qword ptr [rsi + 8*r9], rcx
19  ja      .LBB0_7
20  mov     r8, r9
21  .LBB0_7:

From godbolt: https://godbolt.org/z/DUDl-T

Bench comparison with 1.23 nightly (2017-11-20, 9 days after PR merge)

 name                        old_1.23 ns/iter  new_1.23 ns/iter  diff ns/iter   diff %  speedup 
 binary_search_l1            70                28                         -42  -60.00%   x 2.50 
 binary_search_l1_with_dups  46                28                         -18  -39.13%   x 1.64 
 binary_search_l2            94                42                         -52  -55.32%   x 2.24 
 binary_search_l2_with_dups  68                42                         -26  -38.24%   x 1.62 
 binary_search_l3            267               234                        -33  -12.36%   x 1.14 
 binary_search_l3_with_dups  198               236                         38   19.19%   x 0.84 

and status quo with 1.30 nightly (2018-08-15)

 name                        old_1.30 ns/iter  new_1.30 ns/iter  diff ns/iter   diff %  speedup 
 binary_search_l1            75                68                          -7   -9.33%   x 1.10 
 binary_search_l1_with_dups  46                61                          15   32.61%   x 0.75 
 binary_search_l2            99                89                         -10  -10.10%   x 1.11 
 binary_search_l2_with_dups  71                89                          18   25.35%   x 0.80 
 binary_search_l3            272               245                        -27   -9.93%   x 1.11 
 binary_search_l3_with_dups  212               256                         44   20.75%   x 0.83 
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Metadata

Assignees

No one assigned

    Labels

    A-LLVMArea: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues.I-slowIssue: Problems and improvements with respect to performance of generated code.P-lowLow priorityT-compilerRelevant to the compiler team, which will review and decide on the PR/issue.regression-from-stable-to-stablePerformance or correctness regression from one stable version to another.

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions