Closed
Description
I have been running libdivide's benchmark program a lot over the past few days and realized that libdivide's unswitch divider provides no speedup over the branchfull divider with either GCC or Clang (on x64). This is because GCC and Clang are smart enough to move the branches outside the body of the loop when using the default branchfull divider. Only with the MSVC compiler does the unswitch divider improve performance, by about 20%.
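To make the transformation concrete, here is a minimal sketch of what "moving the branches outside the body of the loop" means. It does not use libdivide's API: `toy_divider`, `toy_divider_gen`, `toy_divide` and the two `sum_quotients_*` functions are hypothetical names made up for illustration, and the toy divider only distinguishes power-of-two divisors from everything else. The first loop tests a loop-invariant flag on every iteration; the second is the hand-unswitched form, which is roughly what an optimizing compiler may produce from the first on its own.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical toy divider (NOT libdivide's API): it only records whether
   the divisor is a power of two, so division can be done with a shift. */
typedef struct {
    uint32_t divisor;
    uint32_t shift;   /* log2(divisor), meaningful only when is_pow2 is set */
    int is_pow2;
} toy_divider;

static toy_divider toy_divider_gen(uint32_t d) {
    assert(d != 0);
    toy_divider div = { d, 0, (d & (d - 1)) == 0 };
    while (div.is_pow2 && (1u << div.shift) < d) {
        div.shift++;
    }
    return div;
}

/* Branchfull-style division: the path is chosen on every call. */
static uint32_t toy_divide(uint32_t n, const toy_divider *div) {
    if (div->is_pow2) {
        return n >> div->shift;
    }
    return n / div->divisor;
}

/* Branch inside the loop body. The condition never changes between
   iterations, so a compiler is free to hoist it out of the loop. */
static uint64_t sum_quotients_branch_in_loop(const uint32_t *v, size_t len,
                                             const toy_divider *div) {
    uint64_t sum = 0;
    for (size_t i = 0; i < len; i++) {
        sum += toy_divide(v[i], div);
    }
    return sum;
}

/* Hand-unswitched version: the branch is tested once, and each path
   gets its own branch-free loop. */
static uint64_t sum_quotients_unswitched(const uint32_t *v, size_t len,
                                         const toy_divider *div) {
    uint64_t sum = 0;
    if (div->is_pow2) {
        for (size_t i = 0; i < len; i++) {
            sum += v[i] >> div->shift;
        }
    } else {
        for (size_t i = 0; i < len; i++) {
            sum += v[i] / div->divisor;
        }
    }
    return sum;
}
```

Both functions return the same result for any input. The only difference is where the branch sits, which is why a dedicated unswitch API adds little when the compiler already performs this step.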
There are many good reasons for deprecating unswitch:
- I don't think anybody uses it (I found no usages using Google).
- It has a crazy API (see here).
- Unswitch requires crazy hacks (e.g. the crash divider).
- Unswitch takes up a very large amount of code, which slows down compile time (e.g. 66% of the C vector API consists of functions related to unswitch).
- By removing unswitch we make it easier to port libdivide to other languages.
- We have added the branchfree divider, which is similar to unswitch.
Please let me know if you agree or if you have any objections.