[Microbenchmarks] Add benchmarks for non-speculatable conditional scalar assignment autovec #331
base: main
This adds more conditional scalar assignment benchmarks for cases where the assignment can't be speculated (resulting in an extra block + phis rather than selects in the IR).
This patch also reworks the macros to make adding new test cases easier and less repetitive.
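For readers unfamiliar with the distinction, here is a minimal sketch of the two loop shapes. The function names and the choice of a conditional load from a second array `B` are assumptions for illustration, not taken from the patch; whether a given compiler emits a select or a separate block with phis depends on the optimizer and target.

```cpp
#include <cstddef>

// Speculatable form: A[i] is already loaded for the comparison, so the
// assigned value is available on every iteration and the update can
// typically be expressed as a select.
int csa_speculatable(const int *A, std::size_t N, int Threshold) {
  int Result = -1;
  for (std::size_t i = 0; i < N; i++)
    if (A[i] > Threshold)
      Result = A[i];
  return Result;
}

// Non-speculatable form (illustrative assumption): B[i] is only read when
// the condition holds. The compiler generally cannot prove the load is safe
// to hoist, so the assignment stays in its own block and Result is merged
// with a phi.
int csa_non_speculatable(const int *A, const int *B, std::size_t N,
                         int Threshold) {
  int Result = -1;
  for (std::size_t i = 0; i < N; i++)
    if (A[i] > Threshold)
      Result = B[i];
  return Result;
}
```

Both functions return the value assigned on the last iteration whose condition held, or -1 if no iteration matched.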
huntergr-arm
left a comment
LGTM, thanks
_Pragma("clang loop vectorize(disable) interleave(disable)") \
\
for (unsigned i = 0; i < ITERATIONS; i++) body; \
nit:
- _Pragma("clang loop vectorize(disable) interleave(disable)") \
- \
- for (unsigned i = 0; i < ITERATIONS; i++) body; \
+ _Pragma("clang loop vectorize(disable) interleave(disable)") \
+ for (unsigned i = 0; i < ITERATIONS; i++) body; \
I added that newline as clang-format wants to do:
_Pragma( \
"clang loop vectorize(disable) interleave(disable)") for (unsigned i = \
0; \
i < \
ITERATIONS; \
i++) body; \
without it.
DEF_SINGLE_CSA_LOOP(single_csa_only, LOOP_BLOCK({
  if (A[i] > Threshold)
    Result = A[i];
}));
Would it add much more code if we pass the whole loop to the macro instead? That would make it easier to see what is actually benchmarked.
Not really, but it's harder to pass the whole loop to a macro in a way that won't be badly formatted (so it needs clang-format off/on), since you can't wrap the for {} in a block (the pragma must be placed directly before the for statement).
That's why I added LOOP_BLOCK() as a comment.
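As a rough sketch of the pattern under discussion: the macro below takes only the loop body and stamps it into two functions, one left to the autovectorizer and one with vectorization disabled. `DEF_CSA_LOOP_SKETCH`, the `_vec`/`_novec` suffixes, the `ITERATIONS` value, and the pass-through definition of `LOOP_BLOCK` are all hypothetical and may differ from the macros in the patch. The empty continuation line between `_Pragma` and `for` mirrors the one discussed above.

```cpp
// Hypothetical iteration count; the benchmark's real value is not shown here.
#define ITERATIONS 1000

// Assumed pass-through wrapper: it marks the loop body at the call site and
// keeps any commas in the body from splitting the macro arguments.
#define LOOP_BLOCK(...) __VA_ARGS__

// Hypothetical generator macro. The body refers to A, Threshold, Result and i
// by name, so those names are part of the macro's contract.
#define DEF_CSA_LOOP_SKETCH(name, body)                                        \
  static int name##_vec(const int *A, int Threshold) {                         \
    int Result = -1;                                                           \
    for (unsigned i = 0; i < ITERATIONS; i++) body;                            \
    return Result;                                                             \
  }                                                                            \
  static int name##_novec(const int *A, int Threshold) {                       \
    int Result = -1;                                                           \
    _Pragma("clang loop vectorize(disable) interleave(disable)")               \
                                                                               \
    for (unsigned i = 0; i < ITERATIONS; i++) body;                            \
    return Result;                                                             \
  }

// Expands to single_csa_only_vec() and single_csa_only_novec().
DEF_CSA_LOOP_SKETCH(single_csa_only, LOOP_BLOCK({
  if (A[i] > Threshold)
    Result = A[i];
}))
```

Because the pragma applies only to the loop statement that immediately follows it, the `for` header has to live inside the macro next to the `_Pragma`; the call site can only supply the body.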