Member

@MacDue MacDue commented Jan 30, 2026

This adds more conditional scalar assignment benchmarks for cases where the assignment can't be speculated (resulting in an extra block + phis rather than selects in the IR).

This patch also reworks the macros to make adding new test cases easier and less repetitive.
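As a rough illustration of the distinction the description draws, here is a minimal sketch of two conditional scalar assignment loops. The function names and signatures are hypothetical and not taken from the patch, and the reading of "can't be speculated" (a value that is only safe to compute when the condition holds, such as a load from a second array) is an assumption. In the first loop the assigned value is already loaded for the compare, so the assignment can typically be expressed as a select; in the second, the load of `B[i]` is guarded by the condition, which would usually leave a separate block and phis in the IR.

```cpp
#include <cstddef>

// Hypothetical example: the assigned value A[i] is already loaded for the
// comparison, so the assignment can be computed unconditionally.
int csa_select(const int *A, std::size_t N, int Threshold) {
  int Result = -1;
  for (std::size_t i = 0; i < N; i++)
    if (A[i] > Threshold)
      Result = A[i];
  return Result;
}

// Hypothetical example: B[i] is only read when the condition holds, so the
// load cannot simply be hoisted above the branch.
int csa_conditional_load(const int *A, const int *B, std::size_t N,
                         int Threshold) {
  int Result = -1;
  for (std::size_t i = 0; i < N; i++)
    if (A[i] > Threshold)
      Result = B[i];
  return Result;
}
```

Both functions return the value assigned on the last iteration whose condition was true, or -1 if the condition never held.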

…lar assignment autovec

Contributor

@huntergr-arm huntergr-arm left a comment


LGTM, thanks

Comment on lines +33 to +35
_Pragma("clang loop vectorize(disable) interleave(disable)") \
\
for (unsigned i = 0; i < ITERATIONS; i++) body; \
Contributor


nit:

Suggested change
- _Pragma("clang loop vectorize(disable) interleave(disable)") \
- \
- for (unsigned i = 0; i < ITERATIONS; i++) body; \
+ _Pragma("clang loop vectorize(disable) interleave(disable)") \
+ for (unsigned i = 0; i < ITERATIONS; i++) body; \

Member Author


I added that extra line (containing only the continuation backslash) because, without it, clang-format reflows the macro like this:

    _Pragma(                                                                   \
        "clang loop vectorize(disable) interleave(disable)") for (unsigned i = \
                                                                      0;       \
                                                                  i <          \
                                                                  ITERATIONS;  \
                                                                  i++) body;   \
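For context, a minimal sketch of the pattern under discussion: a macro that places the loop pragma directly before the `for` statement, with a backslash-only line between the two. `SCALAR_LOOP`, `last_above`, and the value of `ITERATIONS` are hypothetical names chosen for illustration and are not the macros from the patch. The backslash-only line is spliced away by the preprocessor, so it does not change the expansion; its only purpose here is to separate the pragma from the loop in the source.

```cpp
#define ITERATIONS 8 // hypothetical trip count for this sketch

// Hypothetical macro: emits a loop that the vectorizer is asked to leave
// alone. The line holding only a backslash has no effect on the expansion.
#define SCALAR_LOOP(body)                                                      \
  _Pragma("clang loop vectorize(disable) interleave(disable)")                 \
                                                                               \
  for (unsigned i = 0; i < ITERATIONS; i++) body;

// Returns the last element of A[0..ITERATIONS) greater than Threshold,
// or -1 if there is none.
int last_above(const int *A, int Threshold) {
  int Result = -1;
  SCALAR_LOOP({
    if (A[i] > Threshold)
      Result = A[i];
  })
  return Result;
}
```

Compilers other than clang are expected to ignore the unknown pragma, possibly with a warning.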

Comment on lines +47 to +50
DEF_SINGLE_CSA_LOOP(single_csa_only, LOOP_BLOCK({
  if (A[i] > Threshold)
    Result = A[i];
}));
Contributor


Would it add much more code if we pass the whole loop to the macro instead? That would make it easier to see what is actually benchmarked.

Member Author

@MacDue MacDue Feb 2, 2026


Not really, but it's harder to pass the whole loop to a macro in a way that won't be badly formatted (it would need clang-format off/on), since you can't wrap the for {} in a block: the pragma must be placed directly before the for statement.

That's why I added LOOP_BLOCK() as a comment.
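To make the trade-off concrete, here is a sketch of how passing only the loop body might look. The real definitions of `DEF_SINGLE_CSA_LOOP` and `LOOP_BLOCK` are not shown in this thread, so the definitions below are guesses written for illustration: `LOOP_BLOCK` is assumed to be an identity macro that merely labels its argument as the loop body, and the generated function's signature, initial value, and `ITERATIONS` are likewise assumptions.

```cpp
#define ITERATIONS 16 // hypothetical trip count for this sketch

// Assumed definition: expands to its argument unchanged and serves only to
// label the argument as the body of the loop.
#define LOOP_BLOCK(...) __VA_ARGS__

// Assumed definition: wraps the body in a function. The pragma sits directly
// before the for statement, which is why the macro takes the body rather
// than a complete loop wrapped in braces.
#define DEF_SINGLE_CSA_LOOP(name, body)                                        \
  int name(const int *A, int Threshold) {                                      \
    int Result = -1;                                                           \
    _Pragma("clang loop vectorize(disable) interleave(disable)")               \
    for (unsigned i = 0; i < ITERATIONS; i++) body;                            \
    return Result;                                                             \
  }

DEF_SINGLE_CSA_LOOP(single_csa_only, LOOP_BLOCK({
  if (A[i] > Threshold)
    Result = A[i];
}));
```

Under these assumptions, `single_csa_only(A, Threshold)` returns the last of the first 16 elements of `A` that exceeds `Threshold`, or -1 if none does.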
