Create a learning path for Function Multiversioning. #1196
@pareenaverma @jroelofs can you take a look? Thanks
content/learning-paths/smartphones-and-mobile/function-multiversioning/_index.md
Can I implement versions of a function in separate translation units?
answers:
- Yes, function versions can spread across different translations units.
- No, all of them must be in the same translation unit.
We did that for clang, but (and I don't know if you care...) did gcc get the same feature?
We do care about GCC; this learning path is for both toolchains. We made this design decision part of ACLE, so GCC will have to adhere to it. Adding @andrewcarlotti for more visibility.
content/learning-paths/smartphones-and-mobile/function-multiversioning/examples.md
#### Differences between GCC 14 and LLVM 19 implementations

- The attribute `target_version` in GCC is only supported for C++, not for C.
Do you also want to mention target_clones here?
No, target_clones is supported for C as well in gcc-14.
...ent/learning-paths/smartphones-and-mobile/function-multiversioning/implementation-details.md
#### Static resolution with GCC

The GCC compiler optimizes calls to versioned functions when they can be statically resolved. Such calls would otherwise be routed through the resolver but instead they become direct which allows them to be inlined. This may be possible whenever a function is compiled with a sufficiently high set of architecture features (so including `target`/`target_version`/`target_clones` attributes, and command line options). LLVM is not yet able to perform this optimization.
> LLVM is not yet able to perform this optimization.

We did for a few cases at one point. Did that get reverted?
You mean this llvm/llvm-project#80606 ? Not reverted, but also not quite the same as GCC. Also, after llvm/llvm-project#99522 I don't think that the clang frontend can generate a trivial resolver anymore (which was the only case llvm knows how to optimise).
* Fixed upper half of box in comment
* Renamed foo to sumPosEltsScaledByIndex in examples
* Added missing comma in resolver's emission section
I'll start the editorial review. Thanks for submitting.
Just a couple of small nits that you might want to improve, but otherwise looks good to me.
## Semantics
Function Multiversioning allows compiling several implementations of a function into the same binary and then selecting the most appropriate version at runtime. The intention is to take advantage of hardware features for accelerating your application at function level granularity.

To specify a function version you can annotate its declaration with either of `__attribute__((target_version("name")))` or `__attribute__((target_clones("name",...)))`, where "name" denotes one or more architectural features separated by '+'. This annotation implies a dependency between a function version and the feature set it is specified for. The compiler generates optimized versions of the function for the specified targets. The `target_clones` attribute behaves just like `target_version` but specifies multiple versions for the same function definition. The former is perhaps suitable for functions that the compiler can optimize differently depending on the requested features, whereas the latter allows the user to manually optimize a version at the source level using intrinsics or inline assembly.
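For illustration, the two attributes described in the quoted paragraph might be used as in the following minimal sketch. It is written in C++ because, as noted earlier in this thread, GCC 14 accepts `target_version` only for C++. The function names `sumElements` and `sumSquares` and the opt-in macro `DEMO_USE_FMV` are hypothetical and exist only for this example; the feature names `sve2` and `default` are assumed to be accepted by the toolchain in use. The scalar loops stand in for what would normally be hand-optimized code.

```cpp
#include <cstddef>

// DEMO_USE_FMV is a hypothetical opt-in macro for this sketch, not a real
// compiler option. Without it (or when not targeting AArch64) a single plain
// version of each function is built, so the file compiles everywhere.
#if defined(DEMO_USE_FMV) && defined(__aarch64__)

// target_version: one hand-written definition per feature set.
// A real "sve2" version would typically use intrinsics or inline assembly;
// a scalar loop stands in for it here.
__attribute__((target_version("sve2")))
int sumElements(const int *data, std::size_t n) {
  int sum = 0;
  for (std::size_t i = 0; i < n; ++i) sum += data[i];
  return sum;
}

// The "default" version carries no feature requirements.
__attribute__((target_version("default")))
int sumElements(const int *data, std::size_t n) {
  int sum = 0;
  for (std::size_t i = 0; i < n; ++i) sum += data[i];
  return sum;
}

// target_clones: a single definition that the compiler builds once per
// listed feature set, leaving any per-feature optimization to the compiler.
__attribute__((target_clones("sve2", "default")))
int sumSquares(const int *data, std::size_t n) {
  int sum = 0;
  for (std::size_t i = 0; i < n; ++i) sum += data[i] * data[i];
  return sum;
}

#else

// Portable fallback: one ordinary definition of each function.
int sumElements(const int *data, std::size_t n) {
  int sum = 0;
  for (std::size_t i = 0; i < n; ++i) sum += data[i];
  return sum;
}

int sumSquares(const int *data, std::size_t n) {
  int sum = 0;
  for (std::size_t i = 0; i < n; ++i) sum += data[i] * data[i];
  return sum;
}

#endif
```

Callers invoke `sumElements` and `sumSquares` like ordinary functions in either configuration; only the definitions differ.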
In the last sentence, it's unclear whether "former" and "latter" refer to the ordering in the previous sentence or the ordering at the start of the paragraph.
You could append 'or the string "default"' to make it clearer how the "default" version is specified.
To specify a function version you can annotate its declaration with either of `__attribute__((target_version("name")))` or `__attribute__((target_clones("name",...)))`, where "name" denotes one or more architectural features separated by '+'. This annotation implies a dependency between a function version and the feature set it is specified for. The compiler generates optimized versions of the function for the specified targets. The `target_clones` attribute behaves just like `target_version` but specifies multiple versions for the same function definition. The former is perhaps suitable for functions that the compiler can optimize differently depending on the requested features, whereas the latter allows the user to manually optimize a version at the source level using intrinsics or inline assembly.

A hardware platform may support multiple architectural features from the dependency sets, or it may not support any. Therefore Function Multiversioning provides a convenient way to select the most appropriate version of a function at runtime. The selection is permanent for the lifetime of the process and works as follows:
- Select the most specific version (the one with most features), else
This description is imprecise, and we're planning to clarify and slightly modify the prioritization criteria anyway.
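As a rough model only, the "most features wins" rule quoted in the hunk above could be sketched as follows. This is not how any compiler's resolver is implemented, and the comment above makes clear that the real prioritization criteria are more nuanced and expected to change. The feature bit constants, the `Version` struct, and `selectVersion` are all hypothetical names for this illustration, and the tie-breaking rule (first listed wins) is an arbitrary assumption.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical feature bits, for illustration only.
constexpr std::uint32_t kSimd = 1u << 0;
constexpr std::uint32_t kDotprod = 1u << 1;
constexpr std::uint32_t kSve2 = 1u << 2;

// One candidate version: its label and the features it depends on.
// The "default" version is modeled as requiring no features.
struct Version {
  std::string name;
  std::uint32_t required;
};

// Counts the set bits in a feature mask.
static int countBits(std::uint32_t mask) {
  int count = 0;
  while (mask != 0) {
    mask &= mask - 1;  // clear the lowest set bit
    ++count;
  }
  return count;
}

// Returns the index of the usable version with the most required features,
// or -1 if no version is usable. Ties go to the earliest entry, which is an
// assumption of this sketch rather than a documented rule.
int selectVersion(const std::vector<Version> &versions,
                  std::uint32_t supported) {
  int best = -1;
  int bestCount = -1;
  for (std::size_t i = 0; i < versions.size(); ++i) {
    // Skip versions that need a feature the platform lacks.
    if ((versions[i].required & ~supported) != 0) continue;
    const int count = countBits(versions[i].required);
    if (count > bestCount) {
      best = static_cast<int>(i);
      bestCount = count;
    }
  }
  return best;
}
```

In this model the choice is computed once from the platform's feature mask, mirroring the statement that the selection is permanent for the lifetime of the process.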
Before submitting a pull request for a new Learning Path, please review Create a Learning Path
Please do not include any confidential information in your contribution. This includes confidential microarchitecture details and unannounced product information. No AI tool can be used to generate either content or code when creating a learning path or install guide.
By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of the Creative Commons Attribution 4.0 International License.