-
Notifications
You must be signed in to change notification settings - Fork 738
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[ESIMD] Add set_kernel_properties API and use_double_grf property. #6182
Conversation
@gmlueck, could you please review the interface part - https://github.com/intel/llvm/pull/6182/files#diff-7efdaf033502de5f8cc1ae48436f1f8b86b3b6f6ee5a6484c619053cb4753207 (review of other parts would also be appreciated). |
// 3) This code (or the code in FE) must verify that slm_init or other such | ||
// intrinsic is not called from another module because kernels in that other | ||
// module would not get updated meta data attributes. | ||
struct UpdateUint64MetaDataToMaxValue { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This refactors the original updateGenXMDNodes
:
- Call graph traversal is factored out into
traverseCallgraphUp
above. This functor represents call graph action. - It is slightly optimized to pre-select candidate nodes for fewer actions in the node action function.
This patch: 1) Adds esimd::set_kernel_properties API with the single supported property esimd::kernel_properties::use_double_grf, which lets compiler know that the calling kernel needs run in "double GRF" mode - more registers per thread at the expense of fewer H/W threads. This is temporary API until generic SYCL support for kernel properties is implemented: https://github.com/intel/llvm/blob/sycl/sycl/doc/extensions/proposed/sycl_ext_oneapi_kernel_properties.asciidoc 2) Provides "lowering" of this API by the LowerESIMD.cpp, which marks such kernels with "esimd-double-grf" function attribute. 3) Implements new "dimension" of device code splitting in sycl-post-link: functions with and without "esimd-double-grf" attribute go to different modules. Device binary images resulting from "double-grf" modules are assigned the "isDoubleGRFEsimdImage" property 4) Updates runtime to add "-doubleGRF" option when JITting SPIRV binaries with the "isDoubleGRFEsimdImage" property. 5) Fixes sycl-post-link bug in ModuleSplitter.cpp:extractSubModule, where Function objects in the entry point list were not replaced with new Function objects in the cloned Module. This lead to corrupted symbol file in some cases. Signed-off-by: Konstantin S Bobrovsky <konstantin.s.bobrovsky@intel.com>
// https://github.com/intel/llvm/blob/sycl/sycl/doc/extensions/proposed/sycl_ext_oneapi_kernel_properties.asciidoc | ||
|
||
template <class... KernelProps> | ||
void set_kernel_properties(KernelProps... props) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since there is no extension document for this API, I assume it is an unsupported API that we can remove without any deprecation period. Is that your intent?
What is the __ESIMD_ENS
namespace? Will users know that APIs in this namespace are unsupported? Should we instead define the API in a namespace named experimental
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since there is no extension document for this API, I assume it is an unsupported API that we can remove without any deprecation period. Is that your intent?
We usually allow some deprecation period for ESIMD APIs being removed for users to adapt, even though it is experimental.
What is the __ESIMD_ENS namespace?
this is an alias for sycl::ext::intel::experimental::esimd
Will users know that APIs in this namespace are unsupported?
This is a good question. The main source of information about particular APIs for users should be the API docs doxygen https://intel.github.io/llvm-docs/doxygen/group__sycl__esimd.html. But it turns out it does not expand aliases, I we need to to replace aliases with normal nested namespace declaration style. Thanks for bringing this up.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
module_split::ModuleDesc ResMDesc = MSplit->nextSplit(); | ||
Module &ResM = ResMDesc.getModule(); | ||
|
||
bool SpecConstsMet = processSpecConstants(ResM); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this part (lines 575-624) is now nested inside iteration over "double GRF" split w/o modifications (only renaming variables) - new lines 592-643
template <class... KernelProps> | ||
void set_kernel_properties(KernelProps... props) { | ||
// TODO check for duplicates | ||
using Props = __MP11_NS::mp_list<KernelProps...>; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this the first time we use boost to implement something in ESIMD?
If Yes, then it introduces the dependency on it, which probably, is not good without a good reason. IMO, it would be better to re-write this code without boost to not introduce that dependency for such simple thing as walk through variadic pack.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Boost was imported quite a while ago to facilitate invoke_simd
implementation and avoid spending resources on what boost already does. This is believed to be major productivity enhancement for SYCL API development. So this is first time for ESIMD, but not the first time for SYCL, so no new dependence is introduced. Note that boosl::mp11
appears as sycl::detail::boost::mp11
in SYCL API sources (it is imported and refactored automatically during the build)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See #5791
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 to what @v-klochkov suggested. Pure C++ solution looks simple enough: https://godbolt.org/z/fxv4jxPbb and if doesn't satisfy what needs to be done, then a comment explaining what and why would be very desirable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think pure C++ solution should be chosen over mp11-based one just because it does not use mp11. Code clarity and maintainability is more important here, and mp11 makes it better - list
+ for_each
is the most clear possible, yet adding new features to the code like duplicate detection would be way easier with mp11 as well.
In general, I believe it is much more practical and safer to use mp11 for all routine tasks where SYCL headers conceptually use templates as usual data structures as in this case, even though in some cases simpler C++ is possible.
Adding @rolandschulz for possible opinion on mp11 usage direction.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me.
I believe this PR will create conflicts with the other one that splits SYCL and ESIMD call-graphs, and thus require conflicts resolution + re-review/approval after that.
sycl/include/sycl/ext/intel/experimental/esimd/kernel_properties.hpp
Outdated
Show resolved
Hide resolved
@asudarsa, @v-klochkov, @againull - please review |
Signed-off-by: Konstantin S Bobrovsky <konstantin.s.bobrovsky@intel.com>
@v-klochkov, @asudarsa, @againull - please review |
No major issues. But it will be great if comments can be addressed. thanks |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Very sorry for delayed review:
sycl/source/detail/program_manager/program_manager.cpp looks good to me.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sycl-post-link and other tools related changes look ok to me. Thanks
This change extends Konst's work from #6182 to work for any SYCL kernel, not just ESIMD kernels Basic summary of changes: 1) Move SYCL library set_kernel_properties function and related detail header out of esimd code into generic SYCL code 2) Generalize SYCLLowerESIMDKernelPropsPass to make it work for SYCL kernels 3) Change sycl-post-link module splitting to split non-ESIMD modules that have any number of double GRF kernels 4) Change program loader to add the "-ze-opt-large-register-file" option if the double GRF property is set. We do this instead of -doubleGRF because -doubleGRF only works for the VC backend, while -ze-opt-large-register-file works for both VC and scalar backends Signed-off-by: Sarnie, Nick <nick.sarnie@intel.com>
This patch:
esimd::kernel_properties::use_double_grf, which lets compiler know that
the calling kernel needs run in "double GRF" mode - more registers per
thread at the expense of fewer H/W threads.
This is temporary API until generic SYCL support for kernel properties
is implemented:
https://github.com/intel/llvm/blob/sycl/sycl/doc/extensions/proposed/sycl_ext_oneapi_kernel_properties.asciidoc
kernels with "esimd-double-grf" function attribute, and invoke it from the sycl-post-link as a part
of ESIMD lowering.
functions with and without "esimd-double-grf" attribute go to different
modules. Device binary images resulting from "double-grf" modules are
assigned the "isDoubleGRFEsimdImage" property
with the "isDoubleGRFEsimdImage" property.
Function objects in the entry point list were not replaced with new
Function objects in the cloned Module. This lead to corrupted symbol file in
some cases.
AOT compilation support is TBD.
Complementary E2E test PR intel/llvm-test-suite#1033
Signed-off-by: Konstantin S Bobrovsky konstantin.s.bobrovsky@intel.com