-
Notifications
You must be signed in to change notification settings - Fork 2.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[ARM CPU] Add ACL executor for Transpose #17322
[ARM CPU] Add ACL executor for Transpose #17322
Conversation
allnes
commented
May 2, 2023
•
edited
Loading
edited
- Separate executors
- Add ACL executor for transpose
This PR will be closed in a week because of 2 weeks of no activity. |
…into an/init_acl_transpose
…nit_acl_transpose
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you please double check the possible regressions with dlb
@@ -253,18 +273,25 @@ void Transpose::createPrimitive() { | |||
} else if (getParentEdgeAt(INPUT_DATA_IDX)->getMemory().getDesc().hasLayoutType(LayoutType::ncsp) && | |||
std::find(optimizedOrders.begin(), optimizedOrders.end(), order) != optimizedOrders.end()) { | |||
isOptimized = true; | |||
execPtr = std::make_shared<TransposeRefExecutor>(); | |||
transposeParams.transposeExecution = TransposeParams::REF; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the factory is supposed to decide which executor to use / to force to use.
The idea is to avoid making such type of decisions in scope of Nodes
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed to fix in separate PR
src/plugins/intel_cpu/src/nodes/executors/dnnl/dnnl_transpose.hpp
Outdated
Show resolved
Hide resolved
src/plugins/intel_cpu/src/nodes/executors/common/ref_transpose.hpp
Outdated
Show resolved
Hide resolved
src/plugins/intel_cpu/src/nodes/executors/common/ref_transpose.hpp
Outdated
Show resolved
Hide resolved
src/plugins/intel_cpu/src/nodes/executors/acl/acl_transpose.hpp
Outdated
Show resolved
Hide resolved
src/plugins/intel_cpu/src/nodes/executors/common/ref_transpose.cpp
Outdated
Show resolved
Hide resolved
src/plugins/intel_cpu/src/nodes/executors/common/ref_transpose.cpp
Outdated
Show resolved
Hide resolved
src/plugins/intel_cpu/src/nodes/executors/common/ref_transpose.hpp
Outdated
Show resolved
Hide resolved
src/plugins/intel_cpu/src/nodes/executors/dnnl/dnnl_transpose.hpp
Outdated
Show resolved
Hide resolved
Please do not merge the PR - I'd like to discuss the sequence of executor builders. |
It's better to use |
It's ready to merge as soon as CI is completed. |
src/plugins/intel_cpu/src/nodes/executors/x64/jit_transpose.cpp
Outdated
Show resolved
Hide resolved
src/plugins/intel_cpu/src/nodes/executors/x64/jit_transpose.hpp
Outdated
Show resolved
Hide resolved
src/plugins/intel_cpu/src/nodes/executors/common/ref_transpose.hpp
Outdated
Show resolved
Hide resolved
src/plugins/intel_cpu/src/nodes/executors/common/ref_opt_transpose.hpp
Outdated
Show resolved
Hide resolved
src/plugins/intel_cpu/src/nodes/executors/acl/acl_transpose.hpp
Outdated
Show resolved
Hide resolved
@alvoron the PR is ready to be merged, please make CI green. |
* separate executors + add acl executor fot transpose * correct axisCast * update transpose executors list * update new changes * enable tests * fix fortting * fixed test shapes and transpose generalization * fixed different signedness error * size_t usage in loop counters * undo unwanted changes * fixed comments * added i8 and fp32 to blocked x86 tests * fixed comments * fixed comments * extracted general reference executor from PermuteKernel * fix mayiuse in JitTransposeExecutorBuilder::isSupported * getDescWithType name refactoring * refactoring * removed 2nd executor creation in transpose node * Moved RefOptimizedTranspose to the top * fixed comments --------- Co-authored-by: Aleksandr Voron <aleksandr.voron@intel.com>