Bgemm for arm64 #5287
Draft
taoye9 wants to merge 9 commits into OpenMathLib:develop from taoye9:bgemm_for_arm64
Commits
2ef36a1 add .c and .h files for bgemm interface (taoye9)
abe9d38 add generic bgemm kernel and its test file (taoye9)
1eb0815 support mutithreaded bgemm interface (taoye9)
4d0fd12 support dynamic arch of bgemm interface (taoye9)
59d0cf4 fix generic gemm_beta for bgemm (taoye9)
082a9d2 Resolve symbol conflicts when building sbgemm and bgemm together (taoye9)
63ce52e change data type of bgemm alpha and beta from bfloat16 to fp32 and ad… (taoye9)
5d16517 add neoversev1 bgemm kernels (taoye9)
45aa27b update init value of bgemm testcase (taoye9)
@@ -0,0 +1,86 @@
/***************************************************************************
 * Copyright (c) 2025, The OpenBLAS Project
 * All rights reserved.
 * Redistribution and use in source and binary forms, with or without
 * modification, are permitted provided that the following conditions are
 * met:
 * 1. Redistributions of source code must retain the above copyright
 * notice, this list of conditions and the following disclaimer.
 * 2. Redistributions in binary form must reproduce the above copyright
 * notice, this list of conditions and the following disclaimer in
 * the documentation and/or other materials provided with the
 * distribution.
 * 3. Neither the name of the OpenBLAS project nor the names of
 * its contributors may be used to endorse or promote products
 * derived from this software without specific prior written permission.
 * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
 * AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
 * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
 * ARE DISCLAIMED. IN NO EVENT SHALL THE OPENBLAS PROJECT OR CONTRIBUTORS BE
 * LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
 * CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
 * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
 * INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
 * CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
 * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
 * POSSIBILITY OF SUCH DAMAGE.
 * *****************************************************************************/

#ifndef COMMON_B_H
#define COMMON_B_H

// for now, only support DYNAMIC_ARCH = 0 case.
#ifndef DYNAMIC_ARCH
#define BGEMM_ONCOPY bgemm_oncopy
#define BGEMM_OTCOPY bgemm_otcopy
#define BGEMM_INCOPY bgemm_incopy
#define BGEMM_ITCOPY bgemm_itcopy

#define BGEMM_BETA bgemm_beta
#define BGEMM_KERNEL bgemm_kernel

#else

#define BGEMM_ONCOPY gotoblas -> bgemm_oncopy
#define BGEMM_OTCOPY gotoblas -> bgemm_otcopy
#define BGEMM_INCOPY gotoblas -> bgemm_incopy
#define BGEMM_ITCOPY gotoblas -> bgemm_itcopy
#define BGEMM_BETA gotoblas -> bgemm_beta
#define BGEMM_KERNEL gotoblas -> bgemm_kernel

#endif

#define BGEMM_NN bgemm_nn
#define BGEMM_CN bgemm_tn
#define BGEMM_TN bgemm_tn
#define BGEMM_NC bgemm_nt
#define BGEMM_NT bgemm_nt
#define BGEMM_CC bgemm_tt
#define BGEMM_CT bgemm_tt
#define BGEMM_TC bgemm_tt
#define BGEMM_TT bgemm_tt
#define BGEMM_NR bgemm_nn
#define BGEMM_TR bgemm_tn
#define BGEMM_CR bgemm_tn
#define BGEMM_RN bgemm_nn
#define BGEMM_RT bgemm_nt
#define BGEMM_RC bgemm_nt
#define BGEMM_RR bgemm_nn

#define BGEMM_THREAD_NN bgemm_thread_nn
#define BGEMM_THREAD_CN bgemm_thread_tn
#define BGEMM_THREAD_TN bgemm_thread_tn
#define BGEMM_THREAD_NC bgemm_thread_nt
#define BGEMM_THREAD_NT bgemm_thread_nt
#define BGEMM_THREAD_CC bgemm_thread_tt
#define BGEMM_THREAD_CT bgemm_thread_tt
#define BGEMM_THREAD_TC bgemm_thread_tt
#define BGEMM_THREAD_TT bgemm_thread_tt
#define BGEMM_THREAD_NR bgemm_thread_nn
#define BGEMM_THREAD_TR bgemm_thread_tn
#define BGEMM_THREAD_CR bgemm_thread_tn
#define BGEMM_THREAD_RN bgemm_thread_nn
#define BGEMM_THREAD_RT bgemm_thread_nt
#define BGEMM_THREAD_RC bgemm_thread_nt
#define BGEMM_THREAD_RR bgemm_thread_nn
#endif
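For readers less familiar with this pattern, the two branches of the header above can be illustrated with a small standalone sketch. It is not code from this PR: every `demo_` name, the `int (int, int)` kernel signature, and the one-field table are assumptions made purely for illustration, and the real `gotoblas` structure and `bgemm_kernel` prototype are not shown in this hunk. The sketch only demonstrates that a macro expanding to a plain symbol binds at compile time, while a macro expanding to `table -> member` resolves through a pointer that can be switched at run time.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical kernel signature; the real bgemm_kernel prototype differs. */
typedef int (*demo_kernel_fn)(int m, int n);

static int demo_kernel_generic(int m, int n) { return m * n; }
static int demo_kernel_tuned(int m, int n)   { return m * n + 1; }

/* Hypothetical per-CPU function table, standing in for the role of gotoblas. */
typedef struct {
    demo_kernel_fn bgemm_kernel;
} demo_gotoblas_t;

static demo_gotoblas_t demo_table_generic = { demo_kernel_generic };
static demo_gotoblas_t demo_table_tuned   = { demo_kernel_tuned };
static demo_gotoblas_t *demo_gotoblas     = &demo_table_generic;

/* Static form: the macro names one fixed symbol. */
#define DEMO_KERNEL_STATIC  demo_kernel_generic
/* Dynamic form: the macro dereferences whichever table is currently active. */
#define DEMO_KERNEL_DYNAMIC demo_gotoblas -> bgemm_kernel

/* Pick a table at run time (a real library would probe the CPU instead). */
static void demo_select_cpu(int tuned) {
    demo_gotoblas = tuned ? &demo_table_tuned : &demo_table_generic;
}

static int demo_call_static(int m, int n) {
    return DEMO_KERNEL_STATIC(m, n);
}

static int demo_call_dynamic(int m, int n) {
    assert(demo_gotoblas != NULL);
    return DEMO_KERNEL_DYNAMIC(m, n);
}
```

Calling `demo_select_cpu(1)` changes what `demo_call_dynamic` returns but leaves `demo_call_static` untouched, which is the behavioural difference between the `#ifndef DYNAMIC_ARCH` branch and the `#else` branch.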
I do not think we need to introduce a BFLOAT16_ONLY build flag.
The type of FLOAT is the only difference. Can we simply use XFLOAT instead of FLOAT for the C matrix, for example in kernel/arm64/bgemm_beta.c?
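To make the suggestion concrete, here is a rough standalone sketch of a beta-scaling routine in which the C matrix has its own element type, separate from the fp32 scalar. It is not the contents of kernel/arm64/bgemm_beta.c: the `demo_` names, the `demo_c_t` typedef (standing in for whatever XFLOAT resolves to in this build, which is an assumption), the column-major layout, and the rounding helper are all illustrative. It assumes a bfloat16 C matrix stored as raw 16-bit values and an fp32 beta, in line with commit 63ce52e.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical element type for C, playing the role the review assigns to XFLOAT. */
typedef uint16_t demo_c_t; /* raw bfloat16 bits */

/* bfloat16 is the upper 16 bits of an IEEE-754 binary32 value. */
static float demo_bf16_to_float(demo_c_t h) {
    uint32_t bits = (uint32_t)h << 16;
    float f;
    memcpy(&f, &bits, sizeof f);
    return f;
}

/* Round-to-nearest-even conversion; NaN inputs are forced to a quiet NaN. */
static demo_c_t demo_float_to_bf16(float f) {
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);
    if ((bits & 0x7fffffffu) > 0x7f800000u) {
        return (demo_c_t)((bits >> 16) | 0x0040u);
    }
    bits += 0x7fffu + ((bits >> 16) & 1u);
    return (demo_c_t)(bits >> 16);
}

/* Scale an m-by-n column-major matrix C (leading dimension ldc) by an fp32 beta. */
static void demo_bgemm_beta(size_t m, size_t n, float beta,
                            demo_c_t *c, size_t ldc) {
    assert(ldc >= m);
    for (size_t j = 0; j < n; j++) {
        demo_c_t *col = c + j * ldc;
        for (size_t i = 0; i < m; i++) {
            /* Usual BLAS convention: beta == 0 overwrites C without reading it. */
            col[i] = (beta == 0.0f)
                         ? (demo_c_t)0
                         : demo_float_to_bf16(beta * demo_bf16_to_float(col[i]));
        }
    }
}
```

Because only `demo_c_t` and the two conversion helpers mention the storage format, the same loop body would serve an fp32 C matrix by changing the typedef and making the helpers identity functions, which is the kind of sharing the comment is asking about.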