forked from ggerganov/llama.cpp
-
Notifications
You must be signed in to change notification settings - Fork 360
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
feature : add blis and other BLAS implementation support (ggerganov#1502
) * feature: add blis support * feature: allow all BLA_VENDOR to be assigned in cmake arguments. align with whisper.cpp pr 927 * fix: version detection for BLA_SIZEOF_INTEGER, recover min version of cmake * Fix typo in INTEGER Co-authored-by: Georgi Gerganov <ggerganov@gmail.com> --------- Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
- Loading branch information
Showing
4 changed files
with
104 additions
and
25 deletions.
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,67 @@ | ||
BLIS Installation Manual | ||
------------------------ | ||
|
||
BLIS is a portable software framework for high-performance BLAS-like dense linear algebra libraries. It has received awards and recognition, including the 2023 James H. Wilkinson Prize for Numerical Software and the 2020 SIAM Activity Group on Supercomputing Best Paper Prize. BLIS provides a new BLAS-like API and a compatibility layer for traditional BLAS routine calls. It offers features such as object-based API, typed API, BLAS and CBLAS compatibility layers. | ||
|
||
Project URL: https://github.com/flame/blis | ||
|
||
### Prepare: | ||
|
||
Compile BLIS: | ||
|
||
```bash | ||
git clone https://github.com/flame/blis | ||
cd blis | ||
./configure --enable-cblas -t openmp,pthreads auto | ||
# will install to /usr/local/ by default. | ||
make -j | ||
``` | ||
|
||
Install BLIS: | ||
|
||
```bash | ||
sudo make install | ||
``` | ||
|
||
We recommend using openmp since it's easier to modify the cores been used. | ||
|
||
### llama.cpp compilation | ||
|
||
Makefile: | ||
|
||
```bash | ||
make LLAMA_BLIS=1 -j | ||
# make LLAMA_BLIS=1 benchmark-matmult | ||
``` | ||
|
||
CMake: | ||
|
||
```bash | ||
mkdir build | ||
cd build | ||
cmake -DLLAMA_BLAS=ON -DLLAMA_BLAS_VENDOR=FLAME .. | ||
make -j | ||
``` | ||
|
||
### llama.cpp execution | ||
|
||
According to the BLIS documentation, we could set the following | ||
environment variables to modify the behavior of openmp: | ||
|
||
``` | ||
export GOMP_GPU_AFFINITY="0-19" | ||
export BLIS_NUM_THREADS=14 | ||
``` | ||
|
||
And then run the binaries as normal. | ||
|
||
|
||
### Intel specific issue | ||
|
||
Some might get the error message saying that `libimf.so` cannot be found. | ||
Please follow this [stackoverflow page](https://stackoverflow.com/questions/70687930/intel-oneapi-2022-libimf-so-no-such-file-or-directory-during-openmpi-compila). | ||
|
||
### Reference: | ||
|
||
1. https://github.com/flame/blis#getting-started | ||
2. https://github.com/flame/blis/blob/master/docs/Multithreading.md |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters