ggml : build backends as libraries #10256
Conversation
Is this caused by this commit by any chance?
@slaren I see that you removed […]
Hi @slaren, I don't see the rules required to create […]
I'll add them now. But it's better to start using the CMake build, since the Makefile will be removed at some point.
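For anyone migrating from the Makefile, a minimal sketch of the equivalent CMake workflow (run from the repository root; add backend flags such as `-DGGML_CUDA=ON` as needed):

```sh
# configure once, then build; this replaces the old `make` invocation
cmake -B build
cmake --build build --config Release -j
```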
Thanks. Understandable. I hope that solutions can be found for building in more esoteric environments like Termux, w64devkit, and old Linux/macOS systems where CMake is not readily available.
Hi,
Relative path to […] Edit: Perhaps I need to use the […]
Yes, the […]
I cannot build for CUDA with VS 2022 (admin dev prompt) anymore, and I think it is this change. I have all the permissions, and the directory and its subdirectories have full-control permissions for all users. It was compiling until right after the DaisyUI server revamp ~2 weeks ago. The DLL export fails.
CMake configuration output (this was working with no problem until ~2 weeks ago):
I have no issues building for CUDA with VS 2022.
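For reference, a typical CUDA configure and build from a VS 2022 developer prompt looks roughly like this (a sketch; any extra flags are omitted):

```sh
# run from an x64 developer command prompt so the MSVC toolchain is found
cmake -B build -DGGML_CUDA=ON
cmake --build build --config Release
```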
Hi Slaren, I've taken a look at this again and the problem was […]. I'm sorry if any of this sounds kind of generic or imprecise; I'm not a professional engineer. Kudos!
I don't think LTO makes much difference for ggml; everything that should be inlined is already defined in the same translation unit. I will take a look at this when I have the chance, but you should not lose anything by just disabling LTO.
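A minimal sketch of reconfiguring with LTO turned off, assuming it was enabled through the build's `GGML_LTO` option:

```sh
# GGML_LTO is off by default; set it explicitly to override a cached value
cmake -B build -DGGML_LTO=OFF
cmake --build build --config Release
```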
Moves each backend to a different directory with its own build script. The ggml library is split into the target `ggml-base`, which only includes the core ggml elements, and `ggml`, which bundles `ggml-base` and all the backends included in the build.

To completely separate the build of the CPU backend, `ggml-quants.c` and `ggml-aarch64.c` have been split such that the reference quantization and dequantization functions are in `ggml-base`, and the optimized quantization and dot product functions are in `ggml-cpu`.

The build is organized as such:

- `ggml-base`: the core ggml elements
- `ggml-cpu`: the CPU backend
- `ggml-<name>`: every other backend included in the build
- `ggml`: bundles `ggml-base` and all the backends included in the build
Currently, ggml needs to be linked to the backend libraries, but ultimately the goal is to load the backends dynamically at runtime, so that we can distribute a single llama.cpp package that includes all the backends, as well as multiple versions of the CPU backend compiled with different instruction sets.
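A quick sketch of what the split looks like in a shared-library build (output locations and file extensions are assumptions; they vary by platform and generator):

```sh
# configure a shared-library build and compile
cmake -B build -DBUILD_SHARED_LIBS=ON
cmake --build build --config Release

# each backend is now a separate library next to libggml and libllama,
# e.g. on Linux (illustrative paths):
find build -name 'libggml*.so' -o -name 'libllama*.so'
#   .../libggml-base.so
#   .../libggml-cpu.so
#   .../libggml.so
#   .../libllama.so
```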
Breaking changes
Applications that use ggml and llama.cpp should not require any changes; they only need to link to the `ggml` and `llama` targets as usual. However, when building with `BUILD_SHARED_LIBS`, additional shared libraries are produced that need to be bundled with the application: in addition to `llama` and `ggml`, the libraries `ggml-base`, `ggml-cpu` and any other backends included in the build should be added to the application package.

The option `GGML_HIPBLAS` has been renamed to `GGML_HIP`, in line with a previous change to the CUDA backend.