llamafile : tmp disable + build sgemm.o when needed #6716
This error occurs because `vaddvq_f32` is only available on armv8/aarch64, but the build targets armv7 (armv7-none-linux-androideabi33). I doubt this is specific to Android; I'm sure you would hit the same error when building for a 32-bit Raspberry Pi. Aren't most Android devices armv8 these days, anyway?
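For illustration, here is a minimal sketch of how a four-lane horizontal sum could be guarded so that it also compiles for 32-bit ARM. The helper name `hsum4` is hypothetical and this is not the code in sgemm.cpp; it only shows the general pattern of selecting `vaddvq_f32` on aarch64 and falling back to pairwise adds elsewhere.

```cpp
#if defined(__ARM_NEON)
#include <arm_neon.h>
#endif

// Hypothetical helper: sum four consecutive floats starting at p.
inline float hsum4(const float *p) {
#if defined(__aarch64__) && defined(__ARM_NEON)
    // A64 only: a single across-vector add.
    return vaddvq_f32(vld1q_f32(p));
#elif defined(__ARM_NEON)
    // armv7 NEON: add the low and high halves, then one pairwise add.
    float32x4_t v = vld1q_f32(p);
    float32x2_t s = vadd_f32(vget_low_f32(v), vget_high_f32(v));
    s = vpadd_f32(s, s);
    return vget_lane_f32(s, 0);
#else
    // No NEON available: plain scalar sum.
    return (p[0] + p[1]) + (p[2] + p[3]);
#endif
}
```

Whether a fallback like this is worth maintaining, compared with simply disabling the code path on armv7, is a judgment call for the maintainers.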
Yes, I think so. On armv7 people can always disable the code manually via |
The CI is still failing because the Android build pulls from llama.cpp master: https://github.com/ggerganov/llama.cpp/blob/8dd1ec8b3ffbfa2d26e82e672cea89f5eeb2f141/examples/llama.android/app/src/main/cpp/CMakeLists.txt#L16-L20 When CMAKE_SYSTEM_NAME is |
* build : sgemm.o only when needed ggml-ci
* llamafile : tmp disable due to MoE bug ggml-ci
- Re-enable by default
- Fix issue described in ggml-org#6716
- Make code more abstract, elegant, and maintainable
- Faster handling of weirdly shaped `m` and `n` edge cases
* llamafile : improve sgemm.cpp
  - Re-enable by default
  - Fix issue described in #6716
  - Make code more abstract, elegant, and maintainable
  - Faster handling of weirdly shaped `m` and `n` edge cases
* Address review comments
* Help clang produce fma instructions
* Address review comments
MoE perplexity (ppl) is currently abnormally high, indicating some issue.
This needs to be fixed before re-enabling by default.