Build llama.cpp with Vulkan for an Android device (Magic Leap 2) #8874
Unanswered
XinyuGroceryStore asked this question in Q&A
Replies: 1 comment · 9 replies
-
@XinyuGroceryStore I have built this program successfully for Android and run it on my Android phone; here are two key points. There are still other problems, though: on Qualcomm Adreno GPUs the program crashes, and on ARM Mali GPUs it is even slower than the CPU. I hope my answer helps you.
-
I succeeded in building llama.cpp for the Magic Leap 2 by following the instructions for building on Android. The Magic Leap 2 is an Android device with an x86-64 CPU. Commands below:
Then I ran ninja to generate the binary files. After that, I used Android Debug Bridge (ADB) to copy the .so and binary files to the Magic Leap 2 and used adb shell to execute llama-cli.
All of the above steps are feasible and work well.
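The exact commands were not captured above, so as a hedged sketch only: per llama.cpp's Android build instructions, a CPU-only x86-64 NDK cross-compile and ADB deployment might look like the following. The NDK path, API level, and device directory are assumptions, not the poster's actual values.

```shell
# Sketch of an NDK cross-compile for an x86-64 Android device (paths are assumptions).
export NDK=$HOME/Android/Sdk/ndk/27.0.12077973

cmake -G Ninja -B build \
  -DCMAKE_TOOLCHAIN_FILE=$NDK/build/cmake/android.toolchain.cmake \
  -DANDROID_ABI=x86_64 \
  -DANDROID_PLATFORM=android-29 \
  -DCMAKE_BUILD_TYPE=Release

ninja -C build

# Deploy over ADB and run from a writable directory on the device.
adb push build/bin /data/local/tmp/llama
adb shell "cd /data/local/tmp/llama && ./llama-cli --help"
```

CMAKE_TOOLCHAIN_FILE, ANDROID_ABI, and ANDROID_PLATFORM are the standard NDK CMake toolchain variables; /data/local/tmp is used because most other device paths are not writable or executable over adb shell.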
Now I plan to accelerate inference with the Magic Leap 2's GPU via Vulkan, so I ran the commands below:
The configure log is shown below:
-- The C compiler identification is Clang 18.0.1
-- The CXX compiler identification is Clang 18.0.1
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Check for working C compiler: C:/Users/Xinyu/AppData/Local/Android/Sdk/ndk/27.0.12077973/toolchains/llvm/prebuilt/windows-x86_64/bin/clang.exe - skipped
-- Detecting C compile features
-- Detecting C compile features - done
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Check for working CXX compiler: C:/Users/Xinyu/AppData/Local/Android/Sdk/ndk/27.0.12077973/toolchains/llvm/prebuilt/windows-x86_64/bin/clang++.exe - skipped
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Found Git: C:/Users/Xinyu/Git/cmd/git.exe (found version "2.45.2.windows.1")
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD - Failed
-- Check if compiler accepts -pthread
-- Check if compiler accepts -pthread - yes
-- Found Threads: TRUE
-- Found OpenMP_C: -fopenmp=libomp
-- Found OpenMP_CXX: -fopenmp=libomp
-- Found OpenMP: TRUE
-- OpenMP found
-- Using llamafile
-- Found Vulkan: C:/VulkanSDK/1.3.283.0/Lib/vulkan-1.lib (found version "1.3.283") found components: glslc glslangValidator
-- Vulkan found
-- ccache found, compilation results will be cached. Disable with GGML_CCACHE=OFF.
-- CMAKE_SYSTEM_PROCESSOR: x86_64
-- x86 detected
-- Looking for pthread_create in pthreads
-- Looking for pthread_create in pthreads - not found
-- Looking for pthread_create in pthread
-- Looking for pthread_create in pthread - not found
-- Configuring done (6.1s)
-- Generating done (0.3s)
-- Build files have been written to: C:/Users/Xinyu/llama-adb-vulkan/llama.cpp/build
Then I ran ninja again. The build stopped with the error below. If you have any ideas about how to build this project, please share them with me. Any suggestions on how to accelerate inference with the GPU on the Magic Leap 2 are also welcome.
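Since the Vulkan configure commands were not captured either, here is a hedged sketch of what such a configure step commonly looks like. GGML_VULKAN is llama.cpp's CMake switch for the Vulkan backend; pointing Vulkan_LIBRARY at the libvulkan.so shipped inside the NDK sysroot is one known workaround when CMake picks up the host Vulkan SDK's library instead (as the log above suggests, with the Windows vulkan-1.lib). All paths and versions here are assumptions.

```shell
# Sketch: Vulkan-enabled Android configure (paths/versions are assumptions).
export NDK=$HOME/Android/Sdk/ndk/27.0.12077973

cmake -G Ninja -B build \
  -DCMAKE_TOOLCHAIN_FILE=$NDK/build/cmake/android.toolchain.cmake \
  -DANDROID_ABI=x86_64 \
  -DANDROID_PLATFORM=android-29 \
  -DGGML_VULKAN=ON \
  -DVulkan_LIBRARY=$NDK/toolchains/llvm/prebuilt/windows-x86_64/sysroot/usr/lib/x86_64-linux-android/29/libvulkan.so
```

Note that the shader tools (glslc) still come from a host Vulkan SDK, which the configure log shows were found; only the runtime library must come from the Android sysroot.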