Accelerated computations on Android Adreno 740 #17456
Replies: 1 comment
I think the 740 should be supported in theory; most likely something goes wrong in the clvk translation layer. You could try building directly with the NDK instead. My device is a Lenovo Y700 8 Gen 3 (SM8650P, Adreno 750 GPU) running Android 16 with an Android 14 kernel and 12 GB of physical memory. I compiled llama.cpp on WSL Debian with android-ndk-r27d and the OpenCL backend enabled, starting from `cmake -B build` (the full invocation is sketched below). After pushing the binaries to the phone it runs fine in the device shell, but throughput does not feel great, so perhaps I am not doing it quite right. Alternatively you could try Vulkan directly: in Termux with the Turnip Vulkan driver I got qwen3-8b-q4_k_m running, but only at around 7 tokens/s.
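The full cmake invocation I used was along these lines; I am reconstructing it from memory, so treat the exact flags (NDK path, ABI, platform level) as approximations rather than a verified recipe:

```sh
# Rough sketch of the cross-compile: $ANDROID_NDK is assumed to point at the NDK root,
# and the OpenCL headers/ICD loader are assumed to already be visible to the toolchain.
cmake -B build \
  -DCMAKE_TOOLCHAIN_FILE=$ANDROID_NDK/build/cmake/android.toolchain.cmake \
  -DANDROID_ABI=arm64-v8a \
  -DANDROID_PLATFORM=android-28 \
  -DBUILD_SHARED_LIBS=OFF \
  -DGGML_OPENCL=ON
cmake --build build --config Release -j

# Then push the resulting binaries to the device and run them from its shell, e.g.:
# adb push build/bin /data/local/tmp/llama.cpp
```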
I am trying to run llama.cpp on a Pico 4 Ultra, which comes with a Snapdragon XR2 Gen 2. Because I am on Android, I am using Termux as my Linux environment; its llama.cpp package ships both the OpenCL and the Vulkan backend.
I have tried the Vulkan backend (invoked roughly as sketched below), but I believe I am hitting the problem described in #16881, i.e. the output is (very fast) gibberish.
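For reference, the Vulkan attempt looked roughly like this; the model path is a placeholder and the exact options are from memory:

```sh
# Rough sketch of the Vulkan run inside Termux (model path is a placeholder).
# -ngl 99 offloads all layers to the GPU backend.
llama-cli -m ~/models/model-q4_k_m.gguf -ngl 99 -p "Hello"
```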
OpenCL on the other hand does not even activate, because apparently the device does not support some extension llama.cpp needs.
This is the output of `llama-cli --list-devices --gpus`:

```
ggml_opencl: selected platform: 'clvk'
ggml_opencl: device: 'Turnip Adreno (TM) 740v3 (OpenCL 3.0 CLVK on Vulkan v1.4.328 driver 104869888)'
ggml_opencl: OpenCL driver: 3.0 CLVK on Vulkan v1.4.328 driver 104869888
ggml_opencl: vector subgroup broadcast support: false
ggml_opencl: device FP16 support: true
ggml_opencl: device does not support subgroups (cl_khr_subgroups or cl_intel_subgroups) (note that subgroups is an optional feature in OpenCL 3.0)
ggml_opencl: drop unsupported device.
ggml_opencl: device: 'llvmpipe (LLVM 21.1.5, 128 bits) (OpenCL 3.0 CLVK on Vulkan v1.4.328 driver 104869888)'
Unsupported GPU: llvmpipe (LLVM 21.1.5, 128 bits)
ggml_opencl: drop unsupported device.
load_backend: loaded OpenCL backend from /data/data/com.termux/files/usr/bin/../lib/libggml-opencl.so
load_backend: loaded CPU backend from /data/data/com.termux/files/usr/bin/../lib/libggml-cpu.so
Available devices:
```
The full output of `clinfo` is attached here:
clinfo.txt
It seems at least some subgroup operations are supported.
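For what it's worth, I judged that simply by grepping the attached dump for anything subgroup-related:

```sh
# Quick check on the attached clinfo dump: look for the subgroup extension name
# and any sub-group related device properties.
grep -iE "subgroup|sub-group" clinfo.txt
```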
I have tried disabling the subgroup check in the code and recompiling, but the check seems to be right and something really is missing: the model fails to load because some results are not computed correctly (I am not really an OpenCL expert...). I have also tried rebuilding with the compilation flag GGML_OPENCL_USE_ADRENO_KERNELS=OFF (see the sketch below) to see whether this would avoid certain kernel operations, but the result is the same.
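The rebuild I mean was along these lines; I am quoting the flags from memory, so treat them as approximate rather than exact:

```sh
# Rough sketch of the rebuild inside Termux, from a local llama.cpp checkout:
# keep the OpenCL backend but disable the Adreno-specific kernels.
cmake -B build \
  -DGGML_OPENCL=ON \
  -DGGML_OPENCL_USE_ADRENO_KERNELS=OFF
cmake --build build --config Release -j
```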
My question now is: am I running into the limits of current GPU support in Termux, i.e. the drivers are lacking, or is the chip actually incapable of performing these operations? What else could I try?