
CUDA error: invalid device function when compiling and running for AMD gfx1032 #4762

Closed
@nasawyer7

Description

Please include information about your system, the steps to reproduce the bug, and the version of llama.cpp that you are using. If possible, please provide a minimal code example that reproduces the bug.
I have an AMD Radeon RX 6700S GPU with 8 GB of VRAM. I got oobabooga (text-generation-webui) to work on this computer, but I can't get llama.cpp to work. I compiled with

make clean && make -j16 LLAMA_HIPBLAS=1 AMDGPU_TARGETS=gxf1032

and everything went fine. However, when I try to run, I first do

export HSA_OVERRIDE_GFX_VERSION=10.3.0

and then

HIP_VISIBLE_DEVICES=0 ./main -ngl 50 -m /home/lenovoubuntu/Downloads/text-generation-webui-main/models/dolphin-2.6-mistral-7b-dpo.Q4_K_M.gguf -p "Write a function in TypeScript that sums numbers"

(I set HIP_VISIBLE_DEVICES because my machine has an iGPU as well.)
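
For reference, here is the whole sequence consolidated (a sketch; it assumes the AMDGPU_TARGETS value above was meant to be gfx1032 rather than gxf1032 as typed, since gfx1032 is the ISA name for the RX 6700S):

# Build with ROCm/HIP support (assumed target spelling: gfx1032)
make clean && make -j16 LLAMA_HIPBLAS=1 AMDGPU_TARGETS=gfx1032

# Run, mapping the card to gfx1030 via the HSA override and hiding the iGPU
export HSA_OVERRIDE_GFX_VERSION=10.3.0
HIP_VISIBLE_DEVICES=0 ./main -ngl 50 \
  -m /home/lenovoubuntu/Downloads/text-generation-webui-main/models/dolphin-2.6-mistral-7b-dpo.Q4_K_M.gguf \
  -p "Write a function in TypeScript that sums numbers"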

It returns:

.................................................................................................
llama_new_context_with_model: n_ctx = 512
llama_new_context_with_model: freq_base = 10000.0
llama_new_context_with_model: freq_scale = 1
llama_kv_cache_init: VRAM kv self = 64.00 MB
llama_new_context_with_model: KV self size = 64.00 MiB, K (f16): 32.00 MiB, V (f16): 32.00 MiB
llama_build_graph: non-view tensors processed: 676/676
llama_new_context_with_model: compute buffer total size = 76.19 MiB
llama_new_context_with_model: VRAM scratch buffer: 73.00 MiB
llama_new_context_with_model: total VRAM used: 4232.06 MiB (model: 4095.06 MiB, context: 137.00 MiB)
CUDA error: invalid device function
current device: 0, in function ggml_cuda_op_flatten at ggml-cuda.cu:7971
hipGetLastError()
GGML_ASSERT: ggml-cuda.cu:226: !"CUDA error"
Could not attach to process. If your uid matches the uid of the target
process, check the setting of /proc/sys/kernel/yama/ptrace_scope, or try
again as the root user. For more details, see /etc/sysctl.d/10-ptrace.conf
ptrace: Operation not permitted.
No stack.
The program is not being run.
Aborted (core dumped)
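
As an aside, the message above also points at the Yama ptrace setting; a minimal way to check or relax it (assuming a Yama-enabled kernel such as Ubuntu's) is shown below. This only affects whether the debugger can attach to print the backtrace, not the CUDA error itself.

cat /proc/sys/kernel/yama/ptrace_scope        # 1 = restricted (Ubuntu default), 0 = unrestricted
sudo sysctl -w kernel.yama.ptrace_scope=0     # temporary; resets on reboot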

So I ran it with sudo, as the message suggested, using this command:

sudo LD_LIBRARY_PATH=/opt/rocm/lib:$LD_LIBRARY_PATH HSA_OVERRIDE_GFX_VERSION=10.3.0 HIP_VISIBLE_DEVICES=0 ./main -ngl 50 -m /home/lenovoubuntu/Downloads/text-generation-webui-main/models/dolphin-2.6-mistral-7b-dpo.Q4_K_M.gguf -p "Write a function in TypeScript that sums numbers"

I used all of those environment variables because oobabooga required them, and I was hoping they would fix things here too.

However, that just returns this after seemingly loading the model.

CUDA error: invalid device function
current device: 0, in function ggml_cuda_op_flatten at ggml-cuda.cu:7971
hipGetLastError()
GGML_ASSERT: ggml-cuda.cu:226: !"CUDA error"
[New LWP 23593]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
0x00007f34398ea42f in __GI___wait4 (pid=23599, stat_loc=0x0, options=0, usage=0x0) at ../sysdeps/unix/sysv/linux/wait4.c:30
30 ../sysdeps/unix/sysv/linux/wait4.c: No such file or directory.
#0 0x00007f34398ea42f in __GI___wait4 (pid=23599, stat_loc=0x0, options=0, usage=0x0) at ../sysdeps/unix/sysv/linux/wait4.c:30
30 in ../sysdeps/unix/sysv/linux/wait4.c
#1 0x000055fb56cca7fb in ggml_print_backtrace ()
#2 0x000055fb56d90f95 in ggml_cuda_error(char const*, char const*, char const*, int, char const*) ()
#3 0x000055fb56d9da1e in ggml_cuda_op_flatten(ggml_tensor const*, ggml_tensor const*, ggml_tensor*, void (*)(ggml_tensor const*, ggml_tensor const*, ggml_tensor*, float const*, float const*, float*, ihipStream_t*)) ()
#4 0x000055fb56d92df3 in ggml_cuda_compute_forward ()
#5 0x000055fb56cf8898 in ggml_graph_compute_thread ()
#6 0x000055fb56cfca98 in ggml_graph_compute ()
#7 0x000055fb56dbc41e in ggml_backend_cpu_graph_compute ()
#8 0x000055fb56dbcf0b in ggml_backend_graph_compute ()
#9 0x000055fb56d2b046 in llama_decode_internal(llama_context&, llama_batch) ()
#10 0x000055fb56d2bb63 in llama_decode ()
#11 0x000055fb56d66316 in llama_init_from_gpt_params(gpt_params&) ()
#12 0x000055fb56cbc31a in main ()
[Inferior 1 (process 23582) detached]
Aborted
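
For completeness, a couple of standard ROCm tools can confirm what the runtime actually sees for device 0 (a sketch; rocminfo and rocm-smi ship with ROCm, and an RX 6700S should natively report gfx1032, or gfx1030 once the HSA override is exported):

rocminfo | grep -i gfx        # lists each agent's ISA (iGPU and dGPU)
rocm-smi                      # lists the GPUs ROCm can see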
