
Segfault when compiling with make LLAMA_CUBLAS=1 #3054

Closed
4 tasks done
atopheim opened this issue Sep 7, 2023 · 12 comments
Comments

atopheim commented Sep 7, 2023

Prerequisites

Please answer the following questions for yourself before submitting an issue.

  • I am running the latest code. Development is very rapid so there are no tagged versions as of now.
  • I carefully followed the README.md.
  • I searched using keywords relevant to my issue to make sure that I am creating a new issue that is not already open (or closed).
  • I reviewed the Discussions, and have a new bug or useful enhancement to share.

Expected Behavior

I am trying to run NVIDIA-accelerated model inference, since I have a laptop with an RTX 3080 Ti. So I built using `make LLAMA_CUBLAS=1`.

Current Behavior

make LLAMA_CUBLAS=1
gdb ./main
(No debugging symbols found in ./main)
(gdb) set args -m ./models/34B/codellama-34b.Q5_K_M.gguf -p "The following are three of my mostly used scripts for automating parts of my day:"
(gdb) run
Starting program: /home/torbjorn/Documents/Github/atopheim/llama.cpp/main -m ./models/34B/codellama-34b.Q5_K_M.gguf -p "The following are three of my mostly used scripts for automating parts of my day:"
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/usr/lib/libthread_db.so.1".

Program received signal SIGSEGV, Segmentation fault.
0x00007fffceb5d661 in ?? () from /usr/lib/libc.so.6
(gdb) backtrace
#0  0x00007fffceb5d661 in ?? () from /usr/lib/libc.so.6
#1  0x00007fffcef5a8a4 in std::char_traits<char>::copy (__n=<optimized out>, __s2=0x7fffffffdd0e "./models/34B/codellama-34b.Q5_K_M.gguf", __s1=0x0) at /usr/src/debug/gcc/gcc-build/x86_64-pc-linux-gnu/libstdc++-v3/include/bits/char_traits.h:445
#2  std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::_S_copy (__n=<optimized out>, __s=0x7fffffffdd0e "./models/34B/codellama-34b.Q5_K_M.gguf", __d=0x0)
    at /usr/src/debug/gcc/gcc-build/x86_64-pc-linux-gnu/libstdc++-v3/include/bits/basic_string.h:420
#3  std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::_S_copy (__n=<optimized out>, __s=0x7fffffffdd0e "./models/34B/codellama-34b.Q5_K_M.gguf", __d=0x0)
    at /usr/src/debug/gcc/gcc-build/x86_64-pc-linux-gnu/libstdc++-v3/include/bits/basic_string.h:415
#4  std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::_M_replace (this=0x7fffffffc1b0, __pos=<optimized out>, __len1=<optimized out>, __s=0x7fffffffdd0e "./models/34B/codellama-34b.Q5_K_M.gguf",
    __len2=<optimized out>) at /usr/src/debug/gcc/gcc-build/x86_64-pc-linux-gnu/libstdc++-v3/include/bits/basic_string.tcc:537
#5  0x00005555555f892c in gpt_params_parse(int, char**, gpt_params&) ()
#6  0x0000555555564b0b in main ()

Environment and Context

Please provide detailed information about your computer setup. This is important in case the issue is not reproducible except for under certain specific conditions.

  • Physical (or virtual) hardware you are using, e.g. for Linux:

$ lscpu
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Address sizes: 48 bits physical, 48 bits virtual
Byte Order: Little Endian
CPU(s): 16
On-line CPU(s) list: 0-15
Vendor ID: AuthenticAMD
Model name: AMD Ryzen 9 5900HX with Radeon Graphics
CPU family: 25
Model: 80
Thread(s) per core: 2
Core(s) per socket: 8
Socket(s): 1
Stepping: 0
Frequency boost: enabled
CPU(s) scaling MHz: 70%
CPU max MHz: 4888,7690
CPU min MHz: 1200,0000
BogoMIPS: 6590,14
Virtualization features:
Virtualization: AMD-V
Caches (sum of all):
L1d: 256 KiB (8 instances)
L1i: 256 KiB (8 instances)
L2: 4 MiB (8 instances)
L3: 16 MiB (1 instance)
NUMA:
NUMA node(s): 1
NUMA node0 CPU(s): 0-15

$ uname -a
6.1.44-1-MANJARO #1 SMP PREEMPT_DYNAMIC Wed Aug 9 09:02:26 UTC 2023 x86_64 GNU/Linux

  • SDK version, e.g. for Linux:
$ python3 --version
Python 3.11.3
$ make --version
GNU Make 4.4.1
Built for x86_64-pc-linux-gnu
Copyright (C) 1988-2023 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <https://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
$ g++ --version
g++ (GCC) 12.3.0
Copyright (C) 2022 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
atopheim commented Sep 7, 2023

I had better luck with cmake

JohannesGaessler commented Sep 7, 2023

Compiling with `LLAMA_DEBUG=1` adds debugging symbols and will allow GDB to determine exactly which lines are causing the segfault; that would be useful information for the devs.

@KerfuffleV2

What JG said.

This is a pretty weird error; it looks like it crashed while parsing the command-line arguments, but there's nothing unusual about them. (I also can't reproduce this with identical arguments.) If you build without `LLAMA_CUBLAS=1`, do you still get the error?

atopheim commented Sep 7, 2023

Building with just `make LLAMA_DEBUG=1` works.

The following is trying the same with LLAMA_CUBLAS=1 as well, then running gdb on that.

$ rm main
$ make LLAMA_CUBLAS=1 LLAMA_DEBUG=1

I llama.cpp build info:
I UNAME_S: Linux
I UNAME_P: unknown
I UNAME_M: x86_64
I CFLAGS: -I. -Icommon -DGGML_USE_K_QUANTS -DGGML_USE_CUBLAS -I/usr/local/cuda/include -I/opt/cuda/include -I/targets/x86_64-linux/include -O3 -std=c11 -fPIC -O0 -g -Wall -Wextra -Wpedantic -Wcast-qual -Wdouble-promotion -Wshadow -Wstrict-prototypes -Wpointer-arith -Wmissing-prototypes -Werror=implicit-int -Wno-unused-function -pthread -march=native -mtune=native
I CXXFLAGS: -I. -Icommon -DGGML_USE_K_QUANTS -DGGML_USE_CUBLAS -I/usr/local/cuda/include -I/opt/cuda/include -I/targets/x86_64-linux/include -O3 -std=c++11 -fPIC -O0 -g -Wall -Wextra -Wpedantic -Wcast-qual -Wno-unused-function -Wno-multichar -Wno-format-truncation -pthread -march=native -mtune=native
I LDFLAGS: -g -lcublas -lculibos -lcudart -lcublasLt -lpthread -ldl -lrt -L/usr/local/cuda/lib64 -L/opt/cuda/lib64 -L/targets/x86_64-linux/lib
I CC: cc (GCC) 13.2.1 20230801
I CXX: g++ (GCC) 12.3.0

g++ -I. -Icommon -DGGML_USE_K_QUANTS -DGGML_USE_CUBLAS -I/usr/local/cuda/include -I/opt/cuda/include -I/targets/x86_64-linux/include -O3 -std=c++11 -fPIC -O0 -g -Wall -Wextra -Wpedantic -Wcast-qual -Wno-unused-function -Wno-multichar -Wno-format-truncation -pthread -march=native -mtune=native examples/main/main.cpp ggml.o llama.o common.o console.o grammar-parser.o k_quants.o ggml-cuda.o ggml-alloc.o -o main -g -lcublas -lculibos -lcudart -lcublasLt -lpthread -ldl -lrt -L/usr/local/cuda/lib64 -L/opt/cuda/lib64 -L/targets/x86_64-linux/lib

==== Run ./main -h for help. ====

$ gdb ./main
(gdb) set args -m ./models/34B/codellama-34b.Q5_K_M.gguf -p "The following are three of my mostly used scripts for automating parts of my day:"
(gdb) run
Starting program: /home/torbjorn/Documents/Github/atopheim/llama.cpp/main -m ./models/34B/codellama-34b.Q5_K_M.gguf -p "The following are three of my mostly used scripts for automating parts of my day:"

This GDB supports auto-downloading debuginfo from the following URLs:
https://debuginfod.archlinux.org
Enable debuginfod for this session? (y or [n]) y
Debuginfod has been enabled.
To make this setting permanent, add 'set debuginfod enabled on' to .gdbinit.
Downloading separate debug info for /opt/cuda/lib64/libcublas.so.12
Downloading separate debug info for /opt/cuda/lib64/libcudart.so.12
Downloading separate debug info for /opt/cuda/lib64/libcublasLt.so.12
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/usr/lib/libthread_db.so.1".
Downloading separate debug info for /usr/lib/librt.so.1
Downloading separate debug info for /usr/lib/libpthread.so.0
Downloading separate debug info for /usr/lib/libdl.so.2
Downloading separate debug info for /usr/lib/libcuda.so.1

Program received signal SIGSEGV, Segmentation fault.
0x00007fffceb5d661 in ?? () from /usr/lib/libc.so.6
(gdb) backtrace
#0  0x00007fffceb5d661 in ?? () from /usr/lib/libc.so.6
#1  0x00007fffcef5a8a4 in std::char_traits<char>::copy (__n=<optimized out>, __s2=0x7fffffffdc49 "./models/34B/codellama-34b.Q5_K_M.gguf", __s1=0x0) at /usr/src/debug/gcc/gcc-build/x86_64-pc-linux-gnu/libstdc++-v3/include/bits/char_traits.h:445
#2  std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::_S_copy (__n=<optimized out>, __s=0x7fffffffdc49 "./models/34B/codellama-34b.Q5_K_M.gguf", __d=0x0)
    at /usr/src/debug/gcc/gcc-build/x86_64-pc-linux-gnu/libstdc++-v3/include/bits/basic_string.h:420
#3  std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::_S_copy (__n=<optimized out>, __s=0x7fffffffdc49 "./models/34B/codellama-34b.Q5_K_M.gguf", __d=0x0)
    at /usr/src/debug/gcc/gcc-build/x86_64-pc-linux-gnu/libstdc++-v3/include/bits/basic_string.h:415
#4  std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::_M_replace (this=0x7fffffffc0c0, __pos=<optimized out>, __len1=<optimized out>, __s=0x7fffffffdc49 "./models/34B/codellama-34b.Q5_K_M.gguf", __len2=<optimized out>)
    at /usr/src/debug/gcc/gcc-build/x86_64-pc-linux-gnu/libstdc++-v3/include/bits/basic_string.tcc:537
#5  0x0000555555602c9c in gpt_params_parse(int, char**, gpt_params&) ()
#6  0x00005555555688e3 in main (argc=5, argv=0x7fffffffd778) at examples/main/main.cpp:112


KerfuffleV2 commented Sep 7, 2023

I think you need to run `make clean`, then try `make LLAMA_CUBLAS=1 LLAMA_DEBUG=1` again. Also, please mention exactly which commit/release you're on.

staviq commented Sep 7, 2023

@atopheim Does it still crash if you simply run ./main with no arguments ? ( #2922 )

cebtenzzre commented Sep 7, 2023

If you build with this command, it will provide a detailed explanation for segfaults:

LDFLAGS='-fsanitize=address' CFLAGS="$LDFLAGS -fno-omit-frame-pointer" CXXFLAGS=$CFLAGS make LLAMA_DEBUG=1 LLAMA_CUBLAS=1

I would also be interested to know if this occurs with a smaller model like LLaMA v2 7b that would be easier to test.

atopheim commented Sep 7, 2023

I have verified that running `make clean` and rebuilding actually worked.

atopheim commented Sep 7, 2023

While I had the issue, it was also occurring with 7B models.

@cebtenzzre closed this as not planned on Sep 7, 2023.
atopheim commented Sep 7, 2023

I never tried running it without arguments, @staviq

@cebtenzzre

Could you confirm that building with `make clean && make LLAMA_CUBLAS=1` (without `LLAMA_DEBUG`) also works?


JamyDon commented Sep 17, 2023

First I built with plain `make` and found that it would only run on my CPU. So I went on to build with `make LLAMA_CUBLAS=1`, and then got a segfault when running the model. Following this issue, I tried `make clean && make LLAMA_CUBLAS=1`, and it is now working pretty well.

So the segfault might be caused by leftovers of the previous build, since after a simple `make clean` it never appears again.
