Misc. bug: Inconsistent Vulkan segfault #10528

Open
@RobbyCBennett

Name and Version

library 531cb1c (gguf-v0.4.0-2819-g531cb1c2)

Operating systems

Linux

Which llama.cpp modules do you know to be affected?

No response

Problem description & steps to reproduce

  1. Compile the program below
  2. Run it a thousand times; it will probably hit a segmentation fault at least once. I ran it under the gdb debugger to capture the backtraces.

Simple program:

#include "llama.h"

// No-op log callback: discards all llama.cpp log messages
static void handleLog(enum ggml_log_level level, const char *text, void *user_data) {}

int main(int argc, char **argv)
{
  llama_log_set(handleLog, 0);

  // Load the model with default parameters, then free it right away
  char path[] = "/your-path-to/llama.cpp/models/ggml-vocab-llama-bpe.gguf";
  struct llama_model_params params = llama_model_default_params();
  struct llama_model *model = llama_load_model_from_file(path, params);
  llama_free_model(model);

  return 0;
}

Shell script to run the program several times:

#! /bin/sh

PROGRAM=llama-bug
LOG=debug.log
COUNT=1000

rm -f "$LOG"

for i in $(seq 1 "$COUNT"); do
	gdb -batch -ex run -ex bt "$PROGRAM" >> "$LOG" 2>&1
done
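To estimate how often the crash happens without paying the gdb startup cost on every run, a small driver can fork the program repeatedly and count the runs that die from SIGSEGV. This is only a sketch: `count_segfaults` and the `CRASH_COUNT_MAIN` build flag are hypothetical names, not part of llama.cpp, and it uses nothing beyond POSIX `fork`, `execvp` and `waitpid`.

```c
#include <errno.h>
#include <signal.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: runs argv[0] `runs` times and returns how many of
 * those runs were terminated by SIGSEGV, or -1 if fork/waitpid failed. */
static int count_segfaults(char *const argv[], int runs)
{
    int crashes = 0;

    for (int i = 0; i < runs; i++) {
        pid_t pid = fork();
        if (pid < 0)
            return -1;

        if (pid == 0) {
            /* Child: replace this process with the program under test. */
            execvp(argv[0], argv);
            _exit(127); /* only reached if execvp failed */
        }

        /* Parent: wait for the child, retrying if a signal interrupts us. */
        int status;
        while (waitpid(pid, &status, 0) < 0) {
            if (errno != EINTR)
                return -1;
        }

        if (WIFSIGNALED(status) && WTERMSIG(status) == SIGSEGV)
            crashes++;
    }

    return crashes;
}

#ifdef CRASH_COUNT_MAIN
/* Standalone driver, built only with -DCRASH_COUNT_MAIN. */
int main(int argc, char **argv)
{
    if (argc < 2) {
        fprintf(stderr, "usage: %s program [args...]\n", argv[0]);
        return 2;
    }

    int runs = 1000; /* same count as the shell script */
    int crashes = count_segfaults(&argv[1], runs);
    if (crashes < 0) {
        perror("count_segfaults");
        return 1;
    }

    printf("%d of %d runs ended with SIGSEGV\n", crashes, runs);
    return 0;
}
#endif
```

Built with `-DCRASH_COUNT_MAIN`, it could be invoked as `./crash-count ./llama-bug` (both file names are placeholders). It reports only a crash rate; the gdb loop is still needed to get backtraces.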

First Bad Commit

No response

Relevant log output

GDB output from crash caused by /lib/x86_64-linux-gnu/libnvidia-eglcore.so.535.183.01

Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
ggml_vulkan: Compiling shaders..............................Done!

Thread 3 "[vkrt] Analysis" received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7fffe35a8640 (LWP 1789333)]
0x00007fffeff1cb00 in ?? () from /lib/x86_64-linux-gnu/libnvidia-eglcore.so.535.183.01
#0  0x00007fffeff1cb00 in ?? () from /lib/x86_64-linux-gnu/libnvidia-eglcore.so.535.183.01
#1  0x00007ffff0246f1d in ?? () from /lib/x86_64-linux-gnu/libnvidia-eglcore.so.535.183.01
#2  0x00007fffeff1fcfa in ?? () from /lib/x86_64-linux-gnu/libnvidia-eglcore.so.535.183.01
#3  0x00007ffff7a1dac3 in start_thread (arg=<optimized out>) at ./nptl/pthread_create.c:442
#4  0x00007ffff7aaf850 in clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:81

GDB output from crash with unknown cause

Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
ggml_vulkan: Compiling shaders..............................Done!

Thread 3 "[vkrt] Analysis" received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7fffe35a8640 (LWP 1750868)]
0x00007fffeff1cb00 in ?? ()
#0  0x00007fffeff1cb00 in ?? ()
#1  0x000000006746139a in ?? ()
#2  0x0000000002a1b0d8 in ?? ()
#3  0x0000000067461399 in ?? ()
#4  0x00000000000e6817 in ?? ()
#5  0x00005555561076c0 in ?? ()
#6  0x00007fffeff1ef10 in ?? ()
#7  0x0000000000000000 in ?? ()
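Both backtraces above report the same frame #0 address (0x00007fffeff1cb00) in the same thread ("[vkrt] Analysis"), so the two logs may be the same crash seen with and without a resolvable library mapping. That is an inference; the logs do not prove it. To check whether this holds across all runs in debug.log, a small filter can pull out every frame #0 line with its address and library. The sketch below assumes the gdb output format shown above; `parse_frame0` and the `FRAME0_MAIN` build flag are hypothetical names.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: if `line` is a gdb frame "#0  0x<addr> in ..." line,
 * stores the address in *addr, copies the path after " from " into lib
 * (empty string if absent) and returns 1. Returns 0 for any other line.
 * lib_size must be at least 1. */
static int parse_frame0(const char *line, unsigned long long *addr,
                        char *lib, size_t lib_size)
{
    if (strncmp(line, "#0", 2) != 0)
        return 0;

    /* Require whitespace after "#0", then skip it. */
    const char *p = line + 2;
    if (*p != ' ' && *p != '\t')
        return 0;
    while (*p == ' ' || *p == '\t')
        p++;

    /* Only frames that start with a hexadecimal address are handled. */
    if (strncmp(p, "0x", 2) != 0)
        return 0;

    char *end;
    *addr = strtoull(p, &end, 16);

    /* The library path, when gdb could resolve one, follows " from ". */
    lib[0] = '\0';
    const char *from = strstr(end, " from ");
    if (from != NULL) {
        from += strlen(" from ");
        size_t n = strcspn(from, "\r\n");
        if (n >= lib_size)
            n = lib_size - 1;
        memcpy(lib, from, n);
        lib[n] = '\0';
    }

    return 1;
}

#ifdef FRAME0_MAIN
/* Standalone filter, built only with -DFRAME0_MAIN: reads a gdb log on
 * stdin and prints one "address library" line per frame #0 found. */
int main(void)
{
    char line[1024];
    char lib[512];
    unsigned long long addr;

    while (fgets(line, sizeof line, stdin) != NULL) {
        if (parse_frame0(line, &addr, lib, sizeof lib))
            printf("0x%016llx %s\n", addr, lib[0] ? lib : "(no library)");
    }
    return 0;
}
#endif
```

Built with `-DFRAME0_MAIN`, the output can be piped through `sort | uniq -c` to see how many distinct crash sites the 1000 runs produced.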

Labels

bug
