Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Vulkan frontend breaking and causing garbled output after around 500 tokens, and causing GPU to crash. #5179

Closed
gnawzie opened this issue Jan 29, 2024 · 3 comments · Fixed by #5223
Assignees
Labels
bug Something isn't working

Comments

@gnawzie
Copy link

gnawzie commented Jan 29, 2024

AMD 6800xt GPU, 32GB system RAM.
Arch Linux

Vulkan works fast for the good text it outputs, but results in garbled output past some number of tokens.

./server --n-gpu-layers 46 --model models/LLaMA2-13B-Psyfighter2.Q5_K_M.gguf

Here is a sample of text near the point of failure.

We have faced many trials in our past: wars fought for principles that seemed long forgotten; struggles against oppression at home and abroad; movements for civil rights that shook the very foundations of society. Through it all, we emerged stronger and more unified than ever before, for we knew deep down inside that together, there was nothing we could not achieve.

And so, my friends, as we stand on the precipice of a new era; an age defined by technological advancement, global interconnectedness, and unprecedented social progressadvраб operationitelkeltc ScherobatiotviksochitelockUBtc/(ховgenceariestcwigotaetwork Hopété Nasleratimer(&лет#

It also caused my GPU to crash on a previous boot with these error messages.

amdgpu 0000:0b:00.0: amdgpu: [gfxhub] page fault (src_id:0 ring:24 vmid:5 pasid:3
2773, for process server pid 27706 thread server pid 27706)
kernel: amdgpu 0000:0b:00.0: amdgpu:   in page starting at address 0x0000000000001000 fro
m client 0x1b (UTCL2)
kernel: amdgpu 0000:0b:00.0: amdgpu: GCVM_L2_PROTECTION_FAULT_STATUS:0x00501E31
kernel: amdgpu 0000:0b:00.0: amdgpu:          Faulty UTCL2 client ID: GCR (0xf)
kernel: amdgpu 0000:0b:00.0: amdgpu:          MORE_FAULTS: 0x1
kernel: amdgpu 0000:0b:00.0: amdgpu:          WALKER_ERROR: 0x0
kernel: amdgpu 0000:0b:00.0: amdgpu:          PERMISSION_FAULTS: 0x3
kernel: amdgpu 0000:0b:00.0: amdgpu:          MAPPING_ERROR: 0x0

I hope this helps. Thanks!

@LostRuins
Copy link
Collaborator

I have a feeling that it is triggered after a context shift takes place llama_kv_cache_seq_rm + llama_kv_cache_seq_shift . Mine always seems to segfault around that point. Does not happen if 0 GPU layers are offloaded.

@teleprint-me
Copy link
Contributor

Which precision are you using? E.g. f16, q8_0, etc?

@0cc4m 0cc4m self-assigned this Jan 29, 2024
@0cc4m 0cc4m added bug Something isn't working and removed bug-unconfirmed labels Jan 29, 2024
@0cc4m
Copy link
Collaborator

0cc4m commented Jan 29, 2024

I can reproduce this. I'll try to debug and fix it.

Edit: I can reproduce the garbled output and a segfault, the GPU crash is your driver's fault.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants