
Is it normal that ROCm+HIPBLAS produces different results than on CPU or breaks completely? #6841

Closed
@ghost

Description

Hello. I ran some perplexity tests while investigating an issue with Llama-3. The original problem was that the Llama-3 base 70B model outputs garbage with small imatrix quants. I can't find any information on whether ROCm could be causing the corruption.

GPU test (RX 7600 + RX 7600 XT)

https://huggingface.co/mradermacher/Meta-Llama-3-70B-i1-GGUF/tree/main
Meta-Llama-3-70B.i1-Q2_K.gguf prints [1]-nan,[2]-nan,[3]-nan,[4]-nan with -ngl 30 or 0 (prints garbage unless -ngl 0)
https://huggingface.co/mradermacher/Meta-Llama-3-70B-GGUF/tree/main
Meta-Llama-3-70B.Q2_K.gguf - seems OK, [1]4.1839,[2]4.7300,[3]4.2751,[4]4.6444,[5]4.6942,[6]5.0426,[7]5.1405,[8]5.4747
Final estimate: PPL = 5.9315 +/- 0.03553

Pure CPU test

Meta-Llama-3-70B.i1-Q2_K.gguf with a pure CPU 'perplexity' build (146 seconds per 512 tokens - ETA 26 hours 55.67 minutes)
[1]6.3962,[2]7.1886,[3]6.9886,[4]7.3853,[5]7.8924,[6]8.2982,[7]8.8956,[8]9.3799, (stopped early, couldn't wait that many hours)
Meta-Llama-3-70B.Q2_K.gguf (static Q2_K):
[1]4.1675,[2]4.6952,[3]4.2374,[4]4.6452,[5]4.6677,[6]5.0459,[7]5.1258,[8]5.4649,^C
The static Q2_K values on CPU are slightly better than on ROCm, but the difference is very small.
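To put a number on "very small", here is a minimal Python sketch that parses the per-chunk output pasted above and compares the static Q2_K runs chunk by chunk. The helper names (`parse_chunks`, `has_nan`, `rel_diffs`) are made up for this illustration and are not part of llama.cpp; the regex just assumes the `[n]value,` format shown in this issue.

```python
import math
import re

# Matches entries such as "[3]4.2751" or "[1]-nan" in the perplexity output.
CHUNK_RE = re.compile(r"\[\d+\](-?nan|-?\d+\.\d+)")

def parse_chunks(line):
    """Return the per-chunk perplexity values as floats (nan is kept as nan)."""
    return [float(v) for v in CHUNK_RE.findall(line)]

def has_nan(values):
    """True if any chunk is nan, i.e. the run is broken rather than just noisy."""
    return any(math.isnan(v) for v in values)

def rel_diffs(gpu, cpu):
    """Relative difference (gpu - cpu) / cpu for each chunk both runs share."""
    return [(g - c) / c for g, c in zip(gpu, cpu)]

# Values copied from the runs above (static Q2_K, first 8 chunks).
gpu_static = parse_chunks(
    "[1]4.1839,[2]4.7300,[3]4.2751,[4]4.6444,[5]4.6942,[6]5.0426,[7]5.1405,[8]5.4747"
)
cpu_static = parse_chunks(
    "[1]4.1675,[2]4.6952,[3]4.2374,[4]4.6452,[5]4.6677,[6]5.0459,[7]5.1258,[8]5.4649,^C"
)
# The imatrix quant on ROCm, which only produced nan.
gpu_imatrix = parse_chunks("[1]-nan,[2]-nan,[3]-nan,[4]-nan")

if __name__ == "__main__":
    print("imatrix run on ROCm broken:", has_nan(gpu_imatrix))
    for i, d in enumerate(rel_diffs(gpu_static, cpu_static), start=1):
        print(f"chunk {i}: {d * 100:+.2f}%")
```

For the static quant, every one of the first 8 chunks differs by less than 1% between ROCm and CPU, while the imatrix quant on ROCm fails outright with nan instead of drifting.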

I also found strange holes in the imatrix.dat that was used:
[Screenshot from 2024-04-23 13-34-09]
But the author seems uninterested in discussing that.
