Description
Try this query: "What is 3333+777?"
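(For reference, the correct answer is 3333 + 777 = 4110.)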
Yes, yes, LLMs are bad at math. That's not what I'm getting at. Someone mentioned this on Reddit, and I'm seeing the same weird behavior myself.
Let's get a baseline. Here is what meta.ai yields:
This is likely running on Llama 3 70B.
Here is what Groq yields:
and at 8B:
Now, here's where things get weird. Using Open WebUI on top of Ollama (which runs llama.cpp under the hood), let's try the GGUF quantizations of Llama 3.
First, 8B at fp16:
Then 8B at Q8_0:
Then 70B at Q4_0:
I think the problem should be clear. All of the non-llama.cpp instances, which were not using GGUFs, did the math problem correctly. All of the llama.cpp instances got the problem wrong in exactly the same way: each one adds an extra digit to the problem and then answers the mangled problem incorrectly. This is extremely repeatable on both ends: I have never seen the cloud instances make this mistake, and I have never seen the llama.cpp instances fail to make it.
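If you want to reproduce this without Open WebUI, here's a rough repro sketch against Ollama's `/api/generate` endpoint. It assumes Ollama is listening on its default port (11434), and the model tags are just examples; yours may differ depending on what you pulled.

```python
# Rough repro: ask each local quantization the same question at temperature 0.
# Assumes Ollama on its default port; the model tags below are examples and
# may not match what you have pulled locally.
import json
import urllib.request

MODELS = [
    "llama3:8b-instruct-fp16",
    "llama3:8b-instruct-q8_0",
    "llama3:70b-instruct-q4_0",
]

def ask(model: str, prompt: str) -> str:
    body = json.dumps({
        "model": model,
        "prompt": prompt,
        "stream": False,
        "options": {"temperature": 0},  # greedy decoding, so runs are repeatable
    }).encode("utf-8")
    req = urllib.request.Request(
        "http://localhost:11434/api/generate",
        data=body,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]

for model in MODELS:
    print(model, "->", ask(model, "What is 3333+777?").strip())
```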
To me, it appears that something is degrading the accuracy of Llama 3 when run under llama.cpp.
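One way to narrow this down would be to check whether the GGUF tokenizer even splits the digits the way the reference Llama 3 tokenizer does (as far as I know, it groups digit runs into chunks of at most three). If the prompt tokenizes differently, the model is literally being asked a different question, which would line up with the "extra digit" behavior. Here's a sketch using llama-cpp-python; the model path is a placeholder, so point it at whichever GGUF you're running:

```python
# Tokenization sanity check. The model path is a placeholder; use the same
# GGUF that produced the wrong answers. vocab_only skips loading the weights.
from llama_cpp import Llama

llm = Llama(model_path="Meta-Llama-3-8B-Instruct.fp16.gguf", vocab_only=True)

prompt = "What is 3333+777?"
tokens = llm.tokenize(prompt.encode("utf-8"), add_bos=False)
print(tokens)
# Show each token as text so the digit grouping is visible.
print([llm.detokenize([t]).decode("utf-8", errors="replace") for t in tokens])
```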
Any ideas of what's going wrong here?