
Something might be wrong with either llama.cpp or the Llama 3 GGUFs #6914

Closed
@coder543


Try this query: "What is 3333+777?" (The correct answer is 4110.)

Yes, yes, LLMs are bad at math. That's not what I'm getting at. Someone mentioned this on Reddit, and I'm seeing the same weird behavior.

Let's get a baseline. Here is what meta.ai yields:

[screenshot: meta.ai answers correctly]

This is likely running on Llama 3 70B.

Here is what Groq yields:

[screenshot: Groq answers correctly]

and at 8B:

[screenshot: Groq 8B answers correctly]

Now, here's where things get weird. Using Open WebUI on top of Ollama (which runs llama.cpp under the hood), let's run the Llama 3 GGUFs.

First, 8B at fp16:

[screenshot: 8B fp16 gets it wrong]

Then 8B at Q8_0:

[screenshot: 8B Q8_0 gets it wrong in the same way]

Then 70B at Q4_0:

[screenshot: 70B Q4_0 gets it wrong in the same way]

I think the problem should be clear. All of the non-llama.cpp instances, which were not using GGUFs, did the math problem correctly. All of the llama.cpp instances got the problem wrong in exactly the same way. This is extremely repeatable on both ends: I have never seen the cloud instances make this mistake, and I have never seen the llama.cpp instances do anything other than make this exact mistake of adding an extra digit to the problem and then getting it wrong.
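For anyone who wants to reproduce this without going through Open WebUI, here is a minimal sketch against the Ollama HTTP API. It assumes a local Ollama instance on the default port with a pulled `llama3` tag; adjust the model name to whichever quantization you're testing.

```python
# Minimal repro sketch against a local Ollama instance (assumes the default
# endpoint and a pulled "llama3" model tag -- adjust to your setup).
import requests

resp = requests.post(
    "http://localhost:11434/api/generate",
    json={
        "model": "llama3",              # e.g. an fp16 / Q8_0 / Q4_0 tag
        "prompt": "What is 3333+777?",
        "stream": False,
        "options": {"temperature": 0},  # reduce sampling noise
    },
    timeout=120,
)
resp.raise_for_status()
print(resp.json()["response"])  # expect 4110; I consistently get a wrong answer
```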

To me, it appears that something is degrading the accuracy of Llama 3 when run under llama.cpp.

Any ideas of what's going wrong here?
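One thing that might be worth checking (purely a guess on my part): whether the prompt tokenizes the same way under llama.cpp as under Meta's reference tokenizer, since the "extra digit" behavior looks like it could be a token-boundary issue. Here is a sketch of the reference side, assuming access to the gated meta-llama/Meta-Llama-3-8B-Instruct repo on Hugging Face; the output could then be compared against llama.cpp's tokenization (e.g. its `tokenize` example, if your build includes it).

```python
# Print the reference token IDs for the prompt so they can be compared
# against llama.cpp's tokenization of the same string.
# Assumes access to the gated meta-llama/Meta-Llama-3-8B-Instruct HF repo.
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("meta-llama/Meta-Llama-3-8B-Instruct")
ids = tok.encode("What is 3333+777?", add_special_tokens=False)
print(ids)
print([tok.decode([i]) for i in ids])  # show how the digits get grouped
```

If the ID sequences differ, that would point at the GGUF conversion or llama.cpp's tokenizer rather than the weights themselves.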
