
Impact of bf16 on Llama 3 8B perplexity? #7148

Closed
jim-plus opened this issue May 8, 2024 · 3 comments
Labels
enhancement (New feature or request), stale

Comments


jim-plus commented May 8, 2024

Prerequisites

Please answer the following questions for yourself before submitting an issue.

  • I am running the latest code. Development is very rapid so there are no tagged versions as of now.
  • I carefully followed the README.md.
  • I searched using keywords relevant to my issue to make sure that I am creating a new issue that is not already open (or closed).
  • I reviewed the Discussions, and have a new bug or useful enhancement to share.

Feature Description

The Llama 3 8B scoreboard at the following link was computed against fp16: https://github.com/ggerganov/llama.cpp/tree/master/examples/perplexity
However, the model was released as bf16 weights. Is there a quantifiable negative impact on perplexity from converting between weight formats? Or a difference when comparing perplexity against bf16 instead of fp16? It's unclear, and even a brief mention of this in the documentation would bring clarity.
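
For illustration, a minimal sketch (assuming PyTorch; the random tensor is a stand-in, not an actual Llama 3 weight matrix) of how one could quantify what the bf16 → fp16 cast alone does to the weights, independent of any later quantization:

```python
# Hedged sketch: measure the effect of casting bf16 weights to fp16.
# The tensor below is a random stand-in, not a real Llama 3 weight.
import torch

w_bf16 = torch.randn(4096, 4096, dtype=torch.bfloat16)

w_fp16 = w_bf16.to(torch.float16)        # the cast done when writing an fp16 GGUF
round_trip = w_fp16.to(torch.bfloat16)   # back to bf16 for an apples-to-apples diff

diff = (w_bf16.float() - round_trip.float()).abs()
overflow = torch.isinf(w_fp16) & ~torch.isinf(w_bf16)

print(f"max |bf16 - fp16(bf16)|           : {diff.max().item():.3e}")
print(f"weights overflowing fp16 (>65504) : {overflow.sum().item()}")
```

For weights of typical magnitude the cast is lossless: a bf16 value inside fp16's normal range converts exactly, since fp16 has more mantissa bits. Loss only appears at the extremes, i.e. magnitudes above fp16's maximum finite value (65504) or small enough to fall into fp16's subnormal range.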

Motivation

Curiosity about the impact of bf16 versus fp16 on models, and subsequent training/merging.

Possible Implementation

If you have an idea as to how it can be implemented, please write a detailed description. Feel free to give links to external sources or share visuals that might be helpful to understand the details better.

jim-plus added the enhancement label on May 8, 2024
@JohannesGaessler (Collaborator) commented

I did not explicitly check the effect of FP16/BF16 as an intermediary, but when using them directly I found essentially no relevant differences: #7150.
And because the FP16 vs. BF16 differences appear to be much smaller than even the FP16 vs. q8_0 differences, I think it's safe to just use FP16 even if the original weights are BF16.
Note: if the original weights contain values larger than the maximum representable FP16 value, that could potentially cause issues, but you would run into those anyway once you do the conversion to the final quant format.
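
As an illustration of that caveat, a minimal sketch (PyTorch assumed): bf16 shares float32's 8-bit exponent, so it can hold magnitudes far beyond fp16's largest finite value (65504), and casting such a value to fp16 overflows to inf:

```python
# Hedged sketch: bf16 values above fp16's maximum finite value (65504)
# become inf when cast to fp16; values within range are preserved.
import torch

vals = torch.tensor([1.0e5, 6.0e4, 1.0e-3], dtype=torch.bfloat16)
print(vals.to(torch.float16))   # only the first element overflows to inf
```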


arnfaldur commented May 14, 2024

https://github.com/ggerganov/llama.cpp/tree/master/examples/perplexity#llama-3-bf16-vs-fp16-comparison

The results have been added there, based on the work done in #7150.

This issue can be closed.

github-actions bot added the stale label on Jun 14, 2024

This issue was closed because it has been inactive for 14 days since being marked as stale.
