
Revert LLAMA_NATIVE to OFF in flake.nix #5066

Merged
merged 1 commit into ggerganov:master on Jan 21, 2024

Conversation

@iSma (Contributor) commented Jan 21, 2024

I've noticed that since PR #4605, performance (CPU-only) took a massive dive when using the Nix flake (I went from ~4 tokens/s to <0.5). It seems that the slowdown is caused by LLAMA_NATIVE=ON. Reverting to OFF (as it was before the PR) restores the expected performance.

This regression was observed on both an i7-1165G7 and a Ryzen 3800X running NixOS.

FWIW, the llama-cpp package in nixpkgs has LLAMA_NATIVE=OFF.

I'm not sure what the implications of turning off LLAMA_NATIVE are; maybe @philiptaron and @SomeoneSerge want to chime in.
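For reference, the fix amounts to flipping a single CMake flag in the Nix package expression. Below is a minimal sketch of what that part of the packaging looks like; the attribute names and layout are illustrative assumptions, not the actual flake.nix contents:

```nix
# Minimal sketch of the relevant piece of a llama.cpp Nix package
# expression; structure and names are illustrative, not the real flake.nix.
{ stdenv, cmake }:

stdenv.mkDerivation {
  pname = "llama-cpp";
  version = "unstable";
  src = ./.;

  nativeBuildInputs = [ cmake ];

  cmakeFlags = [
    # Reverts the default introduced in #4605: with -march=native enabled,
    # CPU-only inference in the Nix build dropped from ~4 tok/s to under
    # 0.5 tok/s on the machines reported above.
    "-DLLAMA_NATIVE=OFF"
  ];
}
```

Setting the flag to OFF also matches how the llama-cpp package in nixpkgs is built.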

@SomeoneSerge (Collaborator) commented Jan 21, 2024

```cmake
option(LLAMA_NATIVE "llama: enable -march=native flag" ON)
```

Oh yes, we would surely prefer that OFF. Ideally, we never resort to -march=native (which produces non-reproducible output that depends on the builder's scheduler, load, and hardware), but instead model concrete targets or concrete architecture levels as part of the derivation.
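As a rough illustration of what modeling concrete architecture levels as part of the derivation could look like (the variant names and the x86-64-v3 target below are assumptions for the sketch, not existing flake outputs):

```nix
# Illustrative sketch only: pin an explicit micro-architecture level per
# package variant instead of relying on -march=native.
{ lib, stdenv, cmake }:

let
  mkLlamaCpp = { march ? null }:
    stdenv.mkDerivation {
      pname = "llama-cpp" + lib.optionalString (march != null) "-${march}";
      version = "unstable";
      src = ./.;

      nativeBuildInputs = [ cmake ];
      cmakeFlags = [ "-DLLAMA_NATIVE=OFF" ];

      # Every builder produces the same binary for a given target level,
      # unlike -march=native, whose output depends on the build machine.
      NIX_CFLAGS_COMPILE =
        lib.optionalString (march != null) "-march=${march}";
    };
in
{
  llama-cpp           = mkLlamaCpp { };                      # portable baseline
  llama-cpp-x86-64-v3 = mkLlamaCpp { march = "x86-64-v3"; }; # fixed ISA level
}
```

Fixed levels such as x86-64-v2 or x86-64-v3 keep the build reproducible across builders while still letting an optimized variant be offered explicitly.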

@SomeoneSerge added the nix label (Issues specific to consuming flake.nix, or generally concerned with ❄ Nix-based llama.cpp deployment) on Jan 21, 2024
@SomeoneSerge merged commit 504dc37 into ggerganov:master on Jan 21, 2024
16 checks passed
@philiptaron (Collaborator) left a comment

Dang, that's my fault in doing the transcription. Call it an "off by on" error 😅 . Thanks for the PR; LGTM.

jordankanter pushed a commit to jordankanter/llama.cpp that referenced this pull request Feb 3, 2024
hodlen pushed a commit to hodlen/llama.cpp that referenced this pull request Apr 1, 2024