Revert LLAMA_NATIVE to OFF in flake.nix #5066
Merged
I've noticed that since PR #4605, performance (CPU-only) took a massive dive when using the Nix flake (I went from ~4 tokens/s to <0.5). It seems that the slowdown is caused by `LLAMA_NATIVE=ON`. Reverting it to `OFF` (as it was before that PR) restores the expected performance. This regression was observed on both an i7-1165G7 and a Ryzen 3800X running NixOS.
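For reference, here is a minimal sketch of what such a revert could look like in a Nix package definition. The file layout, attribute names, and helpers used by the actual llama.cpp flake may differ; this is an illustrative assumption, not the real diff:

```nix
# Hypothetical excerpt of a package definition imported by flake.nix;
# the real llama.cpp flake may be structured differently.
{ lib, stdenv, cmake }:

stdenv.mkDerivation {
  pname = "llama-cpp";
  version = "unstable";
  src = ./.;

  nativeBuildInputs = [ cmake ];

  cmakeFlags = [
    # Revert to OFF: LLAMA_NATIVE=ON compiles with -march=native, tuned to
    # the build host; OFF keeps the binary portable and the Nix build
    # reproducible across machines.
    (lib.cmakeBool "LLAMA_NATIVE" false)
  ];
}
```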
FWIW, the llama-cpp package in nixpkgs has `LLAMA_NATIVE=OFF`. I'm not sure what the implications of turning off `LLAMA_NATIVE` are; maybe @philiptaron and @SomeoneSerge want to chime in.