Support Adept Persimmon 8B #3410
Conversation
Let's resolve the CI failures and merge.
Force-pushed …kravtsov/support-adept-persimmon-8b (ggml-ci) from 92acb44 to 5d259d3
The switches in …
@phillip-kravtsov PTAL at @slaren's comment and fix as necessary.
I got tired of seeing the compiler warning and created #3535 (not sure if there are any other issues, haven't had a chance to test it yet).
Thanks for the fix @KerfuffleV2 -- that PR should be sufficient.
…example

* 'master' of github.com:ggerganov/llama.cpp:
  py : change version of numpy requirement to 1.24.4 (ggerganov#3515)
  quantize : fail fast on write errors (ggerganov#3521)
  metal : support default.metallib load & reuse code for swift package (ggerganov#3522)
  llm : support Adept Persimmon 8B (ggerganov#3410)
  Fix for ggerganov#3454 (ggerganov#3455)
  readme : update models, cuda + ppl instructions (ggerganov#3510)
  server : docs fix default values and add n_probs (ggerganov#3506)
To support partial RoPE and squared ReLU, this PR adds concat and square kernels for Metal.
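For reference, here is a minimal NumPy sketch of what these two ops compute. This is illustrative only, not the PR's Metal code: the function names, the NeoX-style half-split pairing, and the `base` default are assumptions.

```python
import numpy as np

def squared_relu(x: np.ndarray) -> np.ndarray:
    # Squared ReLU: relu(x)**2, Persimmon's MLP activation.
    return np.square(np.maximum(x, 0.0))

def partial_rope(x: np.ndarray, n_rot: int, pos: int, base: float = 10000.0) -> np.ndarray:
    # Rotate only the first n_rot dims of each head and concatenate the
    # remaining dims back on unchanged; that final concat is why the Metal
    # backend needs a concat kernel here. Assumes NeoX-style half-split pairing.
    assert n_rot <= x.shape[-1] and n_rot % 2 == 0
    rot, passthrough = x[..., :n_rot], x[..., n_rot:]
    half = n_rot // 2
    inv_freq = base ** (-2.0 * np.arange(half) / n_rot)  # (half,)
    theta = pos * inv_freq
    cos, sin = np.cos(theta), np.sin(theta)
    x1, x2 = rot[..., :half], rot[..., half:]
    rotated = np.concatenate([x1 * cos - x2 * sin,
                              x1 * sin + x2 * cos], axis=-1)
    return np.concatenate([rotated, passthrough], axis=-1)
```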
I've confirmed agreement between the GGML and HF implementations up to the tensor values in the last layer.
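A sketch of the kind of parity check described, assuming intermediate activations have been dumped from both implementations; the function name and tolerances are placeholders, not from the PR.

```python
import numpy as np

def check_agreement(ggml_out: np.ndarray, hf_out: np.ndarray,
                    rtol: float = 1e-3, atol: float = 1e-3) -> None:
    # Report the worst elementwise deviation, then fail loudly on mismatch.
    diff = np.abs(ggml_out.astype(np.float64) - hf_out.astype(np.float64))
    print(f"max abs diff: {diff.max():.6e}")
    assert np.allclose(ggml_out, hf_out, rtol=rtol, atol=atol), \
        "GGML and HF tensors disagree beyond tolerance"
```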