Releases: teleprint-me/llama.cpp

b4033

05 Nov 01:35
a9e8a9a
ggml : fix arch check in bf16_to_fp32 (#10164)

b4020

04 Nov 07:57
9f40989
ggml : move CPU backend to a separate file (#10144)

b3995

30 Oct 22:38
61408e7
kompute: add backend registry / device interfaces (#10045)

Get in line with the other backends by supporting the newer
backend/device registry interfaces.

Signed-off-by: Sergio Lopez <slp@redhat.com>

b3987

28 Oct 19:30
61715d5
llama : Add IBM granite template (#10013)

* Add granite template to llama.cpp

* Add granite template to test-chat-template.cpp

* Update src/llama.cpp

Co-authored-by: Xuan Son Nguyen <thichthat@gmail.com>

* Update tests/test-chat-template.cpp

Co-authored-by: Xuan Son Nguyen <thichthat@gmail.com>

* Added proper template and expected output

* Small change to \n

* Add code space &

Co-authored-by: Xuan Son Nguyen <thichthat@gmail.com>

* Fix spacing

* Apply suggestions from code review

* Update src/llama.cpp

---------

Co-authored-by: Xuan Son Nguyen <thichthat@gmail.com>

b3982

26 Oct 23:48
cc2983d
sync : ggml

b3974

24 Oct 21:35
958367b
server : refactor slot input data, move tokenizer to HTTP thread (#10…

b3970

24 Oct 05:51
0a1c750
server : samplers accept the prompt correctly (#10019)

b3943

19 Oct 21:28
cda0e4b
llama : remove all_pos_0, all_pos_1, all_seq_id from llama_batch (#9745)

* refactor llama_batch_get_one

* adapt all examples

* fix simple.cpp

* fix llama_bench

* fix

* fix context shifting

* free batch before return

* use common_batch_add, reuse llama_batch in loop

* null terminated seq_id list

* fix save-load-state example

* fix perplexity

* correct token pos in llama_batch_allocr

b3934

17 Oct 08:29
3752217
readme : update bindings list (#9918)

Co-authored-by: Tim Wang <tim.wang@ing.com>

b3922

15 Oct 19:54
755a9b2
llama : add infill sampler (#9896)

ggml-ci