
Releases: ngxson/llama.cpp

b4027

04 Nov 13:38
ea02c75
cuda : clear error after changing peer access (#10153)

b4024

04 Nov 12:54
329ed91
CANN: adjust backend registry refactor. (#10158)

Remove buffer->iface.get_name, which CANN still used after it was removed in the backend registry refactor PR.

b4023

04 Nov 10:33
ce027ad
sync : ggml

b4020

03 Nov 20:35
9f40989
ggml : move CPU backend to a separate file (#10144)

b4019

03 Nov 14:32
08828a6
metal : minor fixup in FA kernel (#10143)

* metal : minor fixup in FA kernel

ggml-ci

* metal : use the unrolled loop variable

* metal : remove unused var

b4016

02 Nov 18:34
42cadc7
server : fix slot selection by lru (#10126)

* server : fix slot selection by lru, migrate lcs to `size_t`

* minor debug log fix

b4014

02 Nov 14:36
1926d6e
llama : adjust default context size + print warnings (#10136)

* llama : adjust default context size + print warnings

ggml-ci

* ggml-ci : add missing gpu-layers + adjust context sizes

b4013

02 Nov 13:24
b634f8a
simple-chat : only add bos on first prompt (#10129)

b4011

02 Nov 00:38
a6744e4
llama : add simple-chat example (#10124)

* llama : add simple-chat example

---------

Co-authored-by: Xuan Son Nguyen <thichthat@gmail.com>

b4009

01 Nov 20:35
418f5ee
vulkan : improve ggml_vk_create_buffer error handling (#9898)