Releases: ggerganov/llama.cpp

b2463

19 Mar 10:09
b80cf3b
common : disable repeat penalties by default (#6127)

b2462

19 Mar 09:29
970a480
ci : exempt some labels from being tagged as stale (#6140)

b2461

19 Mar 07:02
4c28b82
common : print usage on '-h' and '--help' (#6145)

b2459

18 Mar 20:30
d199ca7
mpt : implement backwards compatibility with duped output tensor (#6139)

b2458

18 Mar 19:37
104f5e0
clip : fix memory leak (#6138)

b2457

18 Mar 17:05
5e1b7f9
backend : set max split inputs to GGML_MAX_SRC (#6137)

b2456

18 Mar 12:56
ac9ee6a
ci : disable stale issue messages (#6126)

b2455

18 Mar 12:44
4f6d133
ci : temporarily disable sanitizer builds (#6128)

b2454

18 Mar 10:55
2bf8d0f
backend : offload large batches to GPU (#6083)

* backend : offload large batches to GPU

* fix hip

* code cleanup

* fix CUDA split buffers

* Update ggml-backend-impl.h

Co-authored-by: Johannes Gäßler <johannesg@5d6.de>

* cuda : fix memset without set_device

* imatrix : remove sched affix from weight names

* sched : add a new split if the current one has too many inputs
reduce max inputs per split
more cleanup

* update backends

ggml-ci

---------

Co-authored-by: Johannes Gäßler <johannesg@5d6.de>

b2453

18 Mar 10:10
496bc79
common : tidy-up argument parsing (#6105)

* Tidy-up argument parsing.

* Missing ref.

* common : minor

* common : add static classifier

---------

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>