Force FP32 compute in GLM4 FFN Down (ggml-org#13101)
* Force FP32 compute in cuBLAS GEMM
* Revert "Force FP32 compute in cuBLAS GEMM" (reverts commit 6efd872)
* Force F32 compute in GLM4 ffn down
* Edit comment to clarify issue

Co-authored-by: Johannes Gäßler <johannesg@5d6.de>
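The fix pins a single matmul to full-precision accumulation instead of changing cuBLAS GEMM precision globally (the reverted first attempt). A minimal sketch of the pattern using ggml's per-op precision API, `ggml_mul_mat_set_prec`; the helper name and the weight/activation variables are illustrative, not the actual patch:

```cpp
#include "ggml.h"

// Sketch: force F32 accumulation on one matmul only. `build_ffn_down` and
// its arguments are illustrative names, not the actual llama.cpp code.
static struct ggml_tensor * build_ffn_down(
        struct ggml_context * ctx0,
        struct ggml_tensor  * ffn_down_w,  // down-projection weight
        struct ggml_tensor  * cur) {       // activations after gate/up
    cur = ggml_mul_mat(ctx0, ffn_down_w, cur);
    // GLM4 activations can exceed the F16 range on some backends, so
    // request F32 compute for this op alone rather than for every GEMM
    ggml_mul_mat_set_prec(cur, GGML_PREC_F32);
    return cur;
}
```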
clip : fix pixtral on some GPU backends (ggml-org#13097)
* clip : fix pixtral on some GPU backends
* refactor inp_raw set
* rm outdated comment
* fix dynamic size
* add TODO
change the reorder tensor from init to execute OP (ggml-org#13003)
rpc : do not wait for response when sending RPC_CMD_SET_TENSOR (ggml-org#12943)
RPC_CMD_SET_TENSOR always returns an empty response, and we send it 4 times per token. We can improve TG speed by not waiting for this empty response; the performance impact of this change depends on the network latency.
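The change turns a synchronous request/reply into fire-and-forget for this one command. A sketch of the pattern under assumed names: `send_msg` is a hypothetical stand-in for the RPC backend's socket helper, and the command's numeric value is illustrative:

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

enum rpc_cmd : uint8_t { RPC_CMD_SET_TENSOR = 6 }; // value illustrative

// hypothetical stand-in for the RPC backend's socket write helper;
// a real implementation writes the command header and payload to sockfd
static bool send_msg(int /*sockfd*/, rpc_cmd /*cmd*/, const void * /*data*/, size_t /*size*/) {
    return true;
}

static bool rpc_set_tensor(int sockfd, const std::vector<uint8_t> & payload) {
    // send the command and payload as before ...
    if (!send_msg(sockfd, RPC_CMD_SET_TENSOR, payload.data(), payload.size())) {
        return false;
    }
    // ... but skip the blocking read: the response is always empty, and this
    // command is sent 4 times per token, so not waiting for the reply saves
    // one network round-trip per call
    return true;
}
```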
clip : remove boi/eoi embeddings for GLM-edge model (ggml-org#13081)
CUDA: use switch statements in constexpr functions (ggml-org#13095)
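Since C++14, `constexpr` functions may contain `switch` statements, which reads better than chained ternaries or if/else when mapping types to compile-time kernel parameters. A small illustration of the pattern; the enum and tile sizes are made up, not the actual CUDA kernel constants:

```cpp
// Illustration only: the enum and tile sizes are made up, not the actual
// CUDA kernel constants.
enum class mmq_type { F16, Q4_0, Q8_0 };

constexpr int mmq_tile_size(mmq_type t) {
    switch (t) { // switch in a constexpr function is legal since C++14
        case mmq_type::F16:  return 64;
        case mmq_type::Q4_0: return 32;
        case mmq_type::Q8_0: return 32;
    }
    return 0; // unreachable; silences missing-return warnings
}

// evaluated at compile time, so each template instantiation gets its tile
// size baked in with no runtime branch
static_assert(mmq_tile_size(mmq_type::F16) == 64, "compile-time dispatch");
```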
cmake : do not include ./src as public for libllama (ggml-org#13062)
* cmake : do not include ./src as public for libllama
* cmake : rework tests
* llguidance : remove unicode include
* cmake : make c++17 private
arg : add --no-mmproj-offload (ggml-org#13093)
* arg : add --no-mmproj-offload
* Update common/arg.cpp
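The flag keeps the multimodal projector (mmproj) on the CPU even when the model layers are offloaded. A minimal sketch of how such a boolean flag could be parsed and applied; the struct and field names are assumptions, not llama.cpp's actual `common_params` layout:

```cpp
#include <cstring>

// illustrative stand-in for llama.cpp's common_params; the field name
// `mmproj_use_gpu` is an assumption
struct mtmd_params {
    bool mmproj_use_gpu = true; // default: offload the projector too
};

static void parse_args(int argc, char ** argv, mtmd_params & params) {
    for (int i = 1; i < argc; i++) {
        if (std::strcmp(argv[i], "--no-mmproj-offload") == 0) {
            // keep the multimodal projector on the CPU even when the
            // language-model layers are offloaded to the GPU
            params.mmproj_use_gpu = false;
        }
    }
}
```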