[pull] master from ggerganov:master #135

Closed · wants to merge 13 commits

Commits on Jul 27, 2024

  1. cann: Fix Multi-NPU execution error (#8710)

    * cann: fix multi-npu exec error
    
    * cann: update comment for ggml_backend_cann_supports_buft
    wangshuai09 authored Jul 27, 2024
    Commit bfb4c74
  2. common : add --no-warmup option for main/llama-cli (#8712)

    This commit adds a --no-warmup option for llama-cli.
    
    The motivation for this is that it can be convenient to skip the
    warmup llama_decode call when debugging.
    
    Signed-off-by: Daniel Bevenius <daniel.bevenius@gmail.com>
    danbev authored Jul 27, 2024
    Commit 9d03d08
  3. llama : add function for model-based max number of graph nodes (#8622)

    * llama : model-based max number of graph nodes
    
    ggml-ci
    
    * llama : disable 405B max_nodes path due to lack of complaints
    
    ggml-ci
    ggerganov authored Jul 27, 2024
    Commit 92090ec
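As a hedged illustration of a "model-based max number of graph nodes" heuristic: the constants below (an 8192-node floor and a per-tensor multiplier of 5) are assumptions for the sketch, not necessarily the values merged in #8622; the idea is that very large models (e.g. a 405B parameter split across many tensors) need a node budget that grows with the tensor count rather than a fixed cap.

```python
def max_graph_nodes(n_tensors: int, base: int = 8192, nodes_per_tensor: int = 5) -> int:
    """Illustrative cap on compute-graph nodes, scaled by model size.

    Small models keep the fixed `base` budget; models with many tensors
    get proportionally more nodes so graph construction never overflows.
    The constants are hypothetical, not llama.cpp's actual values.
    """
    return max(base, nodes_per_tensor * n_tensors)
```

A small model with 100 tensors stays at the floor, while a model with 10,000 tensors gets a 50,000-node budget.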
  4. llama : add support for llama 3.1 rope scaling factors (#8676)

    * Add llama 3.1 rope scaling factors to llama conversion and inference
    
    This commit generates the rope factors on conversion and adds them to the resulting model as a tensor. At inference time, these factors are passed to the `ggml_rope_ext` rope operation, improving results for context windows above 8192.
    
    * Update convert_hf_to_gguf.py
    
    Co-authored-by: compilade <git@compilade.net>
    
    * address comments
    
    * address comments
    
    * Update src/llama.cpp
    
    Co-authored-by: compilade <git@compilade.net>
    
    * Update convert_hf_to_gguf.py
    
    Co-authored-by: compilade <git@compilade.net>
    
    ---------
    
    Co-authored-by: compilade <git@compilade.net>
    jmorganca and compilade authored Jul 27, 2024
    Commit b5e9546
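A sketch of how such per-frequency rope scaling factors can be generated at conversion time, following the published Llama 3.1 scheme: each RoPE frequency is classified by wavelength, with high-frequency (short-wavelength) pairs left untouched, low-frequency pairs scaled by the full factor, and a smooth ramp in between. The parameter defaults below (factor 8, low/high frequency factors 1 and 4, original context 8192, base 500000) are the commonly published Llama 3.1 values, but treat the exact function shape as an assumption rather than the merged implementation.

```python
import math

def llama31_rope_factors(head_dim: int = 128, rope_base: float = 500000.0,
                         factor: float = 8.0, low_freq_factor: float = 1.0,
                         high_freq_factor: float = 4.0, orig_ctx: int = 8192):
    """One scaling factor per RoPE frequency pair (head_dim // 2 values).

    At inference each frequency is divided by its factor, so long
    wavelengths are stretched by `factor` to cover longer contexts.
    """
    low_freq_wavelen = orig_ctx / low_freq_factor
    high_freq_wavelen = orig_ctx / high_freq_factor
    factors = []
    for i in range(0, head_dim, 2):
        freq = rope_base ** (-i / head_dim)
        wavelen = 2 * math.pi / freq
        if wavelen < high_freq_wavelen:
            factors.append(1.0)            # high-frequency band: unchanged
        elif wavelen > low_freq_wavelen:
            factors.append(factor)         # low-frequency band: fully scaled
        else:
            # smooth interpolation between the two bands
            smooth = (orig_ctx / wavelen - low_freq_factor) / (
                high_freq_factor - low_freq_factor)
            factors.append(1.0 / ((1.0 - smooth) / factor + smooth))
    return factors
```

The resulting list would be stored as a tensor at conversion and consumed by the rope operation at inference.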
  5. ggml : remove unnecessary UNUSED macro call (ggml/880)

    This commit removes an UNUSED macro call that is not needed as the
    variable n0 is used in the code and will not produce a warning.
    
    Signed-off-by: Daniel Bevenius <daniel.bevenius@gmail.com>
    danbev authored and ggerganov committed Jul 27, 2024
    Commit c12b6e8
  6. Commit d2b851b
  7. vulkan : initialize vk_buffer_struct members to VK_NULL_HANDLE (ggml/893)
    
    This prevents invalid frees when destroying a partially initialized
    vk_buffer_struct. For example, this could happen in ggml_vk_create_buffer
    when running out of device memory.
    
    Co-authored-by: Tony Wasserka <neobrain@users.noreply.github.com>
    2 people authored and ggerganov committed Jul 27, 2024
    Commit 203b7f1
  8. ggml: add support for float16 input tensors in pooling operations (ggml/895)
    
    * Add support for float16 tensors in 1d pooling operations
    
    * Add support for float16 input tensors in 2d pooling operations
    
    * code cleanup
    
    remove unnecessary casting during srow ptr initialization
    
    ---------
    
    Co-authored-by: vanaka11 <vanaka1189@gmail.com>
    2 people authored and ggerganov committed Jul 27, 2024
    Commit 9f77d89
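The pooling change above (including the srow pointer cleanup) boils down to one numeric pattern: read float16 elements, widen each to float32 for accumulation so small values don't lose precision, and narrow back to float16 only once at the end. The sketch below shows that pattern for 1-D average pooling in NumPy; it is an illustration of the technique, not the ggml kernel itself.

```python
import numpy as np

def avg_pool1d_f16(src: np.ndarray, k: int) -> np.ndarray:
    """1-D average pooling over a float16 row, accumulating in float32.

    Widening each element before summing avoids the precision loss of
    chained float16 additions; the result is narrowed once per output.
    """
    assert src.dtype == np.float16
    n_out = src.shape[0] // k
    dst = np.empty(n_out, dtype=np.float16)
    for i in range(n_out):
        acc = np.float32(0.0)
        for j in range(k):
            acc += np.float32(src[i * k + j])  # widen each f16 input element
        dst[i] = acc / np.float32(k)           # narrow back to f16 at the end
    return dst
```

The same widen-accumulate-narrow structure applies to the 2-D case, with one accumulator per output cell.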
  9. ggml : loop tiling optimizations for scalar path (ggml/898)

    Apply a loop tiling technique to the generic path, which provides
    performance upside for ISAs with enough registers to take advantage
    of it. Also helps the compiler optimize this path.
    heshpdx authored and ggerganov committed Jul 27, 2024
    Commit a05ca93
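Loop tiling in a scalar kernel typically means splitting the inner loop into fixed-size blocks with independent accumulators, so an ISA with enough registers can keep the partial sums live and the compiler can unroll or vectorize the block. A minimal sketch of the structure, using a dot product (Python only illustrates the shape of the transformation; the payoff exists in compiled code):

```python
def dot_tiled(a, b, tile: int = 4) -> float:
    """Dot product with the inner loop tiled into blocks of `tile`.

    Independent per-lane accumulators break the serial dependency chain
    of a single running sum, which is what lets registers/SIMD lanes
    work in parallel in a compiled scalar path.
    """
    n = len(a)
    accs = [0.0] * tile            # one accumulator per lane in the tile
    i = 0
    while i + tile <= n:
        for t in range(tile):
            accs[t] += a[i + t] * b[i + t]
        i += tile
    total = sum(accs)              # reduce the lanes
    for j in range(i, n):          # scalar tail when n % tile != 0
        total += a[j] * b[j]
    return total
```

Choosing the tile width to match the available registers (or SIMD width) is the tuning knob the commit message alludes to.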
  10. sync : ggml

    ggml-ci
    ggerganov committed Jul 27, 2024
    Commit ae7985c
  11. ggml : add missing semicolon (#0)

    ggml-ci
    ggerganov committed Jul 27, 2024
    Commit 345c8c0
  12. Commit 56f20aa
  13. Commit 5e2727f