Tags: lshzh-ww/llama.cpp
llama : add classifier-free guidance (ggerganov#2135)
* Initial implementation
* Remove debug print
* Restore signature of llama_init_from_gpt_params
* Free guidance context
* Make freeing of guidance_ctx conditional
* Make Classifier-Free Guidance a sampling function
* Correct typo. CFG already means context-free grammar.
* Record sampling time in llama_sample_classifier_free_guidance
* Shift all values by the max value before applying logsoftmax
* Fix styling based on review
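The mixing step this commit adds can be pictured with a minimal C++ sketch of classifier-free guidance over logits. The function names here are illustrative, not the llama.cpp API; the sketch assumes two logit vectors of equal length:

```cpp
#include <algorithm>
#include <cmath>
#include <vector>

// In-place log-softmax, shifting by the max first for numerical
// stability (the "shift all values by the max" step from the commit).
static void log_softmax(std::vector<float> & logits) {
    const float max_l = *std::max_element(logits.begin(), logits.end());
    float sum = 0.0f;
    for (float l : logits) {
        sum += expf(l - max_l);
    }
    const float log_sum = logf(sum);
    for (float & l : logits) {
        l = l - max_l - log_sum;
    }
}

// Mix conditional logits with guidance (negative-prompt) logits.
// scale == 1 leaves the conditional distribution unchanged; scale > 1
// pushes it further away from the guidance distribution.
void apply_cfg(std::vector<float> & logits, std::vector<float> guidance, float scale) {
    log_softmax(logits);
    log_softmax(guidance);
    for (size_t i = 0; i < logits.size(); ++i) {
        logits[i] = guidance[i] + scale * (logits[i] - guidance[i]);
    }
}
```

Shifting by the max before exponentiating is the standard trick to avoid overflow in expf, which is why it appears as its own bullet in the commit history.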
Possible solution to allow K-quants on models with n_vocab != 32000 (ggerganov#2148)
* This allows LLaMA models that were previously incompatible with K-quants to function mostly as normal. This happens when a model has a vocab size != 32000, e.g. 32001, which is not divisible by 256 or 64. Since the problematic dimensions only apply to `tok_embeddings.weight` and `output.weight` (dimensions 4096 x n_vocab), we can simply quantize these layers to Q8_0, while the majority of the hidden layers are still K-quanted since they have compatible dimensions.
* Fix indentation
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
* As an alternative, to avoid failing on Metal due to its lack of Q8_0 support, instead quantize tok_embeddings.weight to Q4_0 and retain output.weight as F16. This results in a net gain of about 55 MB for a 7B model compared to the previous approach, but should minimize the adverse impact on model quality.
---------
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
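For illustration, here is a hypothetical sketch of the per-tensor fallback rule described above. The enum and function are invented for this example; the actual selection logic in llama.cpp's quantizer is more involved:

```cpp
#include <cstdint>
#include <string>

// Illustrative stand-in for a few ggml quantization types.
enum class QuantType { Q4_0, Q8_0, F16, Q4_K };

// K-quants pack values in super-blocks of 256, so the affected
// dimension must be divisible by 256 (the commit also mentions 64 for
// some kernels). With n_vocab == 32001 the two vocab-sized tensors
// fail that test and fall back to compatible non-K types.
QuantType pick_quant(const std::string & name, int64_t n_vocab, QuantType wanted) {
    if (n_vocab % 256 == 0) {
        return wanted;                      // K-quants apply everywhere
    }
    if (name == "tok_embeddings.weight") {
        return QuantType::Q4_0;             // supported on Metal, small size cost
    }
    if (name == "output.weight") {
        return QuantType::F16;              // preserve quality on the output head
    }
    return wanted;                          // hidden layers have compatible dims
}
```

The design trade-off is visible in the two branches: the embedding table tolerates a cheap 4-bit format, while the output projection is kept at full half precision because quantization error there directly distorts the final logits.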
Support using mmap when applying LoRA (ggerganov#2095)
* Support using mmap when applying LoRA
* Fix Linux
* Update comment to reflect LoRA support with mmap
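As a rough picture of what mmap-based loading means here, a POSIX-only sketch follows (the real change goes through llama.cpp's cross-platform mmap wrapper, and `map_lora_file` is a hypothetical helper):

```cpp
#include <cstddef>
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

// Map a LoRA adapter file read-only instead of streaming it with
// fread; returns nullptr on failure.
void * map_lora_file(const char * path, size_t * size_out) {
    const int fd = open(path, O_RDONLY);
    if (fd < 0) {
        return nullptr;
    }
    struct stat st;
    if (fstat(fd, &st) != 0) {
        close(fd);
        return nullptr;
    }
    void * addr = mmap(nullptr, (size_t) st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd); // the mapping remains valid after the descriptor is closed
    if (addr == MAP_FAILED) {
        return nullptr;
    }
    *size_out = (size_t) st.st_size;
    return addr;
}
```

Mapping the file lets the adapter weights be paged in on demand by the OS rather than copied into heap buffers up front.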
ggml : remove src0 and src1 from ggml_tensor and rename opt to src (ggerganov#2178)
* Add ggml changes
* Update train-text-from-scratch for the change
* mpi : adapt to new ggml_tensor->src
---------
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
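The layout change can be summarized with an illustrative struct; the field and macro names mirror ggml, but this sketch omits everything else and the exact array size is an assumption:

```cpp
// Before (simplified): two dedicated pointers plus a separate array.
//
//     struct ggml_tensor {
//         struct ggml_tensor * src0;
//         struct ggml_tensor * src1;
//         struct ggml_tensor * opt[GGML_MAX_OPT];
//         // ...
//     };
//
// After: one uniform array, so graph-traversal code can simply loop
// over all sources instead of special-casing src0 and src1.
#define GGML_MAX_SRC 6

struct ggml_tensor {
    struct ggml_tensor * src[GGML_MAX_SRC]; // src[0] was src0, src[1] was src1
    // ... all other fields unchanged ...
};

// Call sites change from t->src0 / t->src1 to t->src[0] / t->src[1].
```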
mpi : add support for distributed inference via MPI (ggerganov#2099)
* MPI support, first cut
* fix warnings, update README
* fixes
* wrap includes
* PR comments
* Update CMakeLists.txt
* Add GH workflow, fix test
* Add info to README
* mpi : trying to move more MPI stuff into ggml-mpi (WIP) (ggerganov#2099)
* mpi : add names for layer inputs + prep ggml_mpi_graph_compute()
* mpi : move all MPI logic into ggml-mpi (not tested yet)
* mpi : various fixes - communication now works but results are wrong
* mpi : fix output tensor after MPI compute (still not working)
* mpi : fix inference
* mpi : minor
* Add OpenMPI to GH action
* [mpi] continue-on-error: true
* mpi : fix after master merge
* [mpi] Link MPI C++ libraries to fix OpenMPI
* tests : fix new llama_backend API
* [mpi] use MPI_INT32_T
* mpi : factor out recv / send in functions and reuse
* mpi : extend API to allow usage with outer backends (e.g. Metal)
---------
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
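A toy sketch of the pipeline-parallel idea behind this work, assuming each MPI rank owns a contiguous slice of layers; `pipeline_step` is a hypothetical function, not the ggml-mpi API:

```cpp
#include <mpi.h>
#include <vector>

// Each rank evaluates its own slice of transformer layers, receiving
// activations from the previous rank and forwarding them to the next.
void pipeline_step(std::vector<float> & act, int rank, int size) {
    if (rank > 0) {
        MPI_Recv(act.data(), (int) act.size(), MPI_FLOAT, rank - 1, 0,
                 MPI_COMM_WORLD, MPI_STATUS_IGNORE);
    }
    // ... evaluate this rank's layers in place on act ...
    if (rank < size - 1) {
        MPI_Send(act.data(), (int) act.size(), MPI_FLOAT, rank + 1, 0,
                 MPI_COMM_WORLD);
    }
}
```

The "extend API to allow usage with outer backends" bullet matters because each rank's local slice can then still be dispatched to an accelerator such as Metal, with MPI handling only the hop between ranks.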
ggml : fix issue where building with Intel MKL still asked for "cblas.h" (ggerganov#2104) (ggerganov#2115)
* Fix building with Intel MKL that asks for "cblas.h"
* Use angle brackets to indicate the system library
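The second bullet refers to the difference between quoted and angle-bracket includes. An illustrative selection block follows; ggml uses a guard along these lines for its MKL build, though the exact define name may differ by version:

```cpp
// Angle brackets search the compiler's system include paths, while a
// quoted include searches the source directory first, which can pick
// up the wrong header (or none) when MKL provides the BLAS interface.
#if defined(GGML_BLAS_USE_MKL)
#include <mkl.h>      // MKL ships its CBLAS declarations here
#else
#include <cblas.h>    // generic CBLAS (OpenBLAS, etc.)
#endif
```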
llama : remove "first token must be BOS" restriction (ggerganov#2153)
Fixed OpenLLaMA 3b CUDA mul_mat_vec_q (ggerganov#2144)