Add --n-predict -2 for stopping generation on full context #2565

Merged

crasm (Contributor) commented Aug 9, 2023

This is needed to mitigate #1730 until a sliding context window or something similar can be implemented. It massively increases the throughput of batch jobs that run many generations in a shell script loop, where each generation typically fills the context.
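The stopping rule implied by the new value can be sketched as follows. This is a minimal illustration, not the code from the PR: the function name `should_stop` and its parameters (`n_generated`, `n_past`, `n_ctx`) are hypothetical, and it assumes the convention that `-1` means "no limit" while `-2` means "stop once the context is full".

```python
def should_stop(n_predict: int, n_generated: int, n_past: int, n_ctx: int) -> bool:
    """Decide whether to stop before sampling the next token (hypothetical helper).

    n_predict   -- value passed to --n-predict
    n_generated -- tokens generated so far in this run
    n_past      -- tokens currently held in the context (prompt + generated)
    n_ctx       -- size of the context window
    """
    if n_predict == -2:
        # New behaviour: end the run as soon as the context window is full.
        return n_past >= n_ctx
    if n_predict == -1:
        # Assumed convention: no token limit. What happens when the context
        # fills up in this mode is outside the scope of this sketch.
        return False
    # Non-negative values: stop after exactly that many generated tokens.
    return n_generated >= n_predict
```

In a batch loop, a rule like this lets each invocation exit when the window fills instead of continuing to generate, which is the situation the comment above describes.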

ejones (Collaborator) commented Aug 9, 2023

This is huge for examples/chat-persistent.sh too, thanks!

@JohannesGaessler JohannesGaessler merged commit e59fcb2 into ggerganov:master Aug 10, 2023
25 checks passed
YellowRoseCx pushed a commit to YellowRoseCx/koboldcpp-rocm that referenced this pull request Aug 12, 2023
YellowRoseCx added a commit to YellowRoseCx/koboldcpp-rocm that referenced this pull request Aug 25, 2023
commit 3416c986d9d9a31c3cdefd7e7bd4d9438d72ba35
Merge: 5eb17f0 4c4e435
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Fri Aug 25 13:46:56 2023 -0500

    Merge remote-tracking branch 'upstream/concedo'

commit 5eb17f02c8638e003bb91bddf95ccf54d2ad0c12
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Fri Aug 25 13:38:21 2023 -0500

    ROCm Port update

    * use hipblas based on cublas
    * Update Makefile for the Cuda kernels
    * Expand arch list and make it overrideable
    * Fix multi GPU on multiple amd architectures with rocblas_initialize() (#5)
    * add hipBLAS to README
    * new build arg LLAMA_CUDA_MMQ_Y
    * fix half2 decomposition
    * Add intrinsics polyfills for AMD
    * AMD assembly optimized __dp4a
    * Allow overriding CC_TURING
    * use "ROCm" instead of "CUDA"
    * ignore all build dirs
    * Add Dockerfiles
    * fix llama-bench
    * fix -nommq help for non CUDA/HIP

    ---------

    Co-Authored-By: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
    Co-Authored-By: ardfork <134447697+ardfork@users.noreply.github.com>
    Co-Authored-By: funnbot <22226942+funnbot@users.noreply.github.com>
    Co-Authored-By: Engininja2 <139037756+Engininja2@users.noreply.github.com>
    Co-Authored-By: Kerfuffle <44031344+KerfuffleV2@users.noreply.github.com>
    Co-Authored-By: jammm <2500920+jammm@users.noreply.github.com>
    Co-Authored-By: jdecourval <7315817+jdecourval@users.noreply.github.com>

commit 4c4e4358ed54c397d3f0f5bc268f1ac59d909f57
Author: Concedo <39025047+LostRuins@users.noreply.github.com>
Date:   Thu Aug 24 22:12:56 2023 +0800

    fixed linux build error

commit 661bede62fe216632d099678a9dac08de7a68a4e
Author: Concedo <39025047+LostRuins@users.noreply.github.com>
Date:   Thu Aug 24 21:16:16 2023 +0800

    optimize tokenize method

commit b95a4ccb228ebfac12e5ce4b445f073ca67b99d2
Author: Concedo <39025047+LostRuins@users.noreply.github.com>
Date:   Thu Aug 24 20:41:49 2023 +0800

    added a token counting endpoint, set mmq as default

commit 81a0ef342ce1e583f6a5b060252565dbd59e1d8d
Author: Concedo <39025047+LostRuins@users.noreply.github.com>
Date:   Thu Aug 24 16:26:38 2023 +0800

    updated lite, switched to unminified source

commit 598d4d89ab3aaa539ddf36784306071f1411814a
Author: Concedo <39025047+LostRuins@users.noreply.github.com>
Date:   Thu Aug 24 15:45:33 2023 +0800

    fix for config file loading. from kcpp settings file

commit a3b994962673e681aafd9503781c7470acdcc63f
Merge: b8372d4 2d86b2e
Author: Concedo <39025047+LostRuins@users.noreply.github.com>
Date:   Thu Aug 24 15:22:17 2023 +0800

    Merge remote-tracking branch 'pop/add_config_arg' into concedo_experimental

commit b8372d44666531f5d17cbe264912fbe5548fd54b
Merge: 8263fd7 6e91a1b
Author: Concedo <39025047+LostRuins@users.noreply.github.com>
Date:   Thu Aug 24 15:21:24 2023 +0800

    Merge branch 'master' into concedo_experimental

    # Conflicts:
    #	.gitignore
    #	README.md
    #	tests/CMakeLists.txt

commit 6e91a1b0706c2e0e52b9d9be7ee82d3c1e7a33c1
Author: Evan Jones <evan.q.jones@gmail.com>
Date:   Thu Aug 24 00:07:13 2023 -0400

    llama : fix grammar sometimes generating null char (#2756)

commit 44d5462b5cddc1c5cbcd7647646f7b55b175b01f
Author: Georgi Gerganov <ggerganov@gmail.com>
Date:   Wed Aug 23 23:44:19 2023 +0300

    readme : fix link

commit c7868b075377c8c3fa916ea7c1aca600f44bed55
Author: Georgi Gerganov <ggerganov@gmail.com>
Date:   Wed Aug 23 23:43:00 2023 +0300

    minor : fix trailing whitespace

commit 79da24b58c1ea72340e64f799a4717d372207676
Author: Georgi Gerganov <ggerganov@gmail.com>
Date:   Wed Aug 23 23:41:16 2023 +0300

    readme : update hot topics

commit cf658adc832badaaa2ca119fe86070e5a830f8f6
Author: Georgi Gerganov <ggerganov@gmail.com>
Date:   Wed Aug 23 23:08:04 2023 +0300

    llm : add Falcon support (#2717)

    * llama : refactor GGUF constants into static maps

    * llama : check if model architecture is known

    * llama : refactor llama_model_load_internal()

    * gguf : add KV constant maps

    * llm : read arch-specific KVs

    * convert : add dummy scores + types

    * falcon : load tensor data (CPU only)

    * llama : fix loading progress bar

    * llama : add arch member to llama_model

    * falcon : CPU inference working

    * falcon : support non-40B models

    * falcon : minor

    * llama : minor updates

    ggml-ci

    * convert-falcon-hf-to-gguf.py : fix special token mapping

    * llama.cpp : llama default UNK token = id 0

    * llama.cpp : fix bpe tokenizer

    * llama.cpp : fix the fix of bpe tokenizer

    * ggml : pass eps to ggml_norm

    * metal : implement RoPE (mode = 2) + avoid ggml_repeat

    * ggml : ggml_repeat always creates new tensor

    * falcon : copy-paste self-attention from LLaMA

    * metal : print extra compute pipeline info

    * falcon : minor changes (still chasing the Metal problem)

    * llama.cpp : fix linefeed token

    * metal : fix GELU kernel numerical stability by using precise::tanh

    * metal : temporary workaround for the concurrency optimization bug

    * falcon : add CUDA offloading (#2739)

    * llama : better model naming and size reporting

    * llama : prep new tokenizer support

    * llama : advanced BPE tokenizer based on ggllm.cpp implementation

    * llama : remove obsolete comment

    ggml-ci

    * common : remove obsolete BPE API + disable test-tokenizer-1

    * llama : revert BPE special-case in llama_byte_to_token()

    * cuda : add TODOs for RoPE NeoX implementation

    * llama : default special tokens based on vocab type

    * perplexity : add log for start of tokenization

    ---------

    Co-authored-by: klosax <131523366+klosax@users.noreply.github.com>
    Co-authored-by: slaren <slarengh@gmail.com>

commit a192860cfec89a38d59a943623bf595b1fe4495b
Author: Georgi Gerganov <ggerganov@gmail.com>
Date:   Wed Aug 23 22:37:39 2023 +0300

    minor : fix trailing whitespace

commit 95385241a91a616788a3bb76d12c9b7b2379ca2d
Author: Olivier Chafik <ochafik@users.noreply.github.com>
Date:   Wed Aug 23 20:33:05 2023 +0100

    examples : restore the functionality to import llama2.c models (#2685)

    * Fix import of llama2.c models that don't share weights between embedding layers

    * llama2c: reinstate ggmlv3 conversion output + update readme w/ gguf conv

    * llama2.c: comment out legacy "load from ggml model" logic

    * llama2.c: convert special-cased "<0xXX>" single byte tokens from tokenizer.bin

commit 335acd2ffd7b04501c6d8773ab9fcee6e7bf8639
Author: slaren <slarengh@gmail.com>
Date:   Wed Aug 23 16:46:54 2023 +0200

    fix convert-lora-to-ggml.py (#2738)

commit 5290c38e6e9b66ee2b543e560e301c1a1a90929c
Author: klosax <131523366+klosax@users.noreply.github.com>
Date:   Wed Aug 23 16:46:03 2023 +0200

    main : insert bos if no tokens (#2727)

    * main.cpp : insert bos if no tokens

    * Update examples/main/main.cpp

    * Update examples/main/main.cpp

    ---------

    Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>

commit cc34dbda9681418a2b18382446b90cdcec398d82
Author: akawrykow <142945436+akawrykow@users.noreply.github.com>
Date:   Wed Aug 23 07:31:34 2023 -0700

    gitignore : fix for windows (#2729)

commit 7c2227a1972a4add4b5c118e4914c086513d0382
Author: Cebtenzzre <cebtenzzre@gmail.com>
Date:   Wed Aug 23 10:29:09 2023 -0400

    chmod : make scripts executable (#2675)

commit f19dca04ea5fbf9a0b2753091d93464585d5c73b
Author: JohnnyB <jboero@users.noreply.github.com>
Date:   Wed Aug 23 15:28:22 2023 +0100

    devops : RPM Specs (#2723)

    * Create llama-cpp.srpm

    * Rename llama-cpp.srpm to llama-cpp.srpm.spec

    Correcting extension.

    * Tested spec success.

    * Update llama-cpp.srpm.spec

    * Create lamma-cpp-cublas.srpm.spec

    * Create lamma-cpp-clblast.srpm.spec

    * Update lamma-cpp-cublas.srpm.spec

    Added BuildRequires

    * Moved to devops dir

commit 8263fd7bdb247f2c3ff21debb50b22bd9b030339
Author: askmyteapot <62238146+askmyteapot@users.noreply.github.com>
Date:   Thu Aug 24 00:15:48 2023 +1000

    Update llama_v3.cpp (#393)

    Fixing C2065 compiler error.
    Missed '3' on 3 separate identifiers (kB > kB3, MB > MB3)

commit bfdc596d58fbd9bbadd2352705af4373005e1411
Author: Concedo <39025047+LostRuins@users.noreply.github.com>
Date:   Wed Aug 23 19:19:52 2023 +0800

    gguf reader in file format detection

commit 8207214b6a37a46526cee9e72d4c9092b9d1872f
Author: Kawrakow <48489457+ikawrakow@users.noreply.github.com>
Date:   Wed Aug 23 12:57:12 2023 +0300

    Fix values shown in the quantize tool help (#2735)

    Co-authored-by: Iwan Kawrakow <iwan.kawrakow@gmail.com>

commit 62959e740e8759d246ac8d09036950efde09981c
Author: Kawrakow <48489457+ikawrakow@users.noreply.github.com>
Date:   Wed Aug 23 12:56:42 2023 +0300

    Strided perplexity (#2714)

    * Implementing strided computation of perplexity

    * Alternative way to output PPL results

    ---------

    Co-authored-by: Iwan Kawrakow <iwan.kawrakow@gmail.com>

commit 7f7ddd5002040804e33fcdbde44aa22f8635f57d
Author: IgnacioFDM <ignaciofdm@gmail.com>
Date:   Wed Aug 23 06:31:09 2023 -0300

    Fix ggml to gguf conversion on Windows (#2733)

    This fixes `RuntimeWarning: overflow encountered in long_scalars`

    Credit: anon (not mine)

commit af170fc2db1186d3002b602d909c52c22de4a076
Merge: 981c913 b8ad1b6
Author: Concedo <39025047+LostRuins@users.noreply.github.com>
Date:   Wed Aug 23 17:08:09 2023 +0800

    Merge branch 'master' into concedo_experimental

    # Conflicts:
    #	README.md
    #	llama.cpp
    #	scripts/sync-ggml.sh
    #	tests/test-tokenizer-0.cpp

commit 981c9131f0f20c10099735c1e353534b5bfe1e59
Author: Concedo <39025047+LostRuins@users.noreply.github.com>
Date:   Wed Aug 23 16:07:07 2023 +0800

    gguf for llama is working

commit b8ad1b66b23f9b2e6e4531e9a62753323036a556
Author: Xiao-Yong Jin <jinxiaoyong@gmail.com>
Date:   Wed Aug 23 02:12:12 2023 -0500

    server : allow json array in prompt or content for direct token input (#2306)

    * server: allow json array in prompt or content

    We accept an array of strings and numbers representing tokens,
    in addition to the current string valued prompt or content.

    This allows direct token input, so that any special tokens
    can be processed and used at the frontend during the construction
    of the json data, before sending to the server. And the server
    does not need to know or parse special tokens from textual input.

    With this, we can use EOS and BOS used in llama-2-chat models.

    * server: use tokenizePrompt(json) and default "" if empty prompt

    * server: fix prompt check

    * server: tokenize endpoint no longer adds BOS

commit f5fe98d11bdf9e7797bcfb05c0c3601ffc4b9d26
Author: Evan Jones <evan.q.jones@gmail.com>
Date:   Tue Aug 22 21:01:57 2023 -0400

    docs : add grammar docs (#2701)

    * docs : add grammar docs

    * tweaks to grammar guide

    * rework GBNF example to be a commented grammar

commit 777f42ba18b29f25c71ff8de3ecf97b8017304c0
Author: Kerfuffle <44031344+KerfuffleV2@users.noreply.github.com>
Date:   Tue Aug 22 17:39:39 2023 -0600

    Improve handling of special tokens in GGML to GGUF converter (#2725)

    * Improve UNK, BOS, EOS token handling when converting without metadata.

    * Allow importing as a module.

    * Remove some obsolete code and minor cleanups.

    * Set default UNK token mapping from -1 to 0 in llama.cpp

    * Try to handle overflow due to buggy Windows Python with a better error message

commit 46ef5b5fcf4c366e1fb27726b6394adbbf8fd0ea
Author: goerch <jhr.walter@t-online.de>
Date:   Tue Aug 22 23:10:42 2023 +0200

    llama : fix whitespace escaping in tokenizer (#2724)

commit c63bb1d16a70c03440671b76954bb767513cead8
Author: Johannes Gäßler <johannesg@5d6.de>
Date:   Tue Aug 22 22:47:05 2023 +0200

    CUDA: use mul_mat_q kernels by default (#2683)

commit 3b6cfe7c927df178ca3c11643c3ec93e143471c9
Author: Alex Petenchea <alex.petenchea@gmail.com>
Date:   Tue Aug 22 21:58:16 2023 +0300

    convert.py : clarifying error message (#2718)

commit 800c9635b4a9390126f397870f3a825fc7455bd1
Author: Jiahao Li <liplus17@163.com>
Date:   Wed Aug 23 02:27:06 2023 +0800

    Fix CUDA softmax by subtracting max value before exp (#2665)

commit deb7dfca4b9725cd295d1426db75fe8e0a6d5312
Author: Georgi Gerganov <ggerganov@gmail.com>
Date:   Tue Aug 22 20:05:59 2023 +0300

    gguf : add ftype meta info to the model (#2710)

    * llama : add ftype meta info to the model

    ggml-ci

    * convert.py : add ftype when converting (does not work)

    * convert.py : fix Enum to IntEnum

    ggml-ci

commit bac66994cf356cf488078c056831396eb4ce31d5
Author: Kawrakow <48489457+ikawrakow@users.noreply.github.com>
Date:   Tue Aug 22 19:14:09 2023 +0300

    Quantization improvements for k_quants (#2707)

    * Improve LLaMA-2 2-, 3- and 4-bit quantization

    * Q3_K_S: use Q5_K for 1st 2 layers of attention.wv and feed_forward.w2
    * Q4_K_S: use Q6_K for 1st 2 layers of attention.wv and feed_forward.w2
    * Q2_K and Q3_K_M: use Q5_K instead of Q4_K for 1st 2 layers of
      attention.wv and feed_forward.w2

    This leads to a slight increase in model size, as follows:
    Q2_K  : 2.684G vs 2.670G
    Q3_K_S: 2.775G vs 2.745G
    Q3_K_M: 3.071G vs 3.057G
    Q4_K_S: 3.592G vs 3.563G

    LLaMA-2 PPL for context 512 changes as follows:
    Q2_K  : 6.6691 vs 6.8201
    Q3_K_S: 6.2129 vs 6.2584
    Q3_K_M: 6.0387 vs 6.1371
    Q4_K_S: 5.9138 vs 6.0041

    There are improvements for LLaMA-1 as well, but they are
    way smaller than the above.

    * Minor 4-bit quantization improvement

    For the same model size as the previous commit, we get
    PPL = 5.9069 vs 5.9138.

    * Some more fine tuning

    * Adding make_qkx2_quants

    With it, we get PPL = 5.8828 for L2-7B Q4_K_S.

    * Another minor improvement

    * Q2_K improvement

    Smaller model, lower perplexity.
     7B: file size = 2.632G, PPL = 6.3772 vs original 2.670G PPL = 6.8201
    12B: file size = 5.056G, PPL = 5.4577 vs original 5.130G PPL = 5.7178

    It is mostly Q3_K except for tok_embeddings, attention.wq, attention.wk,
    which are Q2_K

    * Iterating

    * Revert Q5_K back to make_qkx1_quants

    * Better Q6_K

    * make_qkx2_quants is better for Q5_K after all

    * Fix after rebasing on master

    * Fix for changed tensor names

    ---------

    Co-authored-by: Iwan Kawrakow <iwan.kawrakow@gmail.com>

commit 39cc83e8c9fafe1494c4996b07f97afed29c9f27
Merge: 2d17c22 6381d4e
Author: Concedo <39025047+LostRuins@users.noreply.github.com>
Date:   Tue Aug 22 23:12:47 2023 +0800

    incomplete merge, compiles but generates rubbish

commit 519c981f8b65ee6c87c2965539685ced0a17223b
Author: slaren <slarengh@gmail.com>
Date:   Tue Aug 22 16:03:12 2023 +0200

    embedding : evaluate prompt in batches (#2713)

commit 1123f7fbdfb8012e46f05e903e6f675922916378
Author: slaren <slarengh@gmail.com>
Date:   Tue Aug 22 15:25:19 2023 +0200

    ggml-cuda : use graph allocator (#2684)

    use a different function for no_alloc to avoid breaking backwards compat, fixes lora

    remove 512 n_batch limit

    fixed 2048 batch size

    cleanup

    Co-authored-by: Johannes Gäßler <johannesg@5d6.de>

commit ef3f333d3775600d1646a9fa249aca532d15fb89
Author: Georgi Gerganov <ggerganov@gmail.com>
Date:   Tue Aug 22 14:22:08 2023 +0300

    ggml : sync latest (SAM + SD operators, CUDA alibi) (#2709)

    * ggml : sync latest (SAM + SD operators, CUDA alibi)

    ggml-ci

    * ggml : fix tabs

commit 2d17c224376c0fb2d6cfce8726de5a5f7b666bfe
Merge: 36b0c5b dadbed9
Author: Concedo <39025047+LostRuins@users.noreply.github.com>
Date:   Tue Aug 22 18:20:06 2023 +0800

    functional commit before gguf merge

commit 8e4364f2af9cd5d57240f23e83c0e29bc068bc02
Author: slaren <slarengh@gmail.com>
Date:   Tue Aug 22 09:56:03 2023 +0200

    llama-bench : minor fixes (#2695)

commit 1e3bc523d8053a77df3ac7126a84d0297ee97ef6
Author: Kylin <56434533+KyL0N@users.noreply.github.com>
Date:   Tue Aug 22 15:14:23 2023 +0800

    ggml : support CUDA's half type for aarch64(#1455) (#2670)

    * ggml: support CUDA's half type for aarch64(#1455)
    support CUDA's half type for aarch64 in ggml_fp16_t definition

    * ggml: use __CUDACC__ to recognise nvcc compiler

commit 14b1d7e6f720dee41ce5a826376df738096d9033
Author: Shouzheng Liu <lshzh.hi@gmail.com>
Date:   Tue Aug 22 02:18:40 2023 -0400

    metal : add missing barriers for mul-mat (#2699)

commit 226255b44ef2c2794bfac48d101d35a9c2dbb965
Author: Jhen-Jie Hong <iainst0409@gmail.com>
Date:   Tue Aug 22 08:32:00 2023 +0800

    server : fallback to default if client param is null (#2688)

    * server : fallback to default if client param is null

    * server : do not overwrite 404 if status is 500 from exception_handler

commit 930523c8e1cbbee5449c055daa894917fac6805e
Author: Kerfuffle <44031344+KerfuffleV2@users.noreply.github.com>
Date:   Mon Aug 21 18:01:34 2023 -0600

    Fix convert-llama-ggmlv3-to-gguf.py vocab conversion (#2698)

    When converting without metadata, the hex values for byte entries weren't zero-padded to 2 digits.

commit 2d86b2e219ef988878bdea7e33a534aad3a744da
Author: Pontus Mårdnäs <pontus@mardnas.se>
Date:   Mon Aug 21 23:46:56 2023 +0200

    Add --config argument

commit c8dba409e6d6a754090f08e6a862c5ffdd52e421
Author: Georgi Gerganov <ggerganov@gmail.com>
Date:   Mon Aug 21 23:40:22 2023 +0300

    py : remove obsolete script

commit 6381d4e110bd0ec02843a60bbeb8b6fc37a9ace9
Author: Georgi Gerganov <ggerganov@gmail.com>
Date:   Mon Aug 21 23:07:43 2023 +0300

    gguf : new file format with flexible meta data (beta) (#2398)

    * gguf : first API pass

    * gguf : read header + meta data

    * gguf : read tensor info

    * gguf : initial model loading - not tested

    * gguf : add gguf_get_tensor_name()

    * gguf : do not support passing existing ggml_context to gguf_init

    * gguf : simplify gguf_get_val

    * gguf : gguf.c is now part of ggml.c

    * gguf : read / write sample models

    * gguf : add comments

    * refactor : reduce code duplication and better API (#2415)

    * gguf : expose the gguf_type enum through the API for now

    * gguf : add array support

    * gguf.py : some code style changes

    * convert.py : start a new simplified implementation by removing old stuff

    * convert.py : remove GGML vocab + other obsolete stuff

    * GGUF : write tensor (#2426)

    * WIP: Write tensor

    * GGUF : Support writing tensors in Python

    * refactor : rm unused import and upd todos

    * fix : fix errors upd writing example

    * rm example.gguf

    * gitignore *.gguf

    * undo formatting

    * gguf : add gguf_find_key (#2438)

    * gguf.cpp : find key example

    * ggml.h : add gguf_find_key

    * ggml.c : add gguf_find_key

    * gguf : fix writing tensors

    * gguf : do not hardcode tensor names to read

    * gguf : write sample tensors to read

    * gguf : add tokenization constants

    * quick and dirty conversion example

    * gguf : fix writing gguf arrays

    * gguf : write tensors one by one and code reuse

    * gguf : fix writing gguf arrays

    * gguf : write tensors one by one

    * gguf : write tensors one by one

    * gguf : write tokenizer data

    * gguf : upd gguf conversion script

    * Update convert-llama-h5-to-gguf.py

    * gguf : handle already encoded string

    * ggml.h : get array str and f32

    * ggml.c : get arr str and f32

    * gguf.py : support any type

    * Update convert-llama-h5-to-gguf.py

    * gguf : fix set is not subscriptable

    * gguf : update convert-llama-h5-to-gguf.py

    * constants.py : add layer norm eps

    * gguf.py : add layer norm eps and merges

    * ggml.h : increase GGML_MAX_NAME to 64

    * ggml.c : add gguf_get_arr_n

    * Update convert-llama-h5-to-gguf.py

    * add gptneox gguf example

    * Makefile : add gptneox gguf example

    * Update convert-llama-h5-to-gguf.py

    * add gptneox gguf example

    * Update convert-llama-h5-to-gguf.py

    * Update convert-gptneox-h5-to-gguf.py

    * Update convert-gptneox-h5-to-gguf.py

    * Update convert-llama-h5-to-gguf.py

    * gguf : support custom alignment value

    * gguf : fix typo in function call

    * gguf : mmap tensor data example

    * fix : update convert-llama-h5-to-gguf.py

    * Update convert-llama-h5-to-gguf.py

    * convert-gptneox-h5-to-gguf.py : Special tokens

    * gptneox-main.cpp : special tokens

    * Update gptneox-main.cpp

    * constants.py : special tokens

    * gguf.py : accumulate kv and tensor info data + special tokens

    * convert-gptneox-h5-to-gguf.py : accumulate kv and ti + special tokens

    * gguf : gguf counterpart of llama-util.h

    * gguf-util.h : update note

    * convert-llama-h5-to-gguf.py : accumulate kv / ti + special tokens

    * convert-llama-h5-to-gguf.py : special tokens

    * Delete gptneox-common.cpp

    * Delete gptneox-common.h

    * convert-gptneox-h5-to-gguf.py : gpt2bpe tokenizer

    * gptneox-main.cpp : gpt2 bpe tokenizer

    * gpt2 bpe tokenizer (handles merges and unicode)

    * Makefile : remove gptneox-common

    * gguf.py : bytesarray for gpt2bpe tokenizer

    * cmpnct_gpt2bpe.hpp : comments

    * gguf.py : use custom alignment if present

    * gguf : minor stuff

    * Update gptneox-main.cpp

    * map tensor names

    * convert-gptneox-h5-to-gguf.py : map tensor names

    * convert-llama-h5-to-gguf.py : map tensor names

    * gptneox-main.cpp : map tensor names

    * gguf : start implementing libllama in GGUF (WIP)

    * gguf : start implementing libllama in GGUF (WIP)

    * rm binary committed by mistake

    * upd .gitignore

    * gguf : calculate n_mult

    * gguf :  inference with 7B model working (WIP)

    * gguf : rm deprecated function

    * gguf : start implementing gguf_file_saver (WIP)

    * gguf : start implementing gguf_file_saver (WIP)

    * gguf : start implementing gguf_file_saver (WIP)

    * gguf : add gguf_get_kv_type

    * gguf : add gguf_get_kv_type

    * gguf : write metadata in gguf_file_saver (WIP)

    * gguf : write metadata in gguf_file_saver (WIP)

    * gguf : write metadata in gguf_file_saver

    * gguf : rm references to old file formats

    * gguf : shorter name for member variable

    * gguf : rm redundant method

    * gguf : get rid of n_mult, read n_ff from file

    * Update gguf_tensor_map.py

    * Update gptneox-main.cpp

    * gguf : rm references to old file magics

    * gguf : start implementing quantization (WIP)

    * gguf : start implementing quantization (WIP)

    * gguf : start implementing quantization (WIP)

    * gguf : start implementing quantization (WIP)

    * gguf : start implementing quantization (WIP)

    * gguf : start implementing quantization (WIP)

    * gguf : quantization is working

    * gguf : proper closing of file

    * gguf.py : no need to convert tensors twice

    * convert-gptneox-h5-to-gguf.py : no need to convert tensors twice

    * convert-llama-h5-to-gguf.py : no need to convert tensors twice

    * convert-gptneox-h5-to-gguf.py : simplify nbytes

    * convert-llama-h5-to-gguf.py : simplify nbytes

    * gptneox-main.cpp : n_layer --> n_block

    * constants.py : n_layer --> n_block

    * gguf.py : n_layer --> n_block

    * convert-gptneox-h5-to-gguf.py : n_layer --> n_block

    * convert-llama-h5-to-gguf.py : n_layer --> n_block

    * gptneox-main.cpp : n_layer --> n_block

    * Update gguf_tensor_map.py

    * convert-gptneox-h5-to-gguf.py : load model in parts to save memory

    * convert-llama-h5-to-gguf.py : load model in parts to save memory

    * convert : write more metadata for LLaMA

    * convert : rm quantization version

    * convert-gptneox-h5-to-gguf.py : add file_type key

    * gptneox-main.cpp : add file_type key

    * fix conflicts

    * gguf : add todos and comments

    * convert-gptneox-h5-to-gguf.py : tensor name map changes

    * Create gguf_namemap.py : tensor name map changes

    * Delete gguf_tensor_map.py

    * gptneox-main.cpp : tensor name map changes

    * convert-llama-h5-to-gguf.py : fixes

    * gguf.py : dont add empty strings

    * simple : minor style changes

    * gguf : use UNIX line ending

    * Create convert-llama-7b-pth-to-gguf.py

    * llama : sync gguf-llama.cpp with latest llama.cpp (#2608)

    * llama : sync gguf-llama.cpp with latest llama.cpp

    * minor : indentation + assert

    * llama : refactor gguf_buffer and gguf_ctx_buffer

    * llama : minor

    * gitignore : add gptneox-main

    * llama : tokenizer fixes (#2549)

    * Merge tokenizer fixes into the gguf branch.

    * Add test vocabularies

    * convert : update convert-new.py with tokenizer fixes (#2614)

    * Merge tokenizer fixes into the gguf branch.

    * Add test vocabularies

    * Adapt convert-new.py (and fix a clang-cl compiler error on windows)

    * llama : sync gguf-llama with llama (#2613)

    * llama : sync gguf-llama with llama

    * tests : fix build + warnings (test-tokenizer-1 still fails)

    * tests : fix wstring_convert

    * convert : fix layer names

    * llama : sync gguf-llama.cpp

    * convert : update HF converter to new tokenizer voodoo magics

    * llama : update tokenizer style

    * convert-llama-h5-to-gguf.py : add token types

    * constants.py : add token types

    * gguf.py : add token types

    * convert-llama-7b-pth-to-gguf.py : add token types

    * gguf-llama.cpp :  fix n_head_kv

    * convert-llama-h5-to-gguf.py : add 70b gqa support

    * gguf.py : add tensor data layout

    * convert-llama-h5-to-gguf.py : add tensor data layout

    * convert-llama-7b-pth-to-gguf.py : add tensor data layout

    * gptneox-main.cpp : add tensor data layout

    * convert-llama-h5-to-gguf.py : clarify the reverse permute

    * llama : refactor model loading code (#2620)

    * llama : style formatting + remove helper methods

    * llama : fix quantization using gguf tool

    * llama : simplify gguf_file_saver

    * llama : fix method names

    * llama : simplify write_header()

    * llama : no need to pass full file loader to the file saver

    just gguf_ctx

    * llama : gguf_file_saver write I32

    * llama : refactor tensor names (#2622)

    * gguf: update tensor names searched in quantization

    * gguf : define tensor names as constants

    * gguf : initial write API (not tested yet)

    * gguf : write to file API (not tested)

    * gguf : initial write API ready + example

    * gguf : fix header write

    * gguf : fixes + simplify example + add ggml_nbytes_pad()

    * gguf : minor

    * llama : replace gguf_file_saver with new gguf write API

    * gguf : streaming support when writing files

    * gguf : remove obsolete write methods

    * gguf : remove obsolete gguf_get_arr_xxx API

    * llama : simplify gguf_file_loader

    * llama : move hparams and vocab from gguf_file_loader to llama_model_loader

    * llama : merge gguf-util.h in llama.cpp

    * llama : reorder definitions in .cpp to match .h

    * llama : minor simplifications

    * llama : refactor llama_model_loader (WIP)

    wip : remove ggml_ctx from llama_model_loader

    wip : merge gguf_file_loader in llama_model_loader

    * llama : fix shape prints

    * llama : fix Windows build + fix norm_rms_eps key

    * llama : throw error on missing KV pairs in model meta data

    * llama : improve printing + log meta data

    * llama : switch print order of meta data

    ---------

    Co-authored-by: M. Yusuf Sarıgöz <yusufsarigoz@gmail.com>

    * gguf : deduplicate (#2629)

    * gguf : better type names

    * dedup : CPU + Metal is working

    * ggml : fix warnings about unused results

    * llama.cpp : fix line feed and compiler warning

    * llama : fix strncpy warning + note token_to_str does not write null

    * llama : restore the original load/save session implementation

    Will migrate this to GGUF in the future

    * convert-llama-h5-to-gguf.py : support alt ctx param name

    * ggml : assert when using ggml_mul with non-F32 src1

    * examples : dedup simple

    ---------

    Co-authored-by: klosax <131523366+klosax@users.noreply.github.com>

    * gguf.py : merge all files in gguf.py

    * convert-new.py : pick #2427 for HF 70B support

    * examples/gguf : no need to keep q option for quantization any more

    * llama.cpp : print actual model size

    * llama.cpp : use ggml_elements()

    * convert-new.py : output gguf (#2635)

    * convert-new.py : output gguf (WIP)

    * convert-new.py : add gguf key-value pairs

    * llama : add hparams.ctx_train + no longer print ftype

    * convert-new.py : minor fixes

    * convert-new.py : vocab-only option should work now

    * llama : fix tokenizer to use llama_char_to_byte

    * tests : add new ggml-vocab-llama.gguf

    * convert-new.py : tensor name mapping

    * convert-new.py : add map for skipping tensor serialization

    * convert-new.py : convert script now works

    * gguf.py : pick some of the refactoring from #2644

    * convert-new.py : minor fixes

    * convert.py : update to support GGUF output

    * Revert "ci : disable CI temporary to not waste energy"

    This reverts commit 7e82d25f40386540c2c15226300ad998ecd871ea.

    * convert.py : n_head_kv optional and .gguf file extension

    * convert.py : better always have n_head_kv and default it to n_head

    * llama : sync with recent PRs on master

    * editorconfig : ignore models folder

    ggml-ci

    * ci : update ".bin" to ".gguf" extension

    ggml-ci

    * llama : fix llama_model_loader memory leak

    * gptneox : move as a WIP example

    * llama : fix lambda capture

    ggml-ci

    * ggml : fix bug in gguf_set_kv

    ggml-ci

    * common.h : .bin --> .gguf

    * quantize-stats.cpp : .bin --> .gguf

    * convert.py : fix HF tensor permuting / unpacking

    ggml-ci

    * llama.cpp : typo

    * llama : throw error if gguf fails to init from file

    ggml-ci

    * llama : fix tensor name grepping during quantization

    ggml-ci

    * gguf.py : write tensors in a single pass (#2644)

    * gguf : single pass for writing tensors + refactoring writer

    * gguf : single pass for writing tensors + refactoring writer

    * gguf : single pass for writing tensors + refactoring writer

    * gguf : style fixes in simple conversion script

    * gguf : refactor gptneox conversion script

    * gguf : rename h5 to hf (for HuggingFace)

    * gguf : refactor pth to gguf conversion script

    * gguf : rm file_type key and method

    * gguf.py : fix vertical alignment

    * gguf.py : indentation

    ---------

    Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>

    * convert-gptneox-hf-to-gguf.py : fixes

    * gguf.py : gptneox mapping

    * convert-llama-hf-to-gguf.py : fixes

    * convert-llama-7b-pth-to-gguf.py : fixes

    * ggml.h : reverse GGUF_MAGIC

    * gguf.py : reverse GGUF_MAGIC

    * test-tokenizer-0.cpp : fix warning

    * llama.cpp : print kv general.name

    * llama.cpp : get special token kv and linefeed token id

    * llama : print number of tensors per type + print arch + style

    * tests : update vocab file with new magic

    * editorconfig : fix whitespaces

    * llama : re-order functions

    * llama : remove C++ API + reorganize common source in /common dir

    * llama : minor API updates

    * llama : avoid hardcoded special tokens

    * llama : fix MPI build

    ggml-ci

    * llama : introduce enum llama_vocab_type + remove hardcoded string constants

    * convert-falcon-hf-to-gguf.py : falcon HF --> gguf conversion, not tested

    * falcon-main.cpp : falcon inference example

    * convert-falcon-hf-to-gguf.py : remove extra kv

    * convert-gptneox-hf-to-gguf.py : remove extra kv

    * convert-llama-7b-pth-to-gguf.py : remove extra kv

    * convert-llama-hf-to-gguf.py : remove extra kv

    * gguf.py : fix for falcon 40b

    * falcon-main.cpp : fix for falcon 40b

    * convert-falcon-hf-to-gguf.py : update ref

    * convert-falcon-hf-to-gguf.py : add tensor data layout

    * cmpnct_gpt2bpe.hpp : fixes

    * falcon-main.cpp : fixes

    * gptneox-main.cpp : fixes

    * cmpnct_gpt2bpe.hpp : remove non-general stuff

    * Update examples/server/README.md

    Co-authored-by: slaren <slarengh@gmail.com>

    * cmpnct_gpt2bpe.hpp : cleanup

    * convert-llama-hf-to-gguf.py : special tokens

    * convert-llama-7b-pth-to-gguf.py : special tokens

    * convert-permute-debug.py : permute debug print

    * convert-permute-debug-master.py : permute debug for master

    * convert-permute-debug.py : change permute type of attn_q

    * convert.py : 70b model working (change attn_q permute)

    * Delete convert-permute-debug-master.py

    * Delete convert-permute-debug.py

    * convert-llama-hf-to-gguf.py : fix attn_q permute

    * gguf.py : fix rope scale kv

    * convert-llama-hf-to-gguf.py : rope scale and added tokens

    * convert-llama-7b-pth-to-gguf.py : rope scale and added tokens

    * llama.cpp : use rope scale kv

    * convert-llama-7b-pth-to-gguf.py : rope scale fix

    * convert-llama-hf-to-gguf.py : rope scale fix

    * py : fix whitespace

    * gguf : add Python script to convert GGMLv3 LLaMA models to GGUF (#2682)

    * First pass at converting GGMLv3 LLaMA models to GGUF

    * Cleanups, better output during conversion

    * Fix vocab space conversion logic

    * More vocab conversion fixes

    * Add description to converted GGUF files

    * Improve help text, expand warning

    * Allow specifying name and description for output GGUF

    * Allow overriding vocab and hyperparams from original model metadata

    * Use correct params override var name

    * Fix wrong type size for Q8_K

    Better handling of original style metadata

    * Set default value for gguf add_tensor raw_shape KW arg

    * llama : improve token type support (#2668)

    * Merge tokenizer fixes into the gguf branch.

    * Add test vocabularies

    * Adapt convert-new.py (and fix a clang-cl compiler error on windows)

    * Improved tokenizer test

    But does it work on MacOS?

    * Improve token type support

    - Added @klosax code to convert.py
    - Improved token type support in vocabulary

    * Exclude platform dependent tests

    * More sentencepiece compatibility by eliminating magic numbers

    * Restored accidentally removed comment

    * llama : add API for token type

    ggml-ci

    * tests : use new tokenizer type API (#2692)

    * Merge tokenizer fixes into the gguf branch.

    * Add test vocabularies

    * Adapt convert-new.py (and fix a clang-cl compiler error on windows)

    * Improved tokenizer test

    But does it work on MacOS?

    * Improve token type support

    - Added @klosax code to convert.py
    - Improved token type support in vocabulary

    * Exclude platform dependent tests

    * More sentencepiece compatibility by eliminating magic numbers

    * Restored accidentally removed comment

    * Improve commentary

    * Use token type API in test-tokenizer-1.cpp

    * py : cosmetics

    * readme : add notice about new file format

    ggml-ci

    ---------

    Co-authored-by: M. Yusuf Sarıgöz <yusufsarigoz@gmail.com>
    Co-authored-by: klosax <131523366+klosax@users.noreply.github.com>
    Co-authored-by: goerch <jhr.walter@t-online.de>
    Co-authored-by: slaren <slarengh@gmail.com>
    Co-authored-by: Kerfuffle <44031344+KerfuffleV2@users.noreply.github.com>

commit dadbed99e65252d79f81101a392d0d6497b86caa
Author: Shouzheng Liu <lshzh.hi@gmail.com>
Date:   Mon Aug 21 06:59:29 2023 -0400

    metal : fix synchronization in new matrix multiplication kernel (#2686)

commit cb1c0727bd59803b439b6a3af121c99e6393ff3d
Author: Kawrakow <48489457+ikawrakow@users.noreply.github.com>
Date:   Mon Aug 21 11:11:31 2023 +0300

    HellaSwag: split token evaluation into batches if needed (#2681)

    Co-authored-by: Iwan Kawrakow <iwan.kawrakow@gmail.com>

commit 9e232f0234073358e7031c1b8d7aa45020469a3b
Author: slaren <slarengh@gmail.com>
Date:   Sun Aug 20 22:17:53 2023 +0200

    ggml : move all type info to ggml_type_traits (#2663)

commit 5e9ff54a675d163d9f42aad1b5b3e734f17b2701
Author: Kawrakow <48489457+ikawrakow@users.noreply.github.com>
Date:   Sun Aug 20 16:44:46 2023 +0300

    More efficient Hellaswag implementation (#2677)

    Co-authored-by: Iwan Kawrakow <iwan.kawrakow@gmail.com>

commit b34f4bd2724733e188ec4f6074042f66a5ed28c9
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Sat Aug 19 17:12:52 2023 -0500

    Update README.md

commit 1f0bccb27929e261744c979bc75114955da49e98
Author: Georgi Gerganov <ggerganov@gmail.com>
Date:   Sat Aug 19 00:45:36 2023 +0300

    server : better default prompt (#2646)

commit f63564adfaa157ca387071d6b9a06cfaef0ef576
Author: Jhen-Jie Hong <iainst0409@gmail.com>
Date:   Sat Aug 19 05:41:32 2023 +0800

    server : update xxd usage for older versions compatibility (#2649)

    * server : update xxd usage for older versions compatibility

    * remove unused $func

commit 2d8b76a110d76ff6b5728ff0af8477531e4db60e
Author: Adrian <smith.adriane@gmail.com>
Date:   Fri Aug 18 12:39:22 2023 -0700

    Add link to clojure bindings to Readme. (#2659)

commit 7af633aec339367e36c867ae709088d6a801aa75
Author: Georgi Gerganov <ggerganov@gmail.com>
Date:   Fri Aug 18 17:48:31 2023 +0300

    readme : incoming BREAKING CHANGE

commit 097e121e2f17ed3541cf02c55ff7e9febc091b19
Author: slaren <slarengh@gmail.com>
Date:   Fri Aug 18 12:44:58 2023 +0200

    llama : add benchmark example (#2626)

    * llama : add benchmark example

    * add to examples CMakeLists.txt

    * fix msvc build

    * add missing include

    * add Bessel's correction to stdev calculation

    Co-authored-by: Johannes Gäßler <johannesg@5d6.de>

    * improve markdown formatting

    * add missing include

    * print warning if NDEBUG is not defined

    * remove n_prompt and n_gen from the matrix, use each value separately instead

    * better checks for non-optimized builds

    * llama.cpp : fix MEM_REQ_SCRATCH0 reusing the value of n_ctx of the first call

    * fix json formatting

    * add sql output

    * add basic cpu and gpu info (linux/cuda only)

    * markdown: also show values that differ from the default

    * markdown: add build id

    * cleanup

    * improve formatting

    * formatting

    ---------

    Co-authored-by: Johannes Gäßler <johannesg@5d6.de>
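
The benchmark commit above mentions adding Bessel's correction to its standard deviation calculation. As a minimal sketch of what that correction means (not the benchmark's actual code; the function names and sample values here are hypothetical), the sum of squared deviations is divided by n - 1 instead of n when the data is a sample of repeated runs:

```python
import math
from typing import Sequence


def population_stdev(xs: Sequence[float]) -> float:
    """Standard deviation dividing by n (no Bessel's correction)."""
    n = len(xs)
    mean = sum(xs) / n
    return math.sqrt(sum((x - mean) ** 2 for x in xs) / n)


def sample_stdev(xs: Sequence[float]) -> float:
    """Standard deviation dividing by n - 1 (Bessel's correction)."""
    n = len(xs)
    if n < 2:
        return 0.0  # assumption: report 0 when there is only one run
    mean = sum(xs) / n
    return math.sqrt(sum((x - mean) ** 2 for x in xs) / (n - 1))


# Hypothetical tokens-per-second measurements from three runs.
runs = [10.0, 12.0, 14.0]
print(sample_stdev(runs))      # 2.0
print(population_stdev(runs))  # smaller than the corrected value
```

With only a handful of repetitions the two estimates differ noticeably, which is why the correction matters for short benchmark runs.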

commit eaf98c2649d7da705de255712f0038ac7e47c610
Author: mdrokz <mohammadmunshi@gmail.com>
Date:   Fri Aug 18 15:47:58 2023 +0530

    readme : add link to Rust bindings (#2656)

commit e9b12c332ec6e215fbac4b2ef165353acbcd8319
Author: Georgi Gerganov <ggerganov@gmail.com>
Date:   Fri Aug 18 12:48:55 2023 +0300

    perplexity : more meaningful ETA number - 2 decimal points

commit 604b8bdfa6320bbcb018eebcc1252dfede603c6b
Author: Evan Jones <evan.q.jones@gmail.com>
Date:   Thu Aug 17 19:54:44 2023 -0400

    Fix unicode in grammars (fixes #2501) (#2553)

    * Fix unicode in grammars (fixes #2501)

    * add more comments

    * fix test-llama-grammar

commit 10151bee2e38b5711335c4a38f6ca93b50223acf
Author: staviq <staviq@gmail.com>
Date:   Thu Aug 17 23:34:01 2023 +0000

    server : support for saving templates in browser LocalStorage (#2486)

    * support for templates in browser LocalStorage

    * sync accepted #2409 fix from upstream

    * convert autosave invocation to useEffect

    * Apply suggestions from code review

    Co-authored-by: Jhen-Jie Hong <iainst0409@gmail.com>

    * Regen index.html.cpp, suggested from code review

    ---------

    Co-authored-by: Jhen-Jie Hong <iainst0409@gmail.com>

commit 0992a7b8b18a89e29a205efb48ceb559c9a08203
Author: Johannes Gäßler <johannesg@5d6.de>
Date:   Thu Aug 17 23:57:59 2023 +0200

    README: fix LLAMA_CUDA_MMV_Y documentation (#2647)

commit 6ddeefad9b634c5c79e6bcf046523493ff1fdf7d
Author: Henri Vasserman <henv@hot.ee>
Date:   Thu Aug 17 23:11:18 2023 +0300

    [Zig] Fixing Zig build and improvements (#2554)

    * Fix zig after console.o was split

    * Better include and flag management

    * Change LTO to option

commit 36b0c5b39816c039a5235733cfcd2b4e32386ff9
Author: Concedo <39025047+LostRuins@users.noreply.github.com>
Date:   Thu Aug 17 22:45:49 2023 +0800

    fix for incorrect missing backends displayed

commit 8dae7ce68437faf1fa96ec0e7687b8700956ef20
Author: Kerfuffle <44031344+KerfuffleV2@users.noreply.github.com>
Date:   Thu Aug 17 07:29:44 2023 -0600

    Add --cfg-negative-prompt-file option for examples (#2591)

    Add --cfg-negative-prompt-file option for examples

commit a73ccf1aa34de49f61bfeb7f8a679c3bfdb3abe3
Author: Georgi Gerganov <ggerganov@gmail.com>
Date:   Thu Aug 17 10:47:09 2023 +0300

    llama : replace (permute + reshape + view_1d) with (view_3d) (#2538)

    ggml-ci

commit 7cf54e1f746941279d81d485796777c01f88049c
Author: drbh <david.richard.holtz@gmail.com>
Date:   Thu Aug 17 03:41:01 2023 -0400

    tests : adds simple llama grammar tests (#2618)

    * adds simple llama grammar tests

    * fix lint and add Makefile

    * 0 terminate code_points

    * avoid dangling pointers in candidate cleanup

    * cleanup grammar at end of test

commit a872a2b28eaefc8d464eaa535c94deeb501666f9
Author: Shouzheng Liu <lshzh.hi@gmail.com>
Date:   Thu Aug 17 03:35:53 2023 -0400

    ggml-alloc : fix discrepancy between measure&eval (#2639)

    The GGML memory allocator consistently places a tensor within the
    optimal-fit memory block, which is the smallest block capable of
    accommodating the tensor's size. During the measurement phase, the final
    block is generously sized, ensuring it never qualifies as the
    optimal-fit block as long as there exists another block capable of
    accommodating the tensor. Nevertheless, in the evaluation phase, the
    last block is constrained in size and could potentially qualify as the
    optimal-fit block. Consequently, there exists the possibility of a
    tensor being allocated to a different region during evaluation, leading
    to more memory fragmentation in our scratch buffer.

    This commit makes the allocator behave identically in the measurement
    and evaluation phases, eliminating discrepancies between the two.
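
The effect described in the ggml-alloc message above can be reproduced with a toy best-fit search. This is only an illustrative sketch: the block sizes are invented, and `best_fit_except_last` is one assumed way to make both phases agree, not necessarily what the commit does.

```python
from typing import List, Optional


def best_fit(free_blocks: List[int], size: int) -> Optional[int]:
    """Return the index of the smallest free block that can hold `size`."""
    best = None
    for i, block in enumerate(free_blocks):
        if block >= size and (best is None or block < free_blocks[best]):
            best = i
    return best


def best_fit_except_last(free_blocks: List[int], size: int) -> Optional[int]:
    """Assumed fix: use the last block only when no other block fits."""
    idx = best_fit(free_blocks[:-1], size)
    if idx is not None:
        return idx
    if free_blocks and free_blocks[-1] >= size:
        return len(free_blocks) - 1
    return None


# Hypothetical free lists; the last entry stands for the final block.
MEASURE = [64, 256, 10**9]  # measurement: final block is generously sized
EVAL = [64, 256, 128]       # evaluation: final block is constrained

print(best_fit(MEASURE, 100), best_fit(EVAL, 100))  # 1 2
print(best_fit_except_last(MEASURE, 100), best_fit_except_last(EVAL, 100))  # 1 1
```

With plain best-fit the same 100-byte request lands in different blocks in the two phases; treating the last block as a fallback gives the same placement in both.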

commit 0919a0f73d95cfb93a1646a1d1741a0615fe2c5e
Author: Kolen Cheung <ickc@users.noreply.github.com>
Date:   Wed Aug 16 21:09:49 2023 +0100

    cmake : install ggml-meta.metal if LLAMA_METAL (#2449)

commit ed53db86c3b0e0815331a96d7a379edb5e62472c
Author: Jhen-Jie Hong <iainst0409@gmail.com>
Date:   Thu Aug 17 04:09:03 2023 +0800

    metal : print error of load pipeline state (#2564)

    * metal : print error of load pipeline state

    * metal : return null if load pipeline failed

commit fc8ef549e50087762a0b4f901cd74b2defcc6ae3
Author: Shouzheng Liu <lshzh.hi@gmail.com>
Date:   Wed Aug 16 16:08:28 2023 -0400

    metal : enable ggml-alloc (#2627)

    * metal: enable ggml-alloc

    Make ggml-alloc work with concurrent dispatch.

    * style-fix

    Co-authored-by: slaren <slarengh@gmail.com>

    ---------

    Co-authored-by: slaren <slarengh@gmail.com>
    Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>

commit bf83bff6742c0f1795b4c18695a13a34ac7adf62
Author: Shouzheng Liu <lshzh.hi@gmail.com>
Date:   Wed Aug 16 16:07:04 2023 -0400

    metal : matrix-matrix multiplication kernel (#2615)

    * metal: matrix-matrix multiplication kernel

    This commit removes MPS and uses custom matrix-matrix multiplication
    kernels for all quantization types. This commit also adds grouped-query
    attention to support llama2 70B.

    * metal: fix performance degradation from gqa

    Integers are slow on the GPU, and 64-bit divides are extremely slow.
    In the context of GQA, we introduce a 64-bit divide that cannot be
    optimized out by the compiler, which results in a decrease of ~8% in
    inference performance. This commit fixes that issue by calculating a
    part of the offset with a 32-bit divide. Naturally, this limits the
    size of a single matrix to ~4GB. However, this limitation should
    suffice for the near future.

    * metal: fix bugs for GQA and perplexity test.

    I mixed up ne02 and nb02 in the previous commit.
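
The ~4GB limit mentioned in the Metal commit above follows from the range of a 32-bit unsigned integer. The sketch below shows that arithmetic together with the usual grouped-query attention head mapping. It is not the kernel's actual index math, which this log does not show; the helper names are hypothetical, and the 64/8 head counts are those of LLaMA 2 70B.

```python
U32_LIMIT = 2**32  # number of distinct values of an unsigned 32-bit integer


def kv_head_for(q_head: int, n_head: int, n_head_kv: int) -> int:
    """Map a query head to the key/value head it shares under GQA."""
    group = n_head // n_head_kv  # query heads per key/value head
    return q_head // group       # integer divide


def fits_u32(n_bytes: int) -> bool:
    """True if every byte offset into a buffer of n_bytes fits in 32 bits."""
    return 0 <= n_bytes <= U32_LIMIT


print(kv_head_for(63, 64, 8))  # 7
print(U32_LIMIT / 2**30)       # 4.0 (GiB)
print(fits_u32(3 * 2**30), fits_u32(5 * 2**30))  # True False
```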

commit 075d079a72c741050a4c31a27530c8af19df70a6
Merge: 469d70b b5ffb28
Author: Concedo <39025047+LostRuins@users.noreply.github.com>
Date:   Wed Aug 16 10:43:06 2023 +0800

    Merge branch 'master' into concedo_experimental

    # Conflicts:
    #	CMakeLists.txt
    #	Makefile
    #	ggml-cuda.cu
    #	llama-util.h
    #	tests/CMakeLists.txt

commit b5ffb2849d23afe73647f68eec7b68187af09be6
Author: Georgi Gerganov <ggerganov@gmail.com>
Date:   Tue Aug 15 10:04:58 2023 +0300

    scripts : add helper script to get wikitext

commit 469d70be45dfdac4d926c1326b579e88d0f0e040
Author: Concedo <39025047+LostRuins@users.noreply.github.com>
Date:   Tue Aug 15 13:49:05 2023 +0800

    add support for precompiled binaries, used as a fallback

commit 7d1196108ad330b32845546fb3472c2172a0b6b8
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Mon Aug 14 23:03:12 2023 -0500

    remove force DMMV

commit 3ebb00935f3f0522b75df49c2769ab1774b91380
Author: Jhen-Jie Hong <iainst0409@gmail.com>
Date:   Tue Aug 15 06:14:14 2023 +0800

    server : add missing /json-schema-to-grammar.mjs (#2616)

    fixes #2611

commit d783f7982e0e823a2626a9956359c0d36c1a7e21
Author: Jhen-Jie Hong <iainst0409@gmail.com>
Date:   Mon Aug 14 21:37:39 2023 +0800

    metal : return null instead of exit(1) (#2573)

commit d75561df207d22790609ee0ad924302f66ac2599
Author: Cheng Shao <terrorjack@type.dance>
Date:   Mon Aug 14 15:36:42 2023 +0200

    server : add --numa support (#2524)

commit 348acf188c9fbe66396990f2dc83229df367969b
Author: Kamil Tomšík <info@tomsik.cz>
Date:   Mon Aug 14 15:35:16 2023 +0200

    llama : add missing enum keyword in function signatures (#2610)

commit 1cd06fa25eb859b14b3427a1d815a48f25fc3c34
Author: Johannes Gäßler <johannesg@5d6.de>
Date:   Mon Aug 14 10:41:22 2023 +0200

    CUDA: launch_bounds, small q4_K, q5_K mmq refactor (#2596)

commit 2feb8934eb75ca63f3c42724229cce1df9579c8e
Author: Jhen-Jie Hong <iainst0409@gmail.com>
Date:   Mon Aug 14 16:20:17 2023 +0800

    server : fix default grammar by using an empty string in the UI (#2604)

commit 5517d6e69214cdead000a76983b9fe175c3f8329
Author: Jhen-Jie Hong <iainst0409@gmail.com>
Date:   Mon Aug 14 15:16:54 2023 +0800

    server : implement json-schema-to-grammar.mjs & add grammar param in the UI (#2588)

    * server : implement json-schema-to-grammar.mjs following the Python implementation

    * server : add grammar support in chat.mjs

    * server : implement grammar param in the UI

    * server : generate .hpp

    * server : remove trailing whitespaces

    * server : generate .hpp

    * server : fix sort of prop pairs

    * server : optimize regex & iteration

commit f31b5397143009d682db90fd2a6cde83f1ef00eb
Author: vxiiduu <73044267+vxiiduu@users.noreply.github.com>
Date:   Mon Aug 14 13:59:16 2023 +1000

    Enhance Windows 7 and below compatibility. (#2592)

    * Enhance Windows 7 compatibility.
    * Clean away unnecessary preprocessor conditional

commit ee77efea2a1e3f7d153976b0934522b6bbaa62e6
Author: drbh <david.richard.holtz@gmail.com>
Date:   Sun Aug 13 10:00:48 2023 -0400

    test : add simple grammar parsing tests (#2594)

    * adds simple grammar parsing tests

    * adds cassert header

commit f64d44a9b9581cd58f7ec40f4fa1c3ca5ca18e1e
Author: Johannes Gäßler <johannesg@5d6.de>
Date:   Sun Aug 13 00:24:45 2023 +0200

    CUDA: Fixed OpenLLaMA 3b mmq, reduced compile time (#2590)

commit cd61aa0d9e16627935c7978adf488a679ddfa745
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Sat Aug 12 17:24:31 2023 -0500

    restore main_gpu parameter

commit 4a042f326830271a4c31104051b7b08e08ac234e
Author: Henri Vasserman <henv@hot.ee>
Date:   Sat Aug 12 10:51:46 2023 +0300

    gfx1100 support

    ---------

    Co-authored-by: ardfork <134447697+ardfork@users.noreply.github.com>
    Co-authored-by: jammm <2500920+jammm@users.noreply.github.com>
    Co-authored-by: jdecourval <7315817+jdecourval@users.noreply.github.com>

commit 8913bc6fea97d3cb860937b0461f455c6abe3ea1
Author: Henri Vasserman <henv@hot.ee>
Date:   Fri Aug 11 10:16:02 2023 +0300

    Allow overriding CC_TURING

commit e77a4c37a756c002e97173f4122e088fb304e18a
Author: Henri Vasserman <henv@hot.ee>
Date:   Fri Aug 11 10:00:07 2023 +0300

    Merge 'origin/master' into hipblas

commit cc4c4e355cd553b1557d5fba2562e824db93f9b4
Author: Engininja2 <139037756+Engininja2@users.noreply.github.com>
Date:   Fri Aug 11 09:43:14 2023 +0300

    New __dp4a assembly

    Now compatible with gfx900 and faster as well.

commit 1a03b709848ce68d5bf5966237756167e2cac540
Author: Henri Vasserman <henv@hot.ee>
Date:   Fri Aug 11 09:30:28 2023 +0300

    Undo mess

    ---------

    Co-authored-by: ardfork <134447697+ardfork@users.noreply.github.com>

commit 4366ff9ba1b1f12e494118ef9b5198479022fcc5
Author: DannyDaemonic <DannyDaemonic@gmail.com>
Date:   Thu Aug 10 13:11:36 2023 -0700

    Handle `ENABLE_VIRTUAL_TERMINAL_PROCESSING` more gracefully on earlier versions of Windows.

commit 811ff855a24323cafddc95c1b8aca711fef05f76
Author: Christian Demsar <crasm@git.vczf.us>
Date:   Thu Aug 10 10:28:27 2023 -0400

    Add --n-predict -2 for stopping generation on full context (#2565)

commit 37c9717aaa6815b6a5be21aaab970212f20fe6bf
Author: Martin Krasser <krasserm@googlemail.com>
Date:   Thu Aug 10 12:16:38 2023 +0200

    Fix grammar-based sampling issue in server (#2566)

commit 9483288e0318a4dcc2e08eb817dfdd09c6552533
Merge: dae9dff b19edd5
Author: Concedo <39025047+LostRuins@users.noreply.github.com>
Date:   Sat Aug 12 16:04:11 2023 +0800

    Merge branch 'master' into concedo_experimental

    # Conflicts:
    #	Makefile

commit b19edd54d51cef5e3616c18b1d0d8626895b2cba
Author: byte-6174 <88070277+byte-6174@users.noreply.github.com>
Date:   Fri Aug 11 19:17:25 2023 -0400

    Adding support for llama2.c models (#2559)

commit 53dc399472d5bd35ee739b865e843b1996bd3814
Author: Equim <sayaka@ekyu.moe>
Date:   Sat Aug 12 06:35:14 2023 +0800

    server: fixed wrong variable name in timing json (#2579)

    * server: fixed wrong variable name in timing json

    * remove redundant entry

commit dae9dffa6aa53923cfbb09ac5de7e08f34920733
Author: Concedo <39025047+LostRuins@users.noreply.github.com>
Date:   Fri Aug 11 14:54:27 2023 +0800

    rename koboldcpp.dll to koboldcpp_default.dll

commit 9ca4abed893685692f90413e4d43153af12342d9
Author: DannyDaemonic <DannyDaemonic@gmail.com>
Date:   Thu Aug 10 13:11:36 2023 -0700

    Handle `ENABLE_VIRTUAL_TERMINAL_PROCESSING` more gracefully on earlier versions of Windows.

commit d18ecd5b9e5dde58ae08a3eef1637406159ddaca
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Thu Aug 10 13:19:41 2023 -0500

    make mmq gen faster for amd

commit 243894a952147a4fac5b6aee748861a0df6cc2c6
Author: Henri Vasserman <henv@hot.ee>
Date:   Thu Aug 10 12:14:40 2023 +0300

    ws fix

commit ac2f14da445ea87d73539adbd29d19ff2c9eba58
Author: Engininja2 <139037756+Engininja2@users.noreply.github.com>
Date:   Thu Aug 10 12:11:27 2023 +0300

    AMD assembly optimized __dp4a

    Doesn't seem to work for gfx900, so commented out.

commit 9dba0c985f140ddded8cbb671f139e81fff82eed
Author: Henri Vasserman <henv@hot.ee>
Date:   Thu Aug 10 12:09:28 2023 +0300

    Fix merge

    ---------

    Co-authored-by: ardfork <134447697+ardfork@users.noreply.github.com>
    Co-authored-by: Kerfuffle <44031344+KerfuffleV2@users.noreply.github.com>

commit e59fcb2bc129881f4a269fee748fb38bce0a64de
Author: Christian Demsar <crasm@git.vczf.us>
Date:   Thu Aug 10 10:28:27 2023 -0400

    Add --n-predict -2 for stopping generation on full context (#2565)
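
As a rough illustration of the option added by this commit, the stop condition can be sketched as follows. This is not the code from the example program; `should_stop` and `count_generated` are hypothetical names, and the sketch assumes the convention that -1 means no limit and -2 means stop once the context is full.

```python
def should_stop(n_generated: int, n_past: int, n_ctx: int, n_predict: int) -> bool:
    """Decide whether generation should end before sampling another token."""
    if n_predict >= 0:
        return n_generated >= n_predict  # fixed token budget
    if n_predict == -2:
        return n_past >= n_ctx           # stop as soon as the context is full
    return False                         # -1: keep going indefinitely


def count_generated(n_prompt: int, n_ctx: int, n_predict: int, safety_cap: int = 1000) -> int:
    """Toy loop: count how many tokens would be produced (capped for the demo)."""
    n_past, n_generated = n_prompt, 0
    while n_generated < safety_cap and not should_stop(n_generated, n_past, n_ctx, n_predict):
        n_past += 1
        n_generated += 1
    return n_generated


print(count_generated(n_prompt=6, n_ctx=8, n_predict=-2))   # 2
print(count_generated(n_prompt=6, n_ctx=16, n_predict=5))   # 5
```

In a shell loop that runs many generations, stopping at a full context avoids whatever work would otherwise be spent continuing past it, which is the throughput gain the pull request describes.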

commit 886f4eed7948f494e3da1d48d4f6f844e2f9a2c2
Author: Concedo <39025047+LostRuins@users.noreply.github.com>
Date:   Thu Aug 10 22:01:33 2023 +0800

    updated lite, up ver, remove bell

commit 1638757767072a4957f52b9e3594f0b67610631b
Author: Martin Krasser <krasserm@googlemail.com>
Date:   Thu Aug 10 12:16:38 2023 +0200

    Fix grammar-based sampling issue in server (#2566)

commit c5f5209d37b09325377e36f39eab0b0f0c0d006e
Author: Concedo <39025047+LostRuins@users.noreply.github.com>
Date:   Thu Aug 10 16:30:02 2023 +0800

    globalize args

commit f570b5cb1070591527a82d94bba408927b37778d
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Wed Aug 9 22:11:20 2023 -0500

    Revert "revert cuda changes as they are buggy"

    This reverts commit 1541bf879772aeeed8ff646bfc52185c2a88b79b.

commit 1541bf879772aeeed8ff646bfc52185c2a88b79b
Author: Concedo <39025047+LostRuins@users.noreply.github.com>
Date:   Wed Aug 9 22:36:41 2023 +0800

    revert cuda changes as they are buggy

commit bacc20203efb1839aa313858a04d75255bb4b7f4
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Wed Aug 9 20:37:17 2023 -0500

    Merge remote-tracking branch 'upstream/concedo'

commit b7cb4cfd109986bd66e8fd382d1e2516eaddfebb
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Wed Aug 9 20:00:52 2023 -0500

    additional fixes

commit fadae727baa3735ad3e0667384d6e05ca056b3ef
Merge: 518eb2a 8f8ab6c
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Wed Aug 9 18:45:50 2023 -0500

    Merge branch 'hipblas' into develop4Main

commit 518eb2af9225f8300a108c4244c7eb0a2217c3bc
Merge: bda0215 cae6a84
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Wed Aug 9 18:32:10 2023 -0500

    Merge remote-tracking branch 'upstream/concedo' into develop2Main

commit bda0215b413bafc49890aa23fc35f96a191fb3e0
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Wed Aug 9 18:17:54 2023 -0500

    update makefile to multisystem path

commit 8f8ab6c4c049df501e9a5ed8fef3aa0fc0691421
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Wed Aug 9 18:05:03 2023 -0500

    hipLDFLAG Path change Unix to multisystem in Makefile

    Changed the hipBLAS linker path from the hardcoded Linux location -L/opt/rocm/lib to the defined ROCM_PATH variable, so the Makefile also works with ROCm on non-Linux systems.

commit 610ba4cfc460ed65c4adc32d3365a216690384d5
Merge: 4024f91 25d43e0
Author: Henri Vasserman <henv@hot.ee>
Date:   Wed Aug 9 23:54:58 2023 +0300

    Merge 'origin/master' into hipblas

commit 916a9acdd0a411426690400ebe2bb7ce840a6bba
Author: Sam Spilsbury <smspillaz@gmail.com>
Date:   Wed Aug 9 23:47:42 2023 +0300

    ggml-alloc: Don't try to re-use buffers of external tensors (#2562)

    * ggml-alloc: Don't try to re-use buffers of external tensors

    They might be weights that came from another context, so we
    have no control over them (and they might be re-used elsewhere
    so writing to them would be a bad idea).

    * ggml-alloc: >= when checking for out-of-bounds

    Co-authored-by: slaren <slarengh@gmail.com>

    ---------

    Co-authored-by: slaren <slarengh@gmail.com>

commit ea04a4ca1940d92becc0ee26523aa2c4a18cf938
Author: grahameth <96447521+grahameth@users.noreply.github.com>
Date:   Wed Aug 9 22:46:40 2023 +0200

    add log_callback to llama_context_params for custom logging. (#2234)

    * add log_callback to llama_context_params for custom logging.

    * Fix macro expansion on gcc

    * Add struct llama_state for global variables and move log_callback there

    * Turn log level into enum and some minor changes.

    * Remove model_for_logging parameter (not needed anymore)

    * Convert remaining fprintf(stderr, ...) calls to use new macros.

    * Fix enum and initialize g_state

    * Fix log calls after merge

    * Fix missing static

    * Add back all the new lines in the logging strings

    * Add comment for llama_log_callback and replace remaining printf calls

    ---------

    Co-authored-by: grahameth <->
    Co-authored-by: Helmut <helmut.buhler@inf.h-brs.de>

commit a07e6dd3ad1a622f08c3187799879d4f1c49bad4
Author: Concedo <39025047+LostRuins@users.noreply.github.com>
Date:   Wed Aug 9 22:36:41 2023 +0800

    revert cuda changes as they are buggy

commit f8376c7e610f68d07e079ff91f6988fb7a8399e2
Author: Concedo <39025047+LostRuins@users.noreply.github.com>
Date:   Wed Aug 9 21:23:33 2023 +0800

    up ver, fixed compile (+1 squashed commits)

    Squashed commits:

    [ca51aa9e] up ver

commit ba09f1c807956c59d8c64988626e95459f627ced
Merge: 3a7853d 25d43e0
Author: Concedo <39025047+LostRuins@users.noreply.github.com>
Date:   Wed Aug 9 21:18:34 2023 +0800

    Merge branch 'master' into concedo_experimental

    # Conflicts:
    #	README.md
    #	ggml-cuda.cu

commit 3a7853d259c242d4977e9f4dc7627a799d5812b4
Author: Concedo <39025047+LostRuins@users.noreply.github.com>
Date:   Wed Aug 9 21:07:57 2023 +0800

    handle stablecode-completion-alpha-3b

commit 25d43e0eb578b6e73046d9d6644a3a14d460600d
Author: Johannes Gäßler <johannesg@5d6.de>
Date:   Wed Aug 9 09:42:34 2023 +0200

    CUDA: tuned mul_mat_q kernels (#2546)

commit 90058d96b0c6ab77802e153c23fad66d2f21a438
Author: Concedo <39025047+LostRuins@users.noreply.github.com>
Date:   Wed Aug 9 15:28:07 2023 +0800

    sleep longer before exit

commit 19cf2a8663938c424407544c13749f371104517b
Author: Concedo <39025047+LostRuins@users.noreply.github.com>
Date:   Wed Aug 9 12:42:59 2023 +0800

    add idle field and up ver

commit 4b8a354895e078d3f0cafdf53430d72d3af8bb99
Author: Concedo <39025047+LostRuins@users.noreply.github.com>
Date:   Wed Aug 9 12:25:21 2023 +0800

    cudatoolkit version

commit 159ad9269d95bc07720c79debc23b5c466357b53
Author: Concedo <39025047+LostRuins@users.noreply.github.com>
Date:   Wed Aug 9 11:50:12 2023 +0800

    up ver, set the cuda pool malloc lookahead back to 5% instead of 2% (+1 squashed commits)

    Squashed commits:

    [e0f65278] up ver, set the cuda pool malloc lookahead back to 5% instead of 2%

commit 4024f91a665d83b6de8658d45ec9d004c5d90c79
Author: Henri Vasserman <henv@hot.ee>
Date:   Wed Aug 9 01:56:44 2023 +0300

    Add intrinsics polyfills for AMD

    ---------

    Co-authored-by: ardfork <134447697+ardfork@users.noreply.github.com>
    Co-authored-by: funnbot <22226942+funnbot@users.noreply.github.com>
    Co-authored-by: Engininja2 <139037756+Engininja2@users.noreply.github.com>

commit ab6212864ce8e9af200bcedb3e0126ee49aa8d0a
Merge: d91456a f5bfea0
Author: Henri Vasserman <henv@hot.ee>
Date:   Wed Aug 9 00:37:01 2023 +0300

    Merge 'origin/master' into hipblas

commit 926d90fbabe836d16a5326eb99bdcb89ca0fc042
Merge: 793cfd1 f5bfea0
Author: Concedo <39025047+LostRuins@users.noreply.github.com>
Date:   Wed Aug 9 01:09:04 2023 +0800

    Merge branch 'master' into concedo_experimental

    # Conflicts:
    #	Makefile

commit 793cfd136cc721884f79d09036b748e4f176cdb4
Author: Concedo <39025047+LostRuins@users.noreply.github.com>
Date:   Wed Aug 9 01:05:00 2023 +0800

    fixed 70B detection again, try fix horde issues, fixed lite unicode issue, fixed cmake for cuda

commit f5bfea0580e417f99850d5456ca541d871a3e48c
Author: Martin Krasser <krasserm@googlemail.com>
Date:   Tue Aug 8 15:29:19 2023 +0200

    Allow passing grammar to completion endpoint (#2532)

    * Allow passing grammar to completion endpoint

commit acfc5478ff3446ca3b54553967a3dea09b7c771a
Author: Johannes Gäßler <johannesg@5d6.de>
Date:   Tue Aug 8 14:38:16 2023 +0200

    CUDA: tighter VRAM scratch size for 65b/70b (#2551)

commit 7ed8d1fe7f8cbe6a6763e6b46759795ac8d21e12
Author: chaihahaha <chai836275709@gmail.com>
Date:   Tue Aug 8 20:07:02 2023 +0800

    llm.vim : multiline autocompletion, get rid of "^@" (#2543)

commit e7f94d6fdc83b41ba449b4b8c80821673dd12ffc
Author: Georgi Gerganov <ggerganov@gmail.com>
Date:   Tue Aug 8 15:05:30 2023 +0300

    vim : bring back simple llm.vim example

commit 2d7baaf50f3277e65cf71071f61ea34823d14c30
Author: AustinMroz <austinmroz@utexas.edu>
Date:   Tue Aug 8 06:44:48 2023 -0500

    vim : streaming and more (#2495)

    * Update Vim plugin

    * Remove getbufoneline usage, Add input bind example.

    getbufoneline() appears to be a recently added function and has been
    replaced with getbufline for compatibility.

    An additional example that explains how to add a keybind that works in
    insert mode was added.

commit f3c3b4b1672d860800639c87d3b5d17564692469
Author: klosax <131523366+klosax@users.noreply.github.com>
Date:   Mon Aug 7 19:07:19 2023 +0200

    Add --rope-scale parameter (#2544)

    * common.cpp : Add --rope-scale parameter
    * README.md : Add info about using linear rope scaling
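
A minimal sketch of what linear RoPE scaling does, assuming the scale factor acts as the inverse of a frequency scale applied to token positions. The function name, the head dimension, and the default base of 10000 are illustrative assumptions, not taken from this commit.

```python
from typing import List


def rope_angles(pos: int, head_dim: int, rope_scale: float = 1.0,
                freq_base: float = 10000.0) -> List[float]:
    """Rotation angles for one position, with linear position scaling."""
    freq_scale = 1.0 / rope_scale  # assumed relation between the two factors
    return [
        pos * freq_scale * freq_base ** (-2.0 * i / head_dim)
        for i in range(head_dim // 2)
    ]


# With a scale of 2, position 4096 is rotated exactly like position 2048
# without scaling, so a longer context is squeezed into the trained range.
print(rope_angles(4096, 8, rope_scale=2.0) == rope_angles(2048, 8))  # True
```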

commit 3554080502cb050ccc3ae11d7a67df866ac3bd07
Author: Concedo <39025047+LostRuins@users.noreply.github.com>
Date:   Tue Aug 8 00:41:02 2023 +0800

    fixed blasbatchmul multiplier

commit 28ad80b6e4d38dde9e395fc5d4ebf19dc4aa4b66
Merge: 3c7d938 93356bd
Author: Concedo <39025047+LostRuins@users.noreply.github.com>
Date:   Tue Aug 8 00:34:10 2023 +0800

    Merge branch 'master' into concedo_experimental

commit 3c7d938d95fd51780be37f10cdddb2f26a770adf
Author: Concedo <39025047+LostRuins@users.noreply.github.com>
Date:   Tue Aug 8 00:32:51 2023 +0800

    update lite, resize scratch buffers for blasbatch 2048

commit 93356bdb7a324a8f6570f99d02af392cd4c45796
Author: Georgi Gerganov <ggerganov@gmail.com>
Date:   Mon Aug 7 14:25:58 2023 +0300

    ggml : mul mat tweaks (#2372)

    * ggml : mul mat wip

    ggml-ci

    * ggml : alternative thread distribution for mul_mat

    ggml-ci

    * ggml : mul_mat block tiling attempt

    * ggml : mul_mat threads yield

    ggml-ci

commit 60baff7c8584ec369e53469cad5f92e102b1efe4
Author: Georgi Gerganov <ggerganov@gmail.com>
Date:   Mon Aug 7 14:24:42 2023 +0300

    ggml : pad result of ggml_nbytes()

commit 9082b5dfbfae01243a0b822dcd2812877e63bf1b
Author: Georgi Gerganov <ggerganov@gmail.com>
Date:   Mon Aug 7 13:55:18 2023 +0300

    ggml : change params pointer (style change) (#2539)

    ggml-ci

commit 99d29c0094476c4962023036ecd61a3309d0e16b
Author: Georgi Gerganov <ggerganov@gmail.com>
Date:   Mon Aug 7 13:20:09 2023 +0300

    ggml : sync (custom ops) (#2537)

    ggml-ci

commit 9133e456d2d52b05c6c7f92cd94a0d2564ddb2f7
Merge: cae6a84 3d9a551
Author: Concedo <39025047+LostRuins@users.noreply.github.com>
Date:   Mon Aug 7 17:33:42 2023 +0800

    Merge branch 'master' into concedo_experimental

    # Conflicts:
    #	Makefile
    #	build.zig

commit cae6a847ada88e415b0beda09d70d79b51762618
Author: Concedo <39025047+LostRuins@users.noreply.github.com>
Date:   Mon Aug 7 16:40:13 2023 +0800

    cuda free only for non mmq (+2 squashed commit)

    Squashed commit:

    [3aca763a] only cuda free for non mmq

    [e69a8c9f] revert to pool alloc to try again

commit 3d9a55181603e85a26378a850a14068034e5002d
Author: Johannes Gäßler <johannesg@5d6.de>
Date:   Mon Aug 7 10:09:40 2023 +0200

    Fixed mmap prefetch for GPU offloading (#2529)

commit f6f9896ac3d2ff207e18f87dab85d126ceef5236
Author: Georgi Gerganov <ggerganov@gmail.com>
Date:   Mon Aug 7 10:52:57 2023 +0300

    metal : fix out-of-bounds access + inc concurrency nodes (#2416)

    * metal : fix out-of-bounds access + style changes

    * metal : increase concurrency nodes to 2*GGML_MAX_NODES

commit 9f16a4c4efc5cca845e027c1dbad615612b9248c
Author: Concedo <39025047+LostRuins@users.noreply.github.com>
Date:   Mon Aug 7 15:16:37 2023 +0800

    switch to upstream implementation of pool malloc

commit 34a14b28ff7f3c98730339bacee035091b2a812a
Author: GiviMAD <GiviMAD@users.noreply.github.com>
Date:   Sun Aug 6 23:21:46 2023 -0700

    [Makefile] Move ARM CFLAGS before compilation (#2536)

commit 7297128db8159c7b12db4c28a4532b993025c2e5
Author: Henri Vasserman <henv@hot.ee>
Date:   Mon Aug 7 08:35:53 2023 +0300

    [Zig] Rewrite build for Zig 0.11 (#2514)

    * zig build fixes

    * Disable LTO on Windows.

commit 6659652c9fd1853dcb2d1882efc8f14b159d5d43
Author: Concedo <39025047+LostRuins@users.noreply.github.com>
Date:   Mon Aug 7 11:05:06 2023 +0800

    lower actual temp used when temp=0

commit 0e41b94f40e1d10893d6ac29c727482573ef1652
Author: Concedo <39025047+LostRuins@users.noreply.github.com>
Date:   Mon Aug 7 10:43:06 2023 +0800

    improve detection for 70B.

commit fb44d72a78a81790d238ffd2453cf66d02eed688
Merge: 559c0e2 d9024df
Author: Concedo <39025047+LostRuins@users.noreply.github.com>
Date:   Mon Aug 7 10:17:43 2023 +0800

    Merge remote-tracking branch 'johannes/cuda-fix-mmap-prefetch' into concedo_experimental

commit 559c0e2d1f621402d410944b5291da647243ab33
Author: Concedo <39025047+LostRuins@users.noreply.github.com>
Date:   Mon Aug 7 10:15:20 2023 +0800

    updated lite again, fix for wi

commit d9024df759b25d030fc8266d399c565fe7be9a04
Author: JohannesGaessler <johannesg@5d6.de>
Date:   Sun Aug 6 10:18:05 2023 +0200

    Fixed mmap prefetch for GPU offloading

commit d442888626f11335e0c9e3b8555d2429b3262580
Merge: 198cc82 86c3219
Author: Concedo <39025047+LostRuins@users.noreply.github.com>
Date:   Sun Aug 6 22:47:33 2023 +0800

    Merge branch 'master' into concedo_experimental

    # Conflicts:
    #	Makefile

commit 198cc826fcb9…
YellowRoseCx added a commit to YellowRoseCx/koboldcpp-rocm that referenced this pull request Aug 25, 2023
commit 3416c98
Merge: 5eb17f0 4c4e435
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Fri Aug 25 13:46:56 2023 -0500

    Merge remote-tracking branch 'upstream/concedo'

commit 5eb17f0
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Fri Aug 25 13:38:21 2023 -0500

    ROCm Port update

    * use hipblas based on cublas
    * Update Makefile for the Cuda kernels
    * Expand arch list and make it overrideable
    * Fix multi GPU on multiple amd architectures with rocblas_initialize() (#5)
    * add hipBLAS to README
    * new build arg LLAMA_CUDA_MMQ_Y
    * fix half2 decomposition
    * Add intrinsics polyfills for AMD
    * AMD assembly optimized __dp4a
    * Allow overriding CC_TURING
    * use "ROCm" instead of "CUDA"
    * ignore all build dirs
    * Add Dockerfiles
    * fix llama-bench
    * fix -nommq help for non CUDA/HIP

    ---------

    Co-Authored-By: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
    Co-Authored-By: ardfork <134447697+ardfork@users.noreply.github.com>
    Co-Authored-By: funnbot <22226942+funnbot@users.noreply.github.com>
    Co-Authored-By: Engininja2 <139037756+Engininja2@users.noreply.github.com>
    Co-Authored-By: Kerfuffle <44031344+KerfuffleV2@users.noreply.github.com>
    Co-Authored-By: jammm <2500920+jammm@users.noreply.github.com>
    Co-Authored-By: jdecourval <7315817+jdecourval@users.noreply.github.com>

commit b34f4bd
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Sat Aug 19 17:12:52 2023 -0500

    Update README.md

commit 7d11961
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Mon Aug 14 23:03:12 2023 -0500

    remove force DMMV

commit cd61aa0
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Sat Aug 12 17:24:31 2023 -0500

    restore main_gpu parameter

commit 4a042f3
Author: Henri Vasserman <henv@hot.ee>
Date:   Sat Aug 12 10:51:46 2023 +0300

    gfx1100 support

    ---------

    Co-authored-by: ardfork <134447697+ardfork@users.noreply.github.com>
    Co-authored-by: jammm <2500920+jammm@users.noreply.github.com>
    Co-authored-by: jdecourval <7315817+jdecourval@users.noreply.github.com>

commit 8913bc6
Author: Henri Vasserman <henv@hot.ee>
Date:   Fri Aug 11 10:16:02 2023 +0300

    Allow overriding CC_TURING

commit e77a4c3
Author: Henri Vasserman <henv@hot.ee>
Date:   Fri Aug 11 10:00:07 2023 +0300

    Merge 'origin/master' into hipblas

commit cc4c4e3
Author: Engininja2 <139037756+Engininja2@users.noreply.github.com>
Date:   Fri Aug 11 09:43:14 2023 +0300

    New __dp4a assembly

    Now compatible with gfx900 and faster as well.

commit 1a03b70
Author: Henri Vasserman <henv@hot.ee>
Date:   Fri Aug 11 09:30:28 2023 +0300

    Undo mess

    ---------

    Co-authored-by: ardfork <134447697+ardfork@users.noreply.github.com>

commit 4366ff9
Author: DannyDaemonic <DannyDaemonic@gmail.com>
Date:   Thu Aug 10 13:11:36 2023 -0700

    Handle `ENABLE_VIRTUAL_TERMINAL_PROCESSING` more gracefully on earlier versions of Windows.

commit 811ff85
Author: Christian Demsar <crasm@git.vczf.us>
Date:   Thu Aug 10 10:28:27 2023 -0400

    Add --n-predict -2 for stopping generation on full context (ggerganov#2565)

commit 37c9717
Author: Martin Krasser <krasserm@googlemail.com>
Date:   Thu Aug 10 12:16:38 2023 +0200

    Fix grammar-based sampling issue in server (ggerganov#2566)

commit d18ecd5
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Thu Aug 10 13:19:41 2023 -0500

    make mmq gen faster for amd

commit 243894a
Author: Henri Vasserman <henv@hot.ee>
Date:   Thu Aug 10 12:14:40 2023 +0300

    ws fix

commit ac2f14d
Author: Engininja2 <139037756+Engininja2@users.noreply.github.com>
Date:   Thu Aug 10 12:11:27 2023 +0300

    AMD assembly optimized __dp4a

    Doesn't seem to work for gfx900, so commented out.

commit 9dba0c9
Author: Henri Vasserman <henv@hot.ee>
Date:   Thu Aug 10 12:09:28 2023 +0300

    Fix merge

    ---------

    Co-authored-by: ardfork <134447697+ardfork@users.noreply.github.com>
    Co-authored-by: Kerfuffle <44031344+KerfuffleV2@users.noreply.github.com>

commit f570b5c
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Wed Aug 9 22:11:20 2023 -0500

    Revert "revert cuda changes as they are buggy"

    This reverts commit 1541bf8.

commit 1541bf8
Author: Concedo <39025047+LostRuins@users.noreply.github.com>
Date:   Wed Aug 9 22:36:41 2023 +0800

    revert cuda changes as they are buggy

commit bacc202
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Wed Aug 9 20:37:17 2023 -0500

    Merge remote-tracking branch 'upstream/concedo'

commit b7cb4cf
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Wed Aug 9 20:00:52 2023 -0500

    additional fixes

commit fadae72
Merge: 518eb2a 8f8ab6c
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Wed Aug 9 18:45:50 2023 -0500

    Merge branch 'hipblas' into develop4Main

commit 518eb2a
Merge: bda0215 cae6a84
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Wed Aug 9 18:32:10 2023 -0500

    Merge remote-tracking branch 'upstream/concedo' into develop2Main

commit bda0215
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Wed Aug 9 18:17:54 2023 -0500

    update makefile to multisystem path

commit 8f8ab6c
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Wed Aug 9 18:05:03 2023 -0500

    hipLDFLAG Path change Unix to multisystem in Makefile

    Changed the hardcoded Linux hipBLAS linker path (-L/opt/rocm/lib) to use the defined ROCM_PATH variable, so the Makefile also works with ROCm on non-Linux systems.

commit 610ba4c
Merge: 4024f91 25d43e0
Author: Henri Vasserman <henv@hot.ee>
Date:   Wed Aug 9 23:54:58 2023 +0300

    Merge 'origin/master' into hipblas

commit 4024f91
Author: Henri Vasserman <henv@hot.ee>
Date:   Wed Aug 9 01:56:44 2023 +0300

    Add intrinsics polyfills for AMD

    ---------

    Co-authored-by: ardfork <134447697+ardfork@users.noreply.github.com>
    Co-authored-by: funnbot <22226942+funnbot@users.noreply.github.com>
    Co-authored-by: Engininja2 <139037756+Engininja2@users.noreply.github.com>

commit ab62128
Merge: d91456a f5bfea0
Author: Henri Vasserman <henv@hot.ee>
Date:   Wed Aug 9 00:37:01 2023 +0300

    Merge 'origin/master' into hipblas

commit ee9fa2a
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Wed Aug 2 01:53:58 2023 -0500

    Update Makefile

commit d91456a
Author: ardfork <134447697+ardfork@users.noreply.github.com>
Date:   Mon Jul 31 20:35:00 2023 +0300

    fix half2 decomposition

commit c1cb70d
Author: Henri Vasserman <henv@hot.ee>
Date:   Mon Jul 31 19:56:44 2023 +0300

    new build arg LLAMA_CUDA_MMQ_Y

commit c1664a0
Merge: 4336231 0728c5a
Author: Henri Vasserman <henv@hot.ee>
Date:   Mon Jul 31 19:32:27 2023 +0300

    Merge 'origin/master' into hipblas

commit 848558d
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Sun Jul 30 20:02:52 2023 -0500

    import vars logic fix

commit b650b84
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Sun Jul 30 00:21:36 2023 -0500

    Update easy_KCPP-ROCm_install.sh

commit 8573a67
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Sat Jul 29 21:31:12 2023 -0500

    remove duplicate code and fix typo

    remove duplicate tooltip

commit 430986e
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Sat Jul 29 21:07:34 2023 -0500

    hide "missing" if all are built

    Move the tooltip functions to the helper functions section. Hide the "Missing: ..." string when all backends are available.
    " if len(runopts)==6 else + "

commit dd0db72
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Sat Jul 29 20:52:31 2023 -0500

    hide "missing" if all are built

    Move the tooltip functions to the helper functions section. Hide the "Missing: ..." string when all backends are available.

commit 43fffb6
Merge: 0ed65a4 b40550c
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Sat Jul 29 19:13:15 2023 -0500

    Merge branch 'concedo'

commit 0ed65a4
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Sat Jul 29 18:34:21 2023 -0500

    Hide unavailable backends & Add tooltip over backend count

    Hides unavailable backends from the user. If the program is launched without any backends built, it shows an error message stating that no backends were found and that they must be built with the 'make' command.

    Add tooltip when hovering over backend count label

    Hovering over the new backend count label explains what the numbers mean and shows which backends are unavailable or not built.

commit 2a26398
Merge: cee2e9d 31486eb
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Sat Jul 29 15:16:33 2023 -0500

    Merge remote-tracking branch 'upstream/concedo'

commit 4336231
Author: Henri Vasserman <henv@hot.ee>
Date:   Sat Jul 29 18:35:56 2023 +0300

    add hipBLAS to README

    ---------

    Co-authored-by: ardfork <134447697+ardfork@users.noreply.github.com>

commit f8e3fc6
Author: Henri Vasserman <henv@hot.ee>
Date:   Sat Jul 29 14:16:46 2023 +0300

    rocblas init stuff

commit d2ade63
Merge: cde52d6 8a88e58
Author: Henri Vasserman <henv@hot.ee>
Date:   Sat Jul 29 12:59:48 2023 +0300

    Merge 'origin/master' into hipblas

commit cee2e9d
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Wed Jul 26 23:36:55 2023 -0500

    Only Show Available Backends in GUI

    Hides unavailable backends from the user. If the program is launched without any backends built, it shows an error message stating that no backends were found and that they must be built with the 'make' command.

commit 7863610
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Wed Jul 26 13:27:22 2023 -0500

    Update easy_KCPP-ROCm_install.sh

commit 731cd6e
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Tue Jul 25 22:39:50 2023 -0500

    Create easy_rocm_install.sh

commit f154685
Merge: cbdc1f3 94e0a06
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Tue Jul 25 22:25:10 2023 -0500

    Merge branch 'concedo_experimentalMAIN'

commit cbdc1f3
Merge: 5b838d4 9731682
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Mon Jul 24 16:53:21 2023 -0500

    Merge remote-tracking branch 'upstream/concedo'

commit cde52d6
Merge: 8e8054a 84e09a7
Author: Henri Vasserman <henv@hot.ee>
Date:   Mon Jul 24 12:22:58 2023 +0300

    Merge 'origin/master' into hipblas

commit 8e8054a
Author: Henri Vasserman <henv@hot.ee>
Date:   Mon Jul 24 12:20:49 2023 +0300

    Add rocblas to build files

commit 1f6294d
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Mon Jul 24 03:52:01 2023 -0500

    Fix multi GPU on multiple amd architectures with rocblas_initialize() (#5)

    * initialize rocblas

commit 5b838d4
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Mon Jul 24 03:10:35 2023 -0500

    amd multigpu full layer offload w/o vram scratch

commit 9bfb2fd
Merge: b379f9d 66328fc
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Mon Jul 24 03:07:44 2023 -0500

    Merge branch 'concedo_experimental'

commit b379f9d
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Mon Jul 24 03:07:00 2023 -0500

    Revert "amd multigpu full layer offload w/o vram scratch"

    This reverts commit 9adfc8e.

commit 9adfc8e
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Mon Jul 24 02:56:40 2023 -0500

    amd multigpu full layer offload w/o vram scratch

commit 05c792e
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Mon Jul 24 00:18:48 2023 -0500

    initialize rocblas

commit ade68d0
Merge: 521ad6b 56995ca
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Sun Jul 23 20:25:05 2023 -0500

    Merge remote-tracking branch 'upstream/concedo'

commit 521ad6b
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Thu Jul 20 21:42:33 2023 -0500

    lazy import_var error handling for saves

commit 9553e52
Merge: cac6650 f036109
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Thu Jul 20 19:59:41 2023 -0500

    Merge remote-tracking branch 'upstream/concedo'

commit cac6650
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Mon Jul 17 23:05:02 2023 -0500

    Makefile fix! Allows hip/clblast build together

commit 3db70b5
Merge: 2ec4466 7568d1a
Author: Henri Vasserman <henv@hot.ee>
Date:   Tue Jul 18 01:54:17 2023 +0300

    Merge 'origin/master' into hipblas

commit f208670
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Fri Jul 14 02:56:03 2023 -0500

    improve error handling with gpu names

commit 860e738
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Fri Jul 14 00:33:03 2023 -0500

    Show GPU names in GUI, Only show GPUs that exist

    Replaced the GPU selector's preset 1,2,3 and 1,2,3,all options with a function that reads the GPU names and uses them as the values of the selector boxes.

commit 2ec4466
Author: Henri Vasserman <henv@hot.ee>
Date:   Thu Jul 13 13:44:02 2023 +0300

    Update build flags.

    GGML_CUDA_DMMV_Y is now GGML_CUDA_MMV_Y
    so update your build instructions.

    GGML_CUDA_FORCE_DMMV is always enabled.

    ---------

    Co-authored-by: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>

commit cd36b18
Merge: afcb8fe 1cbf561
Author: Henri Vasserman <henv@hot.ee>
Date:   Thu Jul 13 13:03:01 2023 +0300

    Merge 'origin/master' into hipblas

commit ac7ebc3
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Wed Jul 12 18:32:18 2023 -0500

    add hipBLAS name scheme to GUI and update README

commit 7f85cc5
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Wed Jul 12 17:35:54 2023 -0500

    update makefile and ggml.c

commit 6ca3499
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Wed Jul 12 15:43:45 2023 -0500

    ggml.c fix

commit 770e674
Merge: 2b289cd 5941514
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Wed Jul 12 15:24:36 2023 -0500

    Merge remote-tracking branch 'upstream/concedo'

commit 2b289cd
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Wed Jul 12 14:30:00 2023 -0500

    Update c-cpp.yml

commit 5dae95a
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Wed Jul 12 14:28:51 2023 -0500

    Update c-cpp.yml

commit b37cd73
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Wed Jul 12 14:27:04 2023 -0500

    Create c-cpp.yml to test Actions

commit afcb8fe
Author: Henri Vasserman <henv@hot.ee>
Date:   Tue Jul 11 18:09:27 2023 +0300

    Add new config option

commit 8c2c497
Merge: e610466 2347463
Author: Henri Vasserman <henv@hot.ee>
Date:   Tue Jul 11 17:53:54 2023 +0300

    Merge 'origin/master' into hipblas

commit e610466
Author: Henri Vasserman <henv@hot.ee>
Date:   Tue Jul 11 17:53:14 2023 +0300

    Expand arch list and make it overrideable

commit 80e4e54
Merge: 7735c5a 1d16309
Author: Henri Vasserman <henv@hot.ee>
Date:   Mon Jul 10 02:09:28 2023 +0300

    Merge 'origin/master' into hipblas

commit 8432e9d
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Sun Jul 9 16:55:30 2023 -0500

    Update Makefile

commit b58c189
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Sun Jul 9 16:20:00 2023 -0500

    Add multi-gpu CuBLAS support to new GUI

commit 0c1c71b
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Sat Jul 8 07:56:57 2023 -0500

    Update Makefile

commit f864f60
Author: Johannes Gäßler <johannesg@5d6.de>
Date:   Sat Jul 8 00:25:15 2023 +0200

    CUDA: add __restrict__ to mul mat vec kernels (ggerganov#2140)

commit 4539bc2
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Sat Jul 8 01:36:14 2023 -0500

    update makefile for changes

commit 912e31e
Merge: 74e2703 ddaa4f2
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Fri Jul 7 23:15:37 2023 -0500

    Merge remote-tracking branch 'upstream/concedo'

commit 74e2703
Merge: cf65429 f9108ba
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Wed Jul 5 15:16:49 2023 -0500

    Merge branch 'LostRuins:concedo' into main

commit 7735c5a
Merge: c3e3733 7ee76e4
Author: Henri Vasserman <henv@hot.ee>
Date:   Tue Jul 4 17:09:16 2023 +0300

    Merge 'origin/master' into hipblas

commit cf65429
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Mon Jul 3 16:56:40 2023 -0500

    print cuda or opencl based on what's used

commit 72c16d2
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Mon Jul 3 16:45:39 2023 -0500

    Revert "fix my mistake that broke other arches"

    This reverts commit 777aed5.

commit 777aed5
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Mon Jul 3 15:53:32 2023 -0500

    fix my mistake that broke other arches

commit 27780a9
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Sun Jul 2 16:03:27 2023 -0500

    rocm fixes

commit f52c7d4
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Sun Jul 2 16:02:58 2023 -0500

    Revert "rocm fixes"

    This reverts commit 2fe9927.

commit 2fe9927
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Sun Jul 2 15:58:21 2023 -0500

    rocm fixes

commit efe7560
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Sun Jul 2 15:55:43 2023 -0500

    Revert "move HIPBLAS definitions into ggml-cuda.h"

    This reverts commit bf49a93.

commit 4fc0181
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Sun Jul 2 15:55:36 2023 -0500

    Revert "move hipblas definitions to header files"

    This reverts commit 2741ffb.

commit 89eb576
Merge: 2741ffb 3d2907d
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Sun Jul 2 14:44:13 2023 -0500

    Merge branch 'LostRuins:concedo' into main

commit c3e3733
Author: Henri Vasserman <henv@hot.ee>
Date:   Sun Jul 2 15:51:31 2023 +0300

    ROCm fixes

commit 15db19a
Merge: 04419f1 46088f7
Author: Henri Vasserman <henv@hot.ee>
Date:   Sun Jul 2 15:39:57 2023 +0300

    Merge 'origin/master' into hipblas

commit 2741ffb
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Sat Jul 1 17:07:42 2023 -0500

    move hipblas definitions to header files

commit bf49a93
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Sat Jul 1 16:38:50 2023 -0500

    move HIPBLAS definitions into ggml-cuda.h

commit 540f4e0
Merge: 2c3b46f eda663f
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Sat Jul 1 14:58:32 2023 -0500

    Merge remote-tracking branch 'upstream/concedo'

commit 2c3b46f
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Thu Jun 29 18:43:43 2023 -0500

    changes to fix build

commit c9e1103
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Thu Jun 29 18:20:07 2023 -0500

    Update ggml_v2-cuda-legacy.cu for ROCM

commit b858fc5
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Thu Jun 29 17:49:39 2023 -0500

    changes to work with upstream

commit 69a0c25
Merge: 096f0b0 1347d3a
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Thu Jun 29 16:59:06 2023 -0500

    Merge remote-tracking branch 'upstream/concedo'

commit 04419f1
Merge: bb16eff d3494bb
Author: Henri Vasserman <henv@hot.ee>
Date:   Wed Jun 28 23:30:10 2023 +0300

    Merge 'origin/master' into hipblas

commit bb16eff
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Wed Jun 28 15:27:10 2023 -0500

    headers fix; add kquants_iter for hipblas and add gfx803 (#1)

    * kquants_iter for hipblas and add gfx803
    * Update CMakeLists.txt with hipblas kquants_iter and DMMV_F16
    * remove dmmv_f16 for now

commit 096f0b0
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Wed Jun 28 15:27:02 2023 -0500

    revert unnecessary hipblas conditionals

commit d81e81a
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Wed Jun 28 14:48:23 2023 -0500

    Update Makefile hipblas nvcc correction

commit c8ae945
Merge: c1e5c83 0be54f7
Author: Henri Vasserman <henv@hot.ee>
Date:   Tue Jun 27 10:50:37 2023 +0300

    Merge 'origin/master' into hipblas

commit 2579ecf
Merge: abed427 d2034ce
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Sun Jun 25 17:50:04 2023 -0500

    Merge branch 'LostRuins:concedo' into main

commit c1e5c83
Merge: 35a6031 447ccbe
Author: Henri Vasserman <henv@hot.ee>
Date:   Sun Jun 25 21:40:05 2023 +0300

    Merge 'origin/master' into hipblas

commit 35a6031
Merge: df7346c 66a2555
Author: Henri Vasserman <henv@hot.ee>
Date:   Sun Jun 25 10:57:48 2023 +0300

    Merge 'origin/master' into hipblas

commit abed427
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Sat Jun 24 19:16:30 2023 -0500

    reorganize If statements to include proper headers

commit 06c3bf0
Merge: ea6d320 8342fe8
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Sat Jun 24 16:57:20 2023 -0500

    Merge branch 'LostRuins:concedo' into main

commit ea6d320
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Fri Jun 23 01:53:28 2023 -0500

    Update README.md

commit 4d56ad8
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Thu Jun 22 16:19:43 2023 -0500

    Update README.md

commit 21f9308
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Thu Jun 22 15:42:05 2023 -0500

    kquants_iter for hipblas and add gfx803

commit df7346c
Merge: 5dd2fbe 7487137
Author: Henri Vasserman <henv@hot.ee>
Date:   Thu Jun 22 20:51:09 2023 +0300

    Merge 'origin/master' into hipblas

commit b6ff890
Merge: eb094f0 e6ddb15
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Thu Jun 22 12:42:09 2023 -0500

    Merge branch 'LostRuins:concedo' into main

commit eb094f0
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Wed Jun 21 23:59:18 2023 -0500

    lowvram parameter description

commit 3a5dfeb
Merge: 665cc11 b1f00fa
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Wed Jun 21 16:53:03 2023 -0500

    Merge branch 'LostRuins:concedo' into koboldcpp-rocm

commit 665cc11
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Wed Jun 21 01:13:19 2023 -0500

    add lowvram parameter

commit 222cbbb
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Tue Jun 20 19:03:28 2023 -0500

    add additional hipblas conditions for cublas

commit e1f9581
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Tue Jun 20 16:51:59 2023 -0500

    Add hip def for cuda v2

commit 3bff5c0
Merge: a7e74b3 266d47a
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Tue Jun 20 13:38:06 2023 -0500

    Merge branch 'LostRuins:concedo' into koboldcpp-rocm

commit a7e74b3
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Mon Jun 19 22:04:18 2023 -0500

    Update README.md

commit 5e99b3c
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Mon Jun 19 22:03:42 2023 -0500

    Update Makefile

commit 9190b17
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Mon Jun 19 21:47:10 2023 -0500

    Update README.md

commit 5dd2fbe
Merge: 67e229b 20568fe
Author: Henri Vasserman <henv@hot.ee>
Date:   Tue Jun 20 01:23:12 2023 +0300

    Merge 'origin/master' into hipblas

commit 2780ea2
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Sun Jun 18 15:48:00 2023 -0500

    Update Makefile

commit 04a3e64
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Sun Jun 18 14:33:39 2023 -0500

    remove extra line

commit cccbca9
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Sun Jun 18 14:31:17 2023 -0500

    attempt adding ROCM hipblas

commit a44a1d4
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Sun Jun 18 14:31:01 2023 -0500

    attempt adding ROCM hipblas

commit b088184
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Sun Jun 18 14:30:54 2023 -0500

    attempt adding ROCM hipblas

commit 67e229b
Merge: 6f7c156 b241649
Author: Henri Vasserman <henv@hot.ee>
Date:   Sun Jun 18 00:36:54 2023 +0300

    Merge 'origin/master' into hipblas

commit 6f7c156
Merge: 61df8e9 fc45a81
Author: Henri Vasserman <henv@hot.ee>
Date:   Sat Jun 17 16:53:22 2023 +0300

    Merge 'origin/master' into hipblas

commit 61df8e9
Author: Henri Vasserman <henv@hot.ee>
Date:   Wed Jun 14 22:46:10 2023 +0300

    add cudaMemset

commit a836529
Merge: 85f902d 254a7a7
Author: Henri Vasserman <henv@hot.ee>
Date:   Wed Jun 14 22:41:55 2023 +0300

    Merge 'origin/master' into hipblas

commit 85f902d
Merge: 4362e80 b50b570
Author: Henri Vasserman <henv@hot.ee>
Date:   Thu Jun 8 10:50:28 2023 +0300

    Merge 'origin/master' into hipblas

commit 4362e80
Merge: fa5b3d7 17366df
Author: Henri Vasserman <henv@hot.ee>
Date:   Tue Jun 6 23:14:40 2023 +0300

    Merge 'origin/master' into hipblas

commit fa5b3d7
Author: Henri Vasserman <henv@hot.ee>
Date:   Tue Jun 6 18:47:00 2023 +0300

    fix makefile.

commit 1ba4ce4
Author: Henri Vasserman <henv@hot.ee>
Date:   Tue Jun 6 18:41:08 2023 +0300

    Revert "warp size fixes"

    It seems like 32 is faster, for me at least, and it won't cause so many conflicts.

    This reverts commit 5d6eb72.

commit 5d6eb72
Author: Henri Vasserman <henv@hot.ee>
Date:   Tue Jun 6 18:32:41 2023 +0300

    warp size fixes

commit 33091a9
Merge: 9fdaa1d 2d43387
Author: Henri Vasserman <henv@hot.ee>
Date:   Tue Jun 6 16:19:23 2023 +0300

    Merge 'origin/master' into hipblas

commit 9fdaa1d
Author: Henri Vasserman <henv@hot.ee>
Date:   Sat May 27 19:17:53 2023 +0300

    Add more defs

    For forward compatibility ggerganov#1607

commit a4648c1
Merge: 4c8b3fb 0ecb1bb
Author: Henri Vasserman <henv@hot.ee>
Date:   Sat May 27 18:22:39 2023 +0300

    Merge 'origin/master' into hipblas

commit 4c8b3fb
Author: Henri Vasserman <henv@hot.ee>
Date:   Fri May 26 01:08:53 2023 +0300

    add configurable vars

commit 30d921a
Author: Henri Vasserman <henv@hot.ee>
Date:   Fri May 26 01:03:56 2023 +0300

    and makefile

commit a593a4f
Author: Henri Vasserman <henv@hot.ee>
Date:   Fri May 26 00:55:28 2023 +0300

    Add missing parameters

commit 174bf6a
Merge: f80ce7a 1fcdcc2
Author: Henri Vasserman <henv@hot.ee>
Date:   Fri May 26 00:44:23 2023 +0300

    Merge 'origin/master' into hipblas

commit f80ce7a
Merge: 600ace3 ac7876a
Author: Henri Vasserman <henv@hot.ee>
Date:   Thu May 25 00:02:50 2023 +0300

    Merge branch 'origin/master' into hipblas

commit 600ace3
Author: Henri Vasserman <henv@hot.ee>
Date:   Sat May 20 23:42:20 2023 +0300

    update warp size

commit b19fefe
Author: Henri Vasserman <henv@hot.ee>
Date:   Sat May 20 23:28:08 2023 +0300

    Forwardcompat

commit c66115b
Merge: a0b2d5f b8ee340
Author: Henri Vasserman <henv@hot.ee>
Date:   Sat May 20 18:29:31 2023 +0300

    Merge 'origin/master' into hipblas

commit a0b2d5f
Merge: 8bab456 2a5ee02
Author: Henri Vasserman <henv@hot.ee>
Date:   Tue May 16 17:08:29 2023 +0300

    Merge 'origin/master' into hipblas

commit 8bab456
Merge: 2956630 b5c9295
Author: Henri Vasserman <henv@hot.ee>
Date:   Mon May 15 00:01:12 2023 +0300

    Merge 'origin/master' into hipblas

commit 2956630
Merge: 0fe6384 f048af0
Author: Henri Vasserman <henv@hot.ee>
Date:   Sat May 13 13:12:52 2023 +0300

    Merge 'origin/master' into hipblas

commit 0fe6384
Author: Henri Vasserman <henv@hot.ee>
Date:   Fri May 12 17:22:11 2023 +0300

    fix makefile

commit 605560d
Merge: 127f68e 089b1c9
Author: Henri Vasserman <henv@hot.ee>
Date:   Fri May 12 16:12:53 2023 +0300

    Merge 'origin/master' into hipblas

commit 127f68e
Merge: 070cbcc b608b55
Author: Henri Vasserman <henv@hot.ee>
Date:   Thu May 11 20:21:27 2023 +0300

    Merge 'origin/master' into hipblas

commit 070cbcc
Author: Henri Vasserman <henv@hot.ee>
Date:   Sun May 7 18:10:56 2023 +0300

    occupancy function

commit a3296d5
Merge: 0aefa6a e129551
Author: Henri Vasserman <henv@hot.ee>
Date:   Sun May 7 18:06:04 2023 +0300

    Merge 'origin/master' into hipblas

commit 0aefa6a
Merge: baeb482 1b0fd45
Author: Henri Vasserman <henv@hot.ee>
Date:   Sun May 7 12:24:41 2023 +0300

    Merge 'origin/master' into hipblas

commit baeb482
Author: Henri Vasserman <henv@hot.ee>
Date:   Sun May 7 12:24:12 2023 +0300

    Revert to default copy

commit 289073a
Merge: 1107194 173d0e6
Author: Henri Vasserman <henv@hot.ee>
Date:   Sat May 6 19:59:41 2023 +0300

    Merge 'origin/master' into hipblas

commit 1107194
Merge: 04c0d48 a3b85b2
Author: Henri Vasserman <henv@hot.ee>
Date:   Sat May 6 00:38:20 2023 +0300

    Merge 'origin/master' into hipblas

commit 04c0d48
Author: Henri Vasserman <henv@hot.ee>
Date:   Thu May 4 12:31:16 2023 +0300

    Move all HIP stuff to ggml-cuda.cu

commit d83cfba
Merge: b67cc50 799fdc1
Author: Henri Vasserman <henv@hot.ee>
Date:   Thu May 4 11:31:16 2023 +0300

    Merge 'origin/master' into hipblas

commit b67cc50
Merge: fcbc262 e216aa0
Author: Henri Vasserman <henv@hot.ee>
Date:   Wed May 3 15:04:51 2023 +0300

    Merge 'origin/master' into hipblas

commit fcbc262
Merge: c73def1 f4cef87
Author: Henri Vasserman <henv@hot.ee>
Date:   Mon May 1 22:45:29 2023 +0300

    Merge 'origin/master' into hipblas

commit c73def1
Merge: d8ea75e f0d70f1
Author: Henri Vasserman <henv@hot.ee>
Date:   Sun Apr 30 18:40:42 2023 +0300

    Merge 'origin/master' into hipblas

commit d8ea75e
Merge: d194586 334637e
Author: Henri Vasserman <henv@hot.ee>
Date:   Sat Apr 29 11:25:51 2023 +0300

    Merge 'origin/master' into hipblas

commit d194586
Merge: 2ab9d11 7f15c5c
Author: Henri Vasserman <henv@hot.ee>
Date:   Fri Apr 28 23:03:52 2023 +0300

    Merge 'origin/master' into hipblas

commit 2ab9d11
Merge: 3b4a531 04aaae1
Author: Henri Vasserman <henv@hot.ee>
Date:   Fri Apr 28 16:30:05 2023 +0300

    Merge 'origin/master' into hipblas

commit 3b4a531
Merge: a1caa48 0b2da20
Author: Henri Vasserman <henv@hot.ee>
Date:   Fri Apr 28 10:08:41 2023 +0300

    Merge 'origin/master' into hipblas

commit a1caa48
Author: Henri Vasserman <henv@hot.ee>
Date:   Fri Apr 28 10:08:21 2023 +0300

    add more cuda defines

    This is so 'slaren/cuda-f16f32' would merge.

commit ecc0565
Author: Henri Vasserman <henv@hot.ee>
Date:   Fri Apr 28 01:58:27 2023 +0300

    only .cu file needs to be compiled as device

commit ef51e9e
Merge: d571d16 4afcc37
Author: Henri Vasserman <henv@hot.ee>
Date:   Wed Apr 26 12:46:26 2023 +0300

    Merge branch 'ggerganov:master' into hipblas

commit d571d16
Merge: 608aa33 dd0eabc
Author: Henri Vasserman <henv@hot.ee>
Date:   Tue Apr 25 21:15:33 2023 +0300

    Merge 'origin/master' into hipblas

commit 608aa33
Author: Henri Vasserman <henv@hot.ee>
Date:   Tue Apr 25 21:15:04 2023 +0300

    change default GPU arch to match CMake

commit 3a004b2
Author: Henri Vasserman <henv@hot.ee>
Date:   Mon Apr 24 02:24:54 2023 +0300

    add rpath

commit db7a012
Merge: 3677235 284685f
Author: Henri Vasserman <henv@hot.ee>
Date:   Sun Apr 23 21:49:28 2023 +0300

    Merge 'origin/master' into hipblas

commit 3677235
Author: Henri Vasserman <henv@hot.ee>
Date:   Sat Apr 22 23:28:00 2023 +0300

    More build file changes

commit d3e1984
Author: Henri Vasserman <henv@hot.ee>
Date:   Fri Apr 21 03:32:06 2023 +0300

    add rpath

commit 0e005f7
Author: Henri Vasserman <henv@hot.ee>
Date:   Fri Apr 21 02:13:00 2023 +0300

    Build file changes

    Now HIP Clang is not required, the CMake scripts will configure the
    needed compiler, which can be system clang++. Also other code can
    still use GCC, but CMake will force the clang to link.

commit 54a63c1
Author: Henri Vasserman <henv@hot.ee>
Date:   Thu Apr 20 22:19:22 2023 +0300

    Update Makefile for the Cuda kernels

commit 0fd8363
Author: Henri Vasserman <henv@hot.ee>
Date:   Thu Apr 20 02:04:00 2023 +0300

    use hipblas based on cublas
LostRuins added a commit to LostRuins/koboldcpp that referenced this pull request Aug 28, 2023
* koboldcpp-ROCm Port

commit 3416c98
Merge: 5eb17f0 4c4e435
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Fri Aug 25 13:46:56 2023 -0500

    Merge remote-tracking branch 'upstream/concedo'

commit 5eb17f0
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Fri Aug 25 13:38:21 2023 -0500

    ROCm Port update

    * use hipblas based on cublas
    * Update Makefile for the Cuda kernels
    * Expand arch list and make it overrideable
    * Fix multi GPU on multiple amd architectures with rocblas_initialize() (#5)
    * add hipBLAS to README
    * new build arg LLAMA_CUDA_MMQ_Y
    * fix half2 decomposition
    * Add intrinsics polyfills for AMD
    * AMD assembly optimized __dp4a
    * Allow overriding CC_TURING
    * use "ROCm" instead of "CUDA"
    * ignore all build dirs
    * Add Dockerfiles
    * fix llama-bench
    * fix -nommq help for non CUDA/HIP

    ---------

    Co-Authored-By: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
    Co-Authored-By: ardfork <134447697+ardfork@users.noreply.github.com>
    Co-Authored-By: funnbot <22226942+funnbot@users.noreply.github.com>
    Co-Authored-By: Engininja2 <139037756+Engininja2@users.noreply.github.com>
    Co-Authored-By: Kerfuffle <44031344+KerfuffleV2@users.noreply.github.com>
    Co-Authored-By: jammm <2500920+jammm@users.noreply.github.com>
    Co-Authored-By: jdecourval <7315817+jdecourval@users.noreply.github.com>

commit b34f4bd
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Sat Aug 19 17:12:52 2023 -0500

    Update README.md

commit 7d11961
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Mon Aug 14 23:03:12 2023 -0500

    remove force DMMV

commit cd61aa0
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Sat Aug 12 17:24:31 2023 -0500

    restore main_gpu parameter

commit 4a042f3
Author: Henri Vasserman <henv@hot.ee>
Date:   Sat Aug 12 10:51:46 2023 +0300

    gfx1100 support

    ---------

    Co-authored-by: ardfork <134447697+ardfork@users.noreply.github.com>
    Co-authored-by: jammm <2500920+jammm@users.noreply.github.com>
    Co-authored-by: jdecourval <7315817+jdecourval@users.noreply.github.com>

commit 8913bc6
Author: Henri Vasserman <henv@hot.ee>
Date:   Fri Aug 11 10:16:02 2023 +0300

    Allow overriding CC_TURING

commit e77a4c3
Author: Henri Vasserman <henv@hot.ee>
Date:   Fri Aug 11 10:00:07 2023 +0300

    Merge 'origin/master' into hipblas

commit cc4c4e3
Author: Engininja2 <139037756+Engininja2@users.noreply.github.com>
Date:   Fri Aug 11 09:43:14 2023 +0300

    New __dp4a assembly

    Now compatible with gfx900 and faster as well.

commit 1a03b70
Author: Henri Vasserman <henv@hot.ee>
Date:   Fri Aug 11 09:30:28 2023 +0300

    Undo mess

    ---------

    Co-authored-by: ardfork <134447697+ardfork@users.noreply.github.com>

commit 4366ff9
Author: DannyDaemonic <DannyDaemonic@gmail.com>
Date:   Thu Aug 10 13:11:36 2023 -0700

    Handle `ENABLE_VIRTUAL_TERMINAL_PROCESSING` more gracefully on earlier versions of Windows.

commit 811ff85
Author: Christian Demsar <crasm@git.vczf.us>
Date:   Thu Aug 10 10:28:27 2023 -0400

    Add --n-predict -2 for stopping generation on full context (ggerganov#2565)

commit 37c9717
Author: Martin Krasser <krasserm@googlemail.com>
Date:   Thu Aug 10 12:16:38 2023 +0200

    Fix grammar-based sampling issue in server (ggerganov#2566)

commit d18ecd5
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Thu Aug 10 13:19:41 2023 -0500

    make mmq gen faster for amd

commit 243894a
Author: Henri Vasserman <henv@hot.ee>
Date:   Thu Aug 10 12:14:40 2023 +0300

    ws fix

commit ac2f14d
Author: Engininja2 <139037756+Engininja2@users.noreply.github.com>
Date:   Thu Aug 10 12:11:27 2023 +0300

    AMD assembly optimized __dp4a

    Doesn't seem to work for gfx900, so commented out.

commit 9dba0c9
Author: Henri Vasserman <henv@hot.ee>
Date:   Thu Aug 10 12:09:28 2023 +0300

    Fix merge

    ---------

    Co-authored-by: ardfork <134447697+ardfork@users.noreply.github.com>
    Co-authored-by: Kerfuffle <44031344+KerfuffleV2@users.noreply.github.com>

commit f570b5c
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Wed Aug 9 22:11:20 2023 -0500

    Revert "revert cuda changes as they are bugggy"

    This reverts commit 1541bf8.

commit 1541bf8
Author: Concedo <39025047+LostRuins@users.noreply.github.com>
Date:   Wed Aug 9 22:36:41 2023 +0800

    revert cuda changes as they are bugggy

commit bacc202
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Wed Aug 9 20:37:17 2023 -0500

    Merge remote-tracking branch 'upstream/concedo'

commit b7cb4cf
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Wed Aug 9 20:00:52 2023 -0500

    additional fixes

commit fadae72
Merge: 518eb2a 8f8ab6c
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Wed Aug 9 18:45:50 2023 -0500

    Merge branch 'hipblas' into develop4Main

commit 518eb2a
Merge: bda0215 cae6a84
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Wed Aug 9 18:32:10 2023 -0500

    Merge remote-tracking branch 'upstream/concedo' into develop2Main

commit bda0215
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Wed Aug 9 18:17:54 2023 -0500

    update makefile to multisystem path

commit 8f8ab6c
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Wed Aug 9 18:05:03 2023 -0500

    hipLDFLAG Path change Unix to multisystem in Makefile

    changed the hardcoded linux distro hipblas LD path from -L/opt/rocm/lib to use the defined ROCM_PATH variable to be flexible with ROCm on non-Linux OS

commit 610ba4c
Merge: 4024f91 25d43e0
Author: Henri Vasserman <henv@hot.ee>
Date:   Wed Aug 9 23:54:58 2023 +0300

    Merge 'origin/master' into hipblas

commit 4024f91
Author: Henri Vasserman <henv@hot.ee>
Date:   Wed Aug 9 01:56:44 2023 +0300

    Add intrinsics polyfills for AMD

    ---------

    Co-authored-by: ardfork <134447697+ardfork@users.noreply.github.com>
    Co-authored-by: funnbot <22226942+funnbot@users.noreply.github.com>
    Co-authored-by: Engininja2 <139037756+Engininja2@users.noreply.github.com>

commit ab62128
Merge: d91456a f5bfea0
Author: Henri Vasserman <henv@hot.ee>
Date:   Wed Aug 9 00:37:01 2023 +0300

    Merge 'origin/master' into hipblas

commit ee9fa2a
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Wed Aug 2 01:53:58 2023 -0500

    Update Makefile

commit d91456a
Author: ardfork <134447697+ardfork@users.noreply.github.com>
Date:   Mon Jul 31 20:35:00 2023 +0300

    fix half2 decomposition

commit c1cb70d
Author: Henri Vasserman <henv@hot.ee>
Date:   Mon Jul 31 19:56:44 2023 +0300

    new build arg LLAMA_CUDA_MMQ_Y

commit c1664a0
Merge: 4336231 0728c5a
Author: Henri Vasserman <henv@hot.ee>
Date:   Mon Jul 31 19:32:27 2023 +0300

    Merge 'origin/master' into hipblas

commit 848558d
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Sun Jul 30 20:02:52 2023 -0500

    import vars logic fix

commit b650b84
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Sun Jul 30 00:21:36 2023 -0500

    Update easy_KCPP-ROCm_install.sh

commit 8573a67
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Sat Jul 29 21:31:12 2023 -0500

    remove duplicate code and fix typo

    remove duplicate tooltip

commit 430986e
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Sat Jul 29 21:07:34 2023 -0500

    hide "missing" if all are built

    move tooltip functions to helper functions section. hides the string "Missing: ..." from showing if all backends are available
    " if len(runopts)==6 else + "

commit dd0db72
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Sat Jul 29 20:52:31 2023 -0500

    hide "missing" if all are built

    move tooltip functions to helper functions section. hides the string "Missing: ..." from showing if all backends are available

commit 43fffb6
Merge: 0ed65a4 b40550c
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Sat Jul 29 19:13:15 2023 -0500

    Merge branch 'concedo'

commit 0ed65a4
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Sat Jul 29 18:34:21 2023 -0500

    Hide unavailable backends & Add tooltip over backend count

    Hides unavailable backends from the user and if the program is launched without any backends made, it shows an error message to them stating no backends were found and to make them using the 'make' command

    Add tooltip when hovering over backend count label

    hovering over the new label that shows the backend count will explain what the numbers are, and show the users which backends are not available or built

commit 2a26398
Merge: cee2e9d 31486eb
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Sat Jul 29 15:16:33 2023 -0500

    Merge remote-tracking branch 'upstream/concedo'

commit 4336231
Author: Henri Vasserman <henv@hot.ee>
Date:   Sat Jul 29 18:35:56 2023 +0300

    add hipBLAS to README

    ---------

    Co-authored-by: ardfork <134447697+ardfork@users.noreply.github.com>

commit f8e3fc6
Author: Henri Vasserman <henv@hot.ee>
Date:   Sat Jul 29 14:16:46 2023 +0300

    rocblas init stuff

commit d2ade63
Merge: cde52d6 8a88e58
Author: Henri Vasserman <henv@hot.ee>
Date:   Sat Jul 29 12:59:48 2023 +0300

    Merge 'origin/master' into hipblas

commit cee2e9d
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Wed Jul 26 23:36:55 2023 -0500

    Only Show Available Backends in GUI

    Hides unavailable backends from the user and if the program is launched without any backends made, it shows an error message to them stating no backends were found and to make them using the 'make' command

commit 7863610
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Wed Jul 26 13:27:22 2023 -0500

    Update easy_KCPP-ROCm_install.sh

commit 731cd6e
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Tue Jul 25 22:39:50 2023 -0500

    Create easy_rocm_install.sh

commit f154685
Merge: cbdc1f3 94e0a06
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Tue Jul 25 22:25:10 2023 -0500

    Merge branch 'concedo_experimentalMAIN'

commit cbdc1f3
Merge: 5b838d4 9731682
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Mon Jul 24 16:53:21 2023 -0500

    Merge remote-tracking branch 'upstream/concedo'

commit cde52d6
Merge: 8e8054a 84e09a7
Author: Henri Vasserman <henv@hot.ee>
Date:   Mon Jul 24 12:22:58 2023 +0300

    Merge 'origin/master' into hipblas

commit 8e8054a
Author: Henri Vasserman <henv@hot.ee>
Date:   Mon Jul 24 12:20:49 2023 +0300

    Add rocblas to build files

commit 1f6294d
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Mon Jul 24 03:52:01 2023 -0500

    Fix multi GPU on multiple amd architectures with rocblas_initialize() (#5)

    * initialize rocblas

commit 5b838d4
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Mon Jul 24 03:10:35 2023 -0500

    amd multigpu full layer offload w/o vram scratch

commit 9bfb2fd
Merge: b379f9d 66328fc
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Mon Jul 24 03:07:44 2023 -0500

    Merge branch 'concedo_experimental'

commit b379f9d
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Mon Jul 24 03:07:00 2023 -0500

    Revert "amd multigpu full layer offload w/o vram scratch"

    This reverts commit 9adfc8e.

commit 9adfc8e
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Mon Jul 24 02:56:40 2023 -0500

    amd multigpu full layer offload w/o vram scratch

commit 05c792e
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Mon Jul 24 00:18:48 2023 -0500

    initialize rocblas

commit ade68d0
Merge: 521ad6b 56995ca
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Sun Jul 23 20:25:05 2023 -0500

    Merge remote-tracking branch 'upstream/concedo'

commit 521ad6b
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Thu Jul 20 21:42:33 2023 -0500

    lazy import_var error handling for saves

commit 9553e52
Merge: cac6650 f036109
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Thu Jul 20 19:59:41 2023 -0500

    Merge remote-tracking branch 'upstream/concedo'

commit cac6650
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Mon Jul 17 23:05:02 2023 -0500

    Makefile fix! Allows hip/clblast build together

commit 3db70b5
Merge: 2ec4466 7568d1a
Author: Henri Vasserman <henv@hot.ee>
Date:   Tue Jul 18 01:54:17 2023 +0300

    Merge 'origin/master' into hipblas

commit f208670
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Fri Jul 14 02:56:03 2023 -0500

    improve error handling with gpu names

commit 860e738
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Fri Jul 14 00:33:03 2023 -0500

    Show GPU names in GUI, Only show GPUs that exist

    changed the pre-set 1,2,3 and 1,2,3,all settings that the GPU selector had and replaced them with a function that grabs the GPU names and sets the names as the values for the selector boxes.

commit 2ec4466
Author: Henri Vasserman <henv@hot.ee>
Date:   Thu Jul 13 13:44:02 2023 +0300

    Update build flags.

    GGML_CUDA_DMMV_Y is now GGML_CUDA_MMV_Y
    so update your build instructions.

    GGML_CUDA_FORCE_DMMV is always enabled.

    ---------

    Co-authored-by: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>

commit cd36b18
Merge: afcb8fe 1cbf561
Author: Henri Vasserman <henv@hot.ee>
Date:   Thu Jul 13 13:03:01 2023 +0300

    Merge 'origin/master' into hipblas

commit ac7ebc3
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Wed Jul 12 18:32:18 2023 -0500

    add hipBLAS name scheme to GUI and update README

commit 7f85cc5
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Wed Jul 12 17:35:54 2023 -0500

    update makefile and ggml.c

commit 6ca3499
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Wed Jul 12 15:43:45 2023 -0500

    ggml.c fix

commit 770e674
Merge: 2b289cd 5941514
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Wed Jul 12 15:24:36 2023 -0500

    Merge remote-tracking branch 'upstream/concedo'

commit 2b289cd
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Wed Jul 12 14:30:00 2023 -0500

    Update c-cpp.yml

commit 5dae95a
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Wed Jul 12 14:28:51 2023 -0500

    Update c-cpp.yml

commit b37cd73
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Wed Jul 12 14:27:04 2023 -0500

    Create c-cpp.yml to test Actions

commit afcb8fe
Author: Henri Vasserman <henv@hot.ee>
Date:   Tue Jul 11 18:09:27 2023 +0300

    Add new config option

commit 8c2c497
Merge: e610466 2347463
Author: Henri Vasserman <henv@hot.ee>
Date:   Tue Jul 11 17:53:54 2023 +0300

    Merge 'origin/master' into hipblas

commit e610466
Author: Henri Vasserman <henv@hot.ee>
Date:   Tue Jul 11 17:53:14 2023 +0300

    Expand arch list and make it overrideable

commit 80e4e54
Merge: 7735c5a 1d16309
Author: Henri Vasserman <henv@hot.ee>
Date:   Mon Jul 10 02:09:28 2023 +0300

    Merge 'origin/master' into hipblas

commit 8432e9d
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Sun Jul 9 16:55:30 2023 -0500

    Update Makefile

commit b58c189
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Sun Jul 9 16:20:00 2023 -0500

    Add multi-gpu CuBLAS support to new GUI

commit 0c1c71b
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Sat Jul 8 07:56:57 2023 -0500

    Update Makefile

commit f864f60
Author: Johannes Gäßler <johannesg@5d6.de>
Date:   Sat Jul 8 00:25:15 2023 +0200

    CUDA: add __restrict__ to mul mat vec kernels (ggerganov#2140)

commit 4539bc2
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Sat Jul 8 01:36:14 2023 -0500

    update makefile for changes

commit 912e31e
Merge: 74e2703 ddaa4f2
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Fri Jul 7 23:15:37 2023 -0500

    Merge remote-tracking branch 'upstream/concedo'

commit 74e2703
Merge: cf65429 f9108ba
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Wed Jul 5 15:16:49 2023 -0500

    Merge branch 'LostRuins:concedo' into main

commit 7735c5a
Merge: c3e3733 7ee76e4
Author: Henri Vasserman <henv@hot.ee>
Date:   Tue Jul 4 17:09:16 2023 +0300

    Merge 'origin/master' into hipblas

commit cf65429
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Mon Jul 3 16:56:40 2023 -0500

    print cuda or opencl based on what's used

commit 72c16d2
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Mon Jul 3 16:45:39 2023 -0500

    Revert "fix my mistake that broke other arches"

    This reverts commit 777aed5.

commit 777aed5
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Mon Jul 3 15:53:32 2023 -0500

    fix my mistake that broke other arches

commit 27780a9
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Sun Jul 2 16:03:27 2023 -0500

    rocm fixes

commit f52c7d4
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Sun Jul 2 16:02:58 2023 -0500

    Revert "rocm fixes"

    This reverts commit 2fe9927.

commit 2fe9927
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Sun Jul 2 15:58:21 2023 -0500

    rocm fixes

commit efe7560
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Sun Jul 2 15:55:43 2023 -0500

    Revert "move HIPBLAS definitions into ggml-cuda.h"

    This reverts commit bf49a93.

commit 4fc0181
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Sun Jul 2 15:55:36 2023 -0500

    Revert "move hipblas definitions to header files"

    This reverts commit 2741ffb.

commit 89eb576
Merge: 2741ffb 3d2907d
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Sun Jul 2 14:44:13 2023 -0500

    Merge branch 'LostRuins:concedo' into main

commit c3e3733
Author: Henri Vasserman <henv@hot.ee>
Date:   Sun Jul 2 15:51:31 2023 +0300

    ROCm fixes

commit 15db19a
Merge: 04419f1 46088f7
Author: Henri Vasserman <henv@hot.ee>
Date:   Sun Jul 2 15:39:57 2023 +0300

    Merge 'origin/master' into hipblas

commit 2741ffb
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Sat Jul 1 17:07:42 2023 -0500

    move hipblas definitions to header files

commit bf49a93
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Sat Jul 1 16:38:50 2023 -0500

    move HIPBLAS definitions into ggml-cuda.h

commit 540f4e0
Merge: 2c3b46f eda663f
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Sat Jul 1 14:58:32 2023 -0500

    Merge remote-tracking branch 'upstream/concedo'

commit 2c3b46f
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Thu Jun 29 18:43:43 2023 -0500

    changes to fix build

commit c9e1103
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Thu Jun 29 18:20:07 2023 -0500

    Update ggml_v2-cuda-legacy.cu for ROCM

commit b858fc5
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Thu Jun 29 17:49:39 2023 -0500

    changes to work with upstream

commit 69a0c25
Merge: 096f0b0 1347d3a
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Thu Jun 29 16:59:06 2023 -0500

    Merge remote-tracking branch 'upstream/concedo'

commit 04419f1
Merge: bb16eff d3494bb
Author: Henri Vasserman <henv@hot.ee>
Date:   Wed Jun 28 23:30:10 2023 +0300

    Merge 'origin/master' into hipblas

commit bb16eff
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Wed Jun 28 15:27:10 2023 -0500

    headers fix; add kquants_iter for hipblas and add gfx803 (#1)

    * kquants_iter for hipblas and add gfx803
    * Update CMakeLists.txt with hipblas kquants_iter and DMMV_F16
    * remove dmmv_f16 for now

commit 096f0b0
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Wed Jun 28 15:27:02 2023 -0500

    revert unnecessary hipblas conditionals

commit d81e81a
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Wed Jun 28 14:48:23 2023 -0500

    Update Makefile hipblas nvcc correction

commit c8ae945
Merge: c1e5c83 0be54f7
Author: Henri Vasserman <henv@hot.ee>
Date:   Tue Jun 27 10:50:37 2023 +0300

    Merge 'origin/master' into hipblas

commit 2579ecf
Merge: abed427 d2034ce
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Sun Jun 25 17:50:04 2023 -0500

    Merge branch 'LostRuins:concedo' into main

commit c1e5c83
Merge: 35a6031 447ccbe
Author: Henri Vasserman <henv@hot.ee>
Date:   Sun Jun 25 21:40:05 2023 +0300

    Merge 'origin/master' into hipblas

commit 35a6031
Merge: df7346c 66a2555
Author: Henri Vasserman <henv@hot.ee>
Date:   Sun Jun 25 10:57:48 2023 +0300

    Merge 'origin/master' into hipblas

commit abed427
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Sat Jun 24 19:16:30 2023 -0500

    reorganize If statements to include proper headers

commit 06c3bf0
Merge: ea6d320 8342fe8
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Sat Jun 24 16:57:20 2023 -0500

    Merge branch 'LostRuins:concedo' into main

commit ea6d320
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Fri Jun 23 01:53:28 2023 -0500

    Update README.md

commit 4d56ad8
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Thu Jun 22 16:19:43 2023 -0500

    Update README.md

commit 21f9308
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Thu Jun 22 15:42:05 2023 -0500

    kquants_iter for hipblas and add gfx803

commit df7346c
Merge: 5dd2fbe 7487137
Author: Henri Vasserman <henv@hot.ee>
Date:   Thu Jun 22 20:51:09 2023 +0300

    Merge 'origin/master' into hipblas

commit b6ff890
Merge: eb094f0 e6ddb15
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Thu Jun 22 12:42:09 2023 -0500

    Merge branch 'LostRuins:concedo' into main

commit eb094f0
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Wed Jun 21 23:59:18 2023 -0500

    lowvram parameter description

commit 3a5dfeb
Merge: 665cc11 b1f00fa
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Wed Jun 21 16:53:03 2023 -0500

    Merge branch 'LostRuins:concedo' into koboldcpp-rocm

commit 665cc11
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Wed Jun 21 01:13:19 2023 -0500

    add lowvram parameter

commit 222cbbb
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Tue Jun 20 19:03:28 2023 -0500

    add additional hipblas conditions for cublas

commit e1f9581
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Tue Jun 20 16:51:59 2023 -0500

    Add hip def for cuda v2

commit 3bff5c0
Merge: a7e74b3 266d47a
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Tue Jun 20 13:38:06 2023 -0500

    Merge branch 'LostRuins:concedo' into koboldcpp-rocm

commit a7e74b3
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Mon Jun 19 22:04:18 2023 -0500

    Update README.md

commit 5e99b3c
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Mon Jun 19 22:03:42 2023 -0500

    Update Makefile

commit 9190b17
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Mon Jun 19 21:47:10 2023 -0500

    Update README.md

commit 5dd2fbe
Merge: 67e229b 20568fe
Author: Henri Vasserman <henv@hot.ee>
Date:   Tue Jun 20 01:23:12 2023 +0300

    Merge 'origin/master' into hipblas

commit 2780ea2
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Sun Jun 18 15:48:00 2023 -0500

    Update Makefile

commit 04a3e64
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Sun Jun 18 14:33:39 2023 -0500

    remove extra line

commit cccbca9
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Sun Jun 18 14:31:17 2023 -0500

    attempt adding ROCM hipblas

commit a44a1d4
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Sun Jun 18 14:31:01 2023 -0500

    attempt adding ROCM hipblas

commit b088184
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Sun Jun 18 14:30:54 2023 -0500

    attempt adding ROCM hipblas

commit 67e229b
Merge: 6f7c156 b241649
Author: Henri Vasserman <henv@hot.ee>
Date:   Sun Jun 18 00:36:54 2023 +0300

    Merge 'origin/master' into hipblas

commit 6f7c156
Merge: 61df8e9 fc45a81
Author: Henri Vasserman <henv@hot.ee>
Date:   Sat Jun 17 16:53:22 2023 +0300

    Merge 'origin/master' into hipblas

commit 61df8e9
Author: Henri Vasserman <henv@hot.ee>
Date:   Wed Jun 14 22:46:10 2023 +0300

    add cudaMemset

commit a836529
Merge: 85f902d 254a7a7
Author: Henri Vasserman <henv@hot.ee>
Date:   Wed Jun 14 22:41:55 2023 +0300

    Merge 'origin/master' into hipblas

commit 85f902d
Merge: 4362e80 b50b570
Author: Henri Vasserman <henv@hot.ee>
Date:   Thu Jun 8 10:50:28 2023 +0300

    Merge 'origin/master' into hipblas

commit 4362e80
Merge: fa5b3d7 17366df
Author: Henri Vasserman <henv@hot.ee>
Date:   Tue Jun 6 23:14:40 2023 +0300

    Merge 'origin/master' into hipblas

commit fa5b3d7
Author: Henri Vasserman <henv@hot.ee>
Date:   Tue Jun 6 18:47:00 2023 +0300

    fix makefile.

commit 1ba4ce4
Author: Henri Vasserman <henv@hot.ee>
Date:   Tue Jun 6 18:41:08 2023 +0300

    Revert "warp size fixes"

    It seems like 32 is faster, for me at least, and it won't cause so many conflicts.

    This reverts commit 5d6eb72.

commit 5d6eb72
Author: Henri Vasserman <henv@hot.ee>
Date:   Tue Jun 6 18:32:41 2023 +0300

    warp size fixes

commit 33091a9
Merge: 9fdaa1d 2d43387
Author: Henri Vasserman <henv@hot.ee>
Date:   Tue Jun 6 16:19:23 2023 +0300

    Merge  'origin/master' into hipblas

commit 9fdaa1d
Author: Henri Vasserman <henv@hot.ee>
Date:   Sat May 27 19:17:53 2023 +0300

    Add more defs

    For forward compatibility ggerganov#1607

commit a4648c1
Merge: 4c8b3fb 0ecb1bb
Author: Henri Vasserman <henv@hot.ee>
Date:   Sat May 27 18:22:39 2023 +0300

    Merge 'origin/master' into hipblas

commit 4c8b3fb
Author: Henri Vasserman <henv@hot.ee>
Date:   Fri May 26 01:08:53 2023 +0300

    add configurable vars

commit 30d921a
Author: Henri Vasserman <henv@hot.ee>
Date:   Fri May 26 01:03:56 2023 +0300

    and makefile

commit a593a4f
Author: Henri Vasserman <henv@hot.ee>
Date:   Fri May 26 00:55:28 2023 +0300

    Add missing parameters

commit 174bf6a
Merge: f80ce7a 1fcdcc2
Author: Henri Vasserman <henv@hot.ee>
Date:   Fri May 26 00:44:23 2023 +0300

    Merge 'origin/master' into hipblas

commit f80ce7a
Merge: 600ace3 ac7876a
Author: Henri Vasserman <henv@hot.ee>
Date:   Thu May 25 00:02:50 2023 +0300

    Merge branch 'origin/master' into hipblas

commit 600ace3
Author: Henri Vasserman <henv@hot.ee>
Date:   Sat May 20 23:42:20 2023 +0300

    update warp size

commit b19fefe
Author: Henri Vasserman <henv@hot.ee>
Date:   Sat May 20 23:28:08 2023 +0300

    Forwardcompat

commit c66115b
Merge: a0b2d5f b8ee340
Author: Henri Vasserman <henv@hot.ee>
Date:   Sat May 20 18:29:31 2023 +0300

    Merge 'origin/master' into hipblas

commit a0b2d5f
Merge: 8bab456 2a5ee02
Author: Henri Vasserman <henv@hot.ee>
Date:   Tue May 16 17:08:29 2023 +0300

    Merge 'origin/master' into hipblas

commit 8bab456
Merge: 2956630 b5c9295
Author: Henri Vasserman <henv@hot.ee>
Date:   Mon May 15 00:01:12 2023 +0300

    Merge 'origin/master' into hipblas

commit 2956630
Merge: 0fe6384 f048af0
Author: Henri Vasserman <henv@hot.ee>
Date:   Sat May 13 13:12:52 2023 +0300

    Merge 'origin/master' into hipblas

commit 0fe6384
Author: Henri Vasserman <henv@hot.ee>
Date:   Fri May 12 17:22:11 2023 +0300

    fix makefile

commit 605560d
Merge: 127f68e 089b1c9
Author: Henri Vasserman <henv@hot.ee>
Date:   Fri May 12 16:12:53 2023 +0300

    Merge 'origin/master' into hipblas

commit 127f68e
Merge: 070cbcc b608b55
Author: Henri Vasserman <henv@hot.ee>
Date:   Thu May 11 20:21:27 2023 +0300

    Merge 'origin/master' into hipblas

commit 070cbcc
Author: Henri Vasserman <henv@hot.ee>
Date:   Sun May 7 18:10:56 2023 +0300

    occupancy function

commit a3296d5
Merge: 0aefa6a e129551
Author: Henri Vasserman <henv@hot.ee>
Date:   Sun May 7 18:06:04 2023 +0300

    Merge 'origin/master' into hipblas

commit 0aefa6a
Merge: baeb482 1b0fd45
Author: Henri Vasserman <henv@hot.ee>
Date:   Sun May 7 12:24:41 2023 +0300

    Merge 'origin/master' into hipblas

commit baeb482
Author: Henri Vasserman <henv@hot.ee>
Date:   Sun May 7 12:24:12 2023 +0300

    Revert to default copy

commit 289073a
Merge: 1107194 173d0e6
Author: Henri Vasserman <henv@hot.ee>
Date:   Sat May 6 19:59:41 2023 +0300

    Merge 'origin/master' into hipblas

commit 1107194
Merge: 04c0d48 a3b85b2
Author: Henri Vasserman <henv@hot.ee>
Date:   Sat May 6 00:38:20 2023 +0300

    Merge 'origin/master' into hipblas

commit 04c0d48
Author: Henri Vasserman <henv@hot.ee>
Date:   Thu May 4 12:31:16 2023 +0300

    Move all HIP stuff to ggml-cuda.cu

commit d83cfba
Merge: b67cc50 799fdc1
Author: Henri Vasserman <henv@hot.ee>
Date:   Thu May 4 11:31:16 2023 +0300

    Merge 'origin/master' into hipblas

commit b67cc50
Merge: fcbc262 e216aa0
Author: Henri Vasserman <henv@hot.ee>
Date:   Wed May 3 15:04:51 2023 +0300

    Merge 'origin/master' into hipblas

commit fcbc262
Merge: c73def1 f4cef87
Author: Henri Vasserman <henv@hot.ee>
Date:   Mon May 1 22:45:29 2023 +0300

    Merge 'origin/master' into hipblas

commit c73def1
Merge: d8ea75e f0d70f1
Author: Henri Vasserman <henv@hot.ee>
Date:   Sun Apr 30 18:40:42 2023 +0300

    Merge 'origin/master' into hipblas

commit d8ea75e
Merge: d194586 334637e
Author: Henri Vasserman <henv@hot.ee>
Date:   Sat Apr 29 11:25:51 2023 +0300

    Merge 'origin/master' into hipblas

commit d194586
Merge: 2ab9d11 7f15c5c
Author: Henri Vasserman <henv@hot.ee>
Date:   Fri Apr 28 23:03:52 2023 +0300

    Merge 'origin/master' into hipblas

commit 2ab9d11
Merge: 3b4a531 04aaae1
Author: Henri Vasserman <henv@hot.ee>
Date:   Fri Apr 28 16:30:05 2023 +0300

    Merge 'origin/master' into hipblas

commit 3b4a531
Merge: a1caa48 0b2da20
Author: Henri Vasserman <henv@hot.ee>
Date:   Fri Apr 28 10:08:41 2023 +0300

    Merge 'origin/master' into hipblas

commit a1caa48
Author: Henri Vasserman <henv@hot.ee>
Date:   Fri Apr 28 10:08:21 2023 +0300

    add more cuda defines

    This is so 'slaren/cuda-f16f32' would merge.

commit ecc0565
Author: Henri Vasserman <henv@hot.ee>
Date:   Fri Apr 28 01:58:27 2023 +0300

    only .cu file needs to be compiled as device

commit ef51e9e
Merge: d571d16 4afcc37
Author: Henri Vasserman <henv@hot.ee>
Date:   Wed Apr 26 12:46:26 2023 +0300

    Merge branch 'ggerganov:master' into hipblas

commit d571d16
Merge: 608aa33 dd0eabc
Author: Henri Vasserman <henv@hot.ee>
Date:   Tue Apr 25 21:15:33 2023 +0300

    Merge 'origin/master' into hipblas

commit 608aa33
Author: Henri Vasserman <henv@hot.ee>
Date:   Tue Apr 25 21:15:04 2023 +0300

    change default GPU arch to match CMake

commit 3a004b2
Author: Henri Vasserman <henv@hot.ee>
Date:   Mon Apr 24 02:24:54 2023 +0300

    add rpath

commit db7a012
Merge: 3677235 284685f
Author: Henri Vasserman <henv@hot.ee>
Date:   Sun Apr 23 21:49:28 2023 +0300

    Merge 'origin/master' into hipblas

commit 3677235
Author: Henri Vasserman <henv@hot.ee>
Date:   Sat Apr 22 23:28:00 2023 +0300

    More build file changes

commit d3e1984
Author: Henri Vasserman <henv@hot.ee>
Date:   Fri Apr 21 03:32:06 2023 +0300

    add rpath

commit 0e005f7
Author: Henri Vasserman <henv@hot.ee>
Date:   Fri Apr 21 02:13:00 2023 +0300

    Build file changes

    HIP Clang is no longer required: the CMake scripts will configure the
    needed compiler, which can be the system clang++. Other code can
    still use GCC, but CMake will force clang to do the linking.

commit 54a63c1
Author: Henri Vasserman <henv@hot.ee>
Date:   Thu Apr 20 22:19:22 2023 +0300

    Update Makefile for the Cuda kernels

commit 0fd8363
Author: Henri Vasserman <henv@hot.ee>
Date:   Thu Apr 20 02:04:00 2023 +0300

    use hipblas based on cublas

* Merge Fixes

* readme merge fix

* remove old ggmlv2 changes

* bring ggml v2_cuda up to date with AMD changes

* Revert ggml v2_cuda changes because they weren't needed

This reverts commit 3385dd4.

* avoid launching subprocesses to get device names for now, but other than that seems to be working

---------

Co-authored-by: Concedo <39025047+LostRuins@users.noreply.github.com>
Nexesenex added a commit to Nexesenex/croco.cpp that referenced this pull request Jun 14, 2024
commit 4aa091e52dcecb018f1336a655832b9cf637e62a
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Wed Jun 5 12:16:57 2024 -0500

    Delete .github/workflows/Pathtester.yml

commit c36d4cd85625368e7838350f51d2f65b069faa46
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Wed Jun 5 12:14:12 2024 -0500

    Update Pathtester.yml

commit ac44718cf7be322b11930e7a60d9b7d36ff53ba6
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Wed Jun 5 12:11:25 2024 -0500

    Update Pathtester.yml

commit 06ba55eefa05d7586a74316e6bbd8b2c803d1e0e
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Wed Jun 5 11:47:20 2024 -0500

    Update Pathtester.yml

commit 85220b560f611fd2579be74dbf7a5cc68cdb65ff
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Wed Jun 5 11:46:45 2024 -0500

    Create Pathtester.yml

commit b5f2ee7c9536d97e804fa94b90153d92400f2e93
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Tue Jun 4 22:53:15 2024 -0500

    Update CMakeLists.txt amd clang

commit 70684c78e1eb8f0f9b4458b11c2ab0bbbd07afc8
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Tue Jun 4 21:24:01 2024 -0500

    Update CMakeLists.txt

commit 7ab57963342e1ddfdba61cb682888aa13e7fe8bc
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Tue Jun 4 16:07:24 2024 -0500

    Update CMakeLists.txt switch to all llvm clang

commit e67e66e38367042381437123ef3cc516bf26ed79
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Tue Jun 4 15:41:47 2024 -0500

    Update CMakeLists.txt

commit fbfe65a81d9881bf44323cc5316167112008c167
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Tue Jun 4 12:58:37 2024 -0500

    Update makefile and cmake-rocm-windows.yml

commit b232f2ddf0f05280ce21ec2ab03e2844e14c4265
Merge: feecb41d 57894178
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Tue Jun 4 12:58:02 2024 -0500

    Merge remote-tracking branch 'upstream/concedo'

commit feecb41dfa9a4f46e467fa9e3106040cacd6868e
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Sun May 26 14:49:07 2024 -0500

    Update make_pyinstaller_exe_rocm_only.bat

commit 5fe26236fe347b3399fc70844a32b6bfafd09b17
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Sun May 26 14:48:43 2024 -0500

    Update make_pyinst_rocm_hybrid_henk_yellow.bat

commit 3bd2af2093330438c36a34964cc3b0dbdd82e68f
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Sun May 26 14:48:07 2024 -0500

    Update README.md

commit 36883c8e043837cfb8ecef84439ee0526ebb7c8d
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Sun May 26 14:35:19 2024 -0500

    Update README.md

commit 698bec72b070d4db74f31821792aeafe222656ff
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Sun May 26 14:32:06 2024 -0500

    Update README.md compilation guide

commit d2f5e1f19f321a8ed6116e32660f773e354a3d75
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Sun May 26 14:12:49 2024 -0500

    use tar not 7z in make_pyinstaller_exe_rocm_only.bat

commit 7636580b887de991254cf52432a327ad3baae5aa
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Sun May 26 13:49:04 2024 -0500

    Update make_pyinstaller_exe_rocm_only.bat

commit 98477a16e5085b6fd57905b8d75e7b89a9684a1b
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Sat May 25 16:49:33 2024 -0500

    windows is fixed??

commit 798f516b565c50aa910d8c6bf1161610907d212f
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Fri May 24 21:10:48 2024 -0500

    temp fix for windows

commit fc1958970832f9b6c1f8bdcbfb300828d1a16d10
Merge: 40aa1823 65305013
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Fri May 24 21:04:55 2024 -0500

    Merge remote-tracking branch 'upstream/concedo'

commit 40aa18232968d285c231b4ea3a03f211d78b325e
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Thu May 16 13:22:35 2024 -0500

    invalid escape

commit 60af2da0a656a991c3d83c4af095b4aa2b889175
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Thu May 16 13:02:07 2024 -0500

    more cmake changes

commit d23982b520ddda321ac6f0512916f68e942cd407
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Thu May 16 12:42:35 2024 -0500

    Update cmake-rocm-windows.yml

commit 1a39f0c95f1b761642b89f0bf2c63c5be40ca7f2
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Thu May 16 12:39:45 2024 -0500

    cmake try rocm build with clang/clang++ only

commit 206461c8f92705d0c39f911d5c78d68b64ee2bff
Merge: c729a982 702be65e
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Thu May 16 12:08:30 2024 -0500

    Merge remote-tracking branch 'upstream/concedo'

commit c729a982d43e47c5905ab60db68db2e8d91027f9
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Thu May 16 12:04:28 2024 -0500

    cmake changes to try and fix windows

commit cd0e18afe0c83bdfaf0fdc0ab2fb7ed9952d38b2
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Thu May 9 10:04:01 2024 -0500

    Update CMakeLists.txt xhip

commit e2b42c89c5d64499f2236575560ba15d0a954943
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Thu May 9 09:52:04 2024 -0500

    Update cmake-rocm-windows.yml

commit ad2780320a5d83e6a563f5826c764063a36205c0
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Thu May 9 09:51:12 2024 -0500

    set mcode object version to 4

commit cfa4cddbbd9a3f4f5e8da5df299b1436e78481d4
Merge: 9adf5c07 a3718c63
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Mon May 6 01:37:17 2024 -0500

    Merge remote-tracking branch 'upstream/concedo'

commit 9adf5c074b7ea885361aeaffddd222fe46168aed
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Wed May 1 19:58:01 2024 -0500

    Update cmake-rocm-windows.yml

commit 97844b0003cb75b09e0af55c25de6342c449b70b
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Wed May 1 18:28:34 2024 -0500

    Update CMakeLists.txt

commit d8742eb18c0a370d460a34c3aa061deb21b56359
Merge: 4a3626ee 3c2bd8aa
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Wed May 1 10:04:34 2024 -0500

    Merge remote-tracking branch 'upstream/concedo'

commit 4a3626eebf36565e93b3d5f79ebbcfdcbb815d97
Merge: 0766d6f1 81619f36
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Wed May 1 09:28:24 2024 -0500

    Merge remote-tracking branch 'upstream/concedo'

commit 0766d6f1069f48de722c1e4bbf915bd027897fde
Merge: 90b1ff19 b641d986
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Wed May 1 09:04:42 2024 -0500

    Merge remote-tracking branch 'upstream/concedo'

commit 90b1ff199655ba0eed8257007a1e80fbdad52402
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Tue Apr 23 20:57:24 2024 -0500

    Update deprecated upload-artifact to v4 cmake-rocm-windows.yml

commit c039f577f583787eebb11006a9b954f1b8aa1492
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Tue Apr 23 20:45:45 2024 -0500

    Update dependency in cmake-rocm-windows.yml

commit f123ad3f234ffbe77c75495add2e84936b07bbe7
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Tue Apr 23 19:06:01 2024 -0500

    Update cmake-rocm-windows.yml

commit eb896c1d9ccb96c33c851cd6789d15993884d03a
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Tue Apr 23 18:59:35 2024 -0500

    Update cmake-rocm-windows.yml

commit 23f663f7b63691c5c591f7b315e3706ddba611ef
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Tue Apr 23 18:52:55 2024 -0500

    Update cmakelist.txt

    Update CMakeLists.txt

commit 9d5a947f46dcdb6c87181c0fe72dee5a1ff10233
Merge: a622e1a1 593f08bb
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Mon Apr 22 01:44:15 2024 -0500

    Merge remote-tracking branch 'upstream/concedo'

commit a622e1a1771ad1b6936d27ff2b7cfc19f404e89f
Merge: d8895744 41fa4310
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Thu Apr 11 10:05:11 2024 -0500

    Merge remote-tracking branch 'upstream/concedo'

commit d8895744e9ef6cabb73dd170622df61e1c45f0ad
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Thu Apr 11 09:44:47 2024 -0500

    Delete .github/workflows/make-vulkan-windows.yml

commit 32911a21f3616664906c807e8b60b8744252d1e1
Merge: 24e9a4f4 bf320dca
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Wed Apr 10 10:26:22 2024 -0500

    update to 1.62.2

commit 24e9a4f4f049d0851953f075965595e6660a7840
Merge: 9c1707d6 df596aee
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Tue Apr 9 23:22:59 2024 -0500

    Merge remote-tracking branch 'upstream/concedo'

commit 9c1707d6d1d6115750d203d0854eaf2dc01cbf32
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Wed Mar 20 11:51:43 2024 -0500

    set pyinstaller version to 6.4.0

commit 801b5b5c05f682c9940a9dd16500bc24c7be207c
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Sun Mar 17 11:29:24 2024 -0500

    revert makefile change

    revert makefile because of a breaking change

commit 98f9388463be67361e2f1e59a343673fe9874cce
Merge: ba3f5e37 f3b76511
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Fri Mar 15 07:07:49 2024 -0500

    Merge remote-tracking branch 'upstream/concedo'

commit ba3f5e37a55a60b0e696c45f22f82ac2ef37af34
Merge: 893a1c8b f44df0e2
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Wed Mar 6 14:48:33 2024 -0600

    Merge remote-tracking branch 'upstream/concedo'

commit 893a1c8b3d58566bd4ad7c290aec9ff6a9aed361
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Tue Mar 5 07:58:04 2024 -0600

    Update easy_KCPP-ROCm_install.sh exec iss.

commit a6e7e7b6215b68b5812891843cfc0918b0be2bd4
Merge: 5acb4e24 39ae58ef
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Mon Feb 26 01:42:17 2024 -0600

    Merge remote-tracking branch 'upstream/concedo'

commit 5acb4e246fd882d036dc975055f1bca000f4a6fc
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Sun Feb 25 14:52:06 2024 -0600

    update build file to add vulkan

commit df1f77bd3f94c9034ecba719b336ebc5d045cb04
Merge: be7a9ff3 7eaf572b
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Sun Feb 25 14:50:35 2024 -0600

    Merge branch 'main' of https://github.com/YellowRoseCx/koboldcpp-rocm

commit 7eaf572bb9802bd1c5c60fd937ad3dd3b282d940
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Sun Feb 25 13:21:17 2024 -0600

    Update make-vulkan-windows.yml

commit e2d122f840a463c369e6784bc520e5c2600502be
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Sun Feb 25 13:17:57 2024 -0600

    Create make-vulkan-windows.yml

commit be7a9ff33bb441342044e929b3ab04aad34e821c
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Sun Feb 25 13:04:23 2024 -0600

    add additional tooltips

commit ce74a1f841a7fdd2cb8385eb8cfa7d162a263af2
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Sun Feb 25 13:03:54 2024 -0600

    move update checkbox

commit 27355421ebf6fa20b0fe1e1996c82f03327b2551
Merge: b9860f7e 60212827
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Sun Feb 25 12:33:59 2024 -0600

    Merge remote-tracking branch 'upstream/concedo'

commit b9860f7e1b3589459a4f15f3d3a4759631dcb3e1
Merge: ae6ece1d 1e460bb9
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Sat Feb 17 22:51:56 2024 -0600

    Merge remote-tracking branch 'upstream/concedo'

commit ae6ece1da9b35162fd2d4af1c74bacb8f90822a4
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Sat Feb 10 20:38:57 2024 -0600

    correct the build procedure

commit 3051a154ba702740de5adc862dc0fd65e243ad83
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Sat Feb 10 13:26:21 2024 -0600

    Implement Building for GFX1031/GFX1032 on Windows

    Experimental support for GFX1031 and GFX1032 architectures on ROCm for Windows

commit 744d57022201bef31d161cfc21d608634a2aa77d
Merge: d0d4c80f d1aff0e9
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Sat Feb 10 13:24:12 2024 -0600

    Merge remote-tracking branch 'upstream/concedo'

commit d0d4c80fe9b107f04dc6ab4454f6508f053c060d
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Tue Jan 30 23:24:12 2024 -0600

    gfx1031 file extraction was in wrong place

commit 1eae8ca031f4761f762e7a7ab949735a6b63ec92
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Tue Jan 30 18:10:24 2024 -0600

    Update cmake-rocm-windows.yml for 6700XT

commit e0a3aa34b6dbf0ac6f28b1b10c693780bc1241b0
Merge: 02053756 dc7ca93f
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Sat Jan 27 22:47:12 2024 -0600

    Merge pull request #30 from one-lithe-rune/windows_cmake_hipsdk_5_7_1

    Update Github build actions/cmake command lines for windows HIP SDK 5.7.1

commit 020537567b1faa0efbee1f387bad69537df04b01
Merge: 449f48e9 61ca3a0d
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Sat Jan 27 18:04:02 2024 -0600

    Merge remote-tracking branch 'upstream/concedo'

commit 449f48e98a8dcecc30d46dfa2fb585e297ccdfc8
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Tue Jan 23 03:15:41 2024 -0600

    print error on exception, only set gfx override if 1 gpu

commit dc7ca93f11be126fa9acd3611bb1aa690e3d7d03
Author: one-lithe-rune <skapusniak@lithe-runes.com>
Date:   Fri Jan 19 09:36:12 2024 +0000

    Update tests-rocm-windows & readme for HIP SDK 5.7.1

    * Update README with cmake commandline on windows for HIP SDK 5.7.1
    * Update GitHub action tests-rocm-windows.yml for HIP SDK 5.7.1

commit 13868bf674edfe6ecbc9147bddd887fc65f67b8c
Author: one-lithe-rune <skapusniak@lithe-runes.com>
Date:   Fri Jan 19 09:27:58 2024 +0000

    Update cmake-rocm-windows to set HIP_PLATFORM

    * Explicitly set HIP_PLATFORM=amd in the cmake defines, to avoid the code
    path in the HIP SDK 5.7.1 where it tries to use the broken hipconfig.bat
    supplied with the SDK.

commit 0481293488bf31bad89ebd02e868bd7315bce634
Author: one-lithe-rune <skapusniak@lithe-runes.com>
Date:   Thu Jan 18 23:43:30 2024 +0000

    Update cmake-rocm-windows workflow for HIP sdk 5.7.1

    * Bump the rocm/hip version for the windows build to 5.7.1

commit 61765f6b006d9fedcffac12f5e543ffd417f761a
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Wed Jan 10 18:12:19 2024 -0600

    Update README.md to make it better organized

commit cdb2b733ebe7a66633d0217a944d1ff59724ecb2
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Wed Jan 10 17:14:04 2024 -0600

    better way to find rocminfo

commit f976b841b7ee8d6276e5bc99da1fe5a3ee9220fe
Merge: 433a3ce6 d2bf4798
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Wed Jan 10 14:47:12 2024 -0600

    Merge remote-tracking branch 'upstream/concedo'

commit 433a3ce645cbdfab69ab7f1c67421d3c786a15fb
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Tue Jan 9 22:24:35 2024 -0600

    Auto set HSA_OVERRIDE_GFX=10.3.0 for RX6000 series

    Automatically set HSA_OVERRIDE_GFX_VERSION=10.3.0 for all AMD RX 6000 series GPUs
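
The automatic override described above could look roughly like the following. This is a minimal sketch, not the code from the commit: the function name `maybe_set_gfx_override`, the name-matching regular expression, and the choice to leave an already-set value untouched are all assumptions made for illustration.

```python
import os
import re


def maybe_set_gfx_override(gpu_names, env=None):
    """Set HSA_OVERRIDE_GFX_VERSION=10.3.0 if any GPU name looks like an RX 6000 card.

    Hypothetical helper: `gpu_names` is a list of device-name strings and
    `env` is a dict-like environment (defaults to os.environ).
    Returns True if the variable was set by this call.
    """
    env = os.environ if env is None else env
    # Assumption: respect a value the user has already chosen.
    if "HSA_OVERRIDE_GFX_VERSION" in env:
        return False
    # Assumption: RX 6000 series names contain "RX 6xxx" (e.g. "RX 6700 XT").
    if any(re.search(r"RX\s?6\d{3}", name) for name in gpu_names):
        env["HSA_OVERRIDE_GFX_VERSION"] = "10.3.0"
        return True
    return False
```

Passing a plain dict as `env` keeps the helper easy to try out without touching the real process environment.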

commit 236ac0144ae1c2fad8df5eb1223fbbff751d8aec
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Mon Jan 1 21:35:53 2024 -0600

    glob is broken with backslashes

commit f984d4ad4fad448a5c03eaf0e921206aae6f3b46
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Mon Jan 1 20:49:56 2024 -0600

    Update cmake-rocm-windows.yml

commit c96cc5ce5eb1f0be73da3875b75fdabe0b5224ee
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Mon Jan 1 20:09:46 2024 -0600

    Update glob pattern cmake-rocm-windows.yml

commit 2e41f6690cce77498d67c363f522b5dbb563968b
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Mon Jan 1 18:58:58 2024 -0600

    Create zip of kcpp_rocm files for release

commit 4f7345234e361b203fabd195cb42e08965474783
Merge: e2221526 9e0dee76
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Mon Jan 1 17:07:32 2024 -0600

    Merge remote-tracking branch 'upstream/concedo'

commit e2221526fb20c7893a525c29b00050b4d4c350c6
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Mon Jan 1 14:08:20 2024 -0600

    Create zip of kcpp_rocm files for release

commit b85d59eb23338a49de0e2e3a1fee0338aa7f41bf
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Sat Dec 23 03:14:56 2023 -0600

    set TLS min/max versions with SSL

commit 76eb691f5a866da460d972dc7e5cf151d5c8ebbe
Merge: 031c60b0 71a5afaa
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Sat Dec 23 02:21:52 2023 -0600

    Merge remote-tracking branch 'upstream/concedo'

commit 031c60b0cb73d5d6fd7bc1190ef6b4ba854fa3af
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Tue Dec 19 01:40:24 2023 -0600

    checkforupdates check YR patch level

commit 47dbaa87e28fd9c71eff50f619dcbc5525665f8f
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Tue Dec 19 00:12:34 2023 -0600

    hipBLAS auto-selection logic fix

    .kcpps hipblas option tweak
    only check for CUDevicesNames[0]!="" if CuBLAS
    Since AMD GPU info is fetched with OpenCL most of the time, requiring ``CUDevicesNames[0]!=""`` prevents hipBLAS from being selected. Only check ``CUDevicesNames[0]!=""`` if "Use CuBLAS" is in runopts.
    keep elif statement

commit f6c74d245d8c8d8fe186a954d2b4acc44a428302
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Mon Dec 18 23:16:47 2023 -0600

    windows specific text color handling

commit fc47a2712c2fddfcd61a46d403df9885ed02bb7b
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Mon Dec 18 22:36:48 2023 -0600

    Add --checkforupdates argument, fix saves

    If enabled, the --checkforupdates argument fetches the Kobold-ROCm release page once on startup via HTTPS, compares the latest version number with the current one, and notifies the user if a new version is available.
    A GUI button is shown on the Network tab.
    Fixed a bug where importing a save file wouldn't choose the hipBLAS option
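
The version comparison step of such an update check can be sketched as below. This is an illustration under stated assumptions, not the code added by the commit: the helper names `parse_version` and `is_newer` are hypothetical, the tag format is assumed to be dotted integers with optional surrounding text, and the HTTPS fetch of the release page is deliberately left out.

```python
import re
from itertools import zip_longest


def parse_version(tag):
    """Extract the integer components of a version tag, e.g. 'v1.52.2' -> (1, 52, 2)."""
    return tuple(int(n) for n in re.findall(r"\d+", tag))


def is_newer(latest, current):
    """Return True if `latest` is a strictly higher version than `current`.

    Missing components are treated as 0, so '1.52' and '1.52.0' compare equal.
    """
    pairs = zip_longest(parse_version(latest), parse_version(current), fillvalue=0)
    for new_part, old_part in pairs:
        if new_part != old_part:
            return new_part > old_part
    return False


if __name__ == "__main__":
    if is_newer("1.52.2", "1.52.1"):
        print("A new version is available.")
```

Comparing integer tuples rather than raw strings avoids the pitfall where `"1.9"` sorts after `"1.10"` lexicographically.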

commit 509ad00985c82327ad1ab630a24319dceb8a4f2b
Merge: eee005e4 ec052307
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Sun Dec 17 19:13:15 2023 -0600

    Update to 1.52.2 with 'upstream/concedo'

commit eee005e451de3c2e547ff318fe8303b3037d0e80
Merge: a2dcd330 77985879
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Thu Dec 14 22:24:42 2023 -0600

    Merge remote-tracking branch 'upstream/concedo'

commit a2dcd3306750982cb8be3b87b01642e42f80cd03
Merge: f13295b5 30675479
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Wed Dec 13 14:58:04 2023 -0600

    Merge remote-tracking branch 'upstream/concedo'

commit f13295b5454b38aa39b70d7b11c15bf2fb1fb842
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Sun Dec 10 22:43:02 2023 -0600

    CLBLAST_noavx2 should only be built on Windows

commit a2b441ae95e1fc4462883c1967e6b91726df62b1
Author: FamousM1 <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Sun Dec 10 04:59:58 2023 -0600

    Fix GUI backend count label

commit 947e75c24ea9b43817f73478463878e5827f6bab
Author: FamousM1 <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Sun Dec 10 01:50:31 2023 -0600

    change default GPUs and add ALL_AMD_GPU flag

    Workstation GPUs are no longer built by default. Added a flag to build for all AMD ROCm GPUs; use it like "make LLAMA_HIPBLAS=1 ALL_AMD_GPU=1"

commit 56c1218d935e1f488ec9d6ef9caa061bc5995264
Author: FamousM1 <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Sun Dec 10 01:49:56 2023 -0600

    Linux kcpp-ROCm pyinstaller script

    Linux pyinstaller script to compile a single-file executable of KoboldCpp-ROCm. The file size ends up at about 1.1GB because it bundles all ROCm GPU kernels and library files (this could possibly be improved).

commit c39c3577d3593350d616186cf567bc3b6da55acd
Merge: 260296f1 0ca814e5
Author: FamousM1 <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Tue Dec 5 16:09:52 2023 -0600

    Merge remote-tracking branch 'upstream/concedo'

commit 260296f1fe553860752a1545d96b2720361af9a3
Merge: 89963499 c142c563
Author: FamousM1 <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Sat Dec 2 15:14:33 2023 -0600

    Merge remote-tracking branch 'upstream/concedo'

commit 8996349943b80b8de5237613d13fb2a6f0ffe3b6
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Sat Dec 2 15:10:31 2023 -0600

    change upload file name in cmake-rocm-windows.yml!

commit d0c58d6d0cc31f079c70c80b37ad96752102c114
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Sat Dec 2 03:41:59 2023 -0600

    Update version number in cmake-rocm-windows.yml

commit fbb283a86960ff3437aef0214251bd7fd5dc3027
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Sat Dec 2 03:39:25 2023 -0600

    update build action to include clblast, openblas, and default

commit 787531d7602f9a61b84be08266b619dc7ef96605
Author: FamousM1 <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Sat Dec 2 03:37:10 2023 -0600

    force hipblas as first option

commit 5be1aa6e14eb9a6d8acdffc963019167379fecbe
Author: FamousM1 <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Sat Dec 2 02:51:54 2023 -0600

    amd gpu info fetch cleanup and fix

commit 461bd6219367459ee14b90873e9dc72a961cf33a
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Fri Dec 1 21:24:12 2023 -0600

    setup test workflow to build CLBLAST and OpenBLAS

commit f6584d56c831769f93ebce6836ac230b41a39e6c
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Fri Dec 1 21:20:53 2023 -0600

    unexpose gcc variables again

commit e8cfbca4e505534bf60cd32a6eb8ae91f432718e
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Fri Dec 1 19:55:57 2023 -0600

    expose gcc compiler

commit 0ea110c88be50029fd3f8b29de13e699a1fbd9fd
Merge: b5172fce 495bb3ab
Author: FamousM1 <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Fri Dec 1 19:48:48 2023 -0600

    Merge remote-tracking branch 'upstream/concedo'

commit b5172fce6b12735ca83f1a21b3d4aeb189d897ae
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Tue Nov 28 08:54:21 2023 -0600

    Update tests-rocm-windows.yml

commit 9944f6dfd7f0ba74eaa27fdab920d8ff0ee1cdb7
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Sun Nov 26 15:49:23 2023 -0600

    Update tests-rocm-windows.yml

commit 2aae081c803c81e8b47842e68b45cb561d7c10d3
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Sun Nov 26 14:00:32 2023 -0600

    Update tests-rocm-windows.yml

commit 6cd180d0de361747464f6ceb84c29b00f4902203
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Sun Nov 26 13:40:08 2023 -0600

    Update tests-rocm-windows.yml w/ cleaner code

commit 3a4f4b1722c0e647e441e0fefcfdf2e40e421098
Merge: b8ad62ff 56a5fa7a
Author: FamousM1 <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Sun Nov 26 03:53:06 2023 -0600

    Merge remote-tracking branch 'upstream/concedo'

commit b8ad62ffd5d175bab559f451649e20e58f5052ad
Merge: 0e02ba47 dbd15239
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Sun Nov 26 03:49:29 2023 -0600

    Merge pull request #28 from one-lithe-rune/amd_device_info_from_vulkan

    Use `vulkaninfo` for AMD gpu heuristics on Windows

commit 0e02ba473d43b6b3c3cc1446dec9cfaedd4c3426
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Sun Nov 26 03:42:54 2023 -0600

    Update make_pyinstaller_exe_rocm_only.bat

commit ec7157f2edb041fda474d048263d5ec2dda2b65b
Merge: 5f5451ea bb531e63
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Sat Nov 25 07:47:44 2023 -0600

    Merge pull request #29 from one-lithe-rune/add_packaging_requirement

    Fix `Warning, GUI failed to start: Reason: No module named 'packaging'` after making new venv

commit 5f5451ea5b84313c06f94be99f50453fe72d231a
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Sun Nov 19 14:40:29 2023 -0600

    A few autoheuristics fixes

commit bb531e635a62683097eff55619c8f5144a6d0fe0
Author: one-lithe-rune <skapusniak@lithe-runes.com>
Date:   Sun Nov 19 14:30:48 2023 +0000

    Fix 'Warning GUI failed to start' making new venv

    * Adds `packaging` to requirements.txt to clear the above error
    after creating a new venv and `pip install -r requirements.txt`

commit dbd15239582fadac68d6ba9cd9f090b1140373f3
Author: one-lithe-rune <skapusniak@lithe-runes.com>
Date:   Sun Nov 19 13:45:37 2023 +0000

    Use `vulkaninfo` for AMD gpu heuristics on Windows

    * Change exception paths for gpu heuristics function to return a
    two part tuple like the non-exception paths, rather than a three
    part one with `False` on the end. This prevents the caller from
    blowing up with an extra exception.
    * Added a path to get the AMD devices via vulkaninfo if we are on
    windows as we will not have rocminfo there even if the rocm
    SDK is installed (windows rocm SDK :shrug:), and we don't
    require the windows SDK anyway, so we can't use hipinfo either.
    * The new code path, uses the existing AMD device/memory list to get
    the memory info, and only returns devices that are on that list.
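
A minimal sketch of the behaviour described in the bullets above, not the actual koboldcpp.py code. The function names, the `KNOWN_AMD_VRAM_MB` table and its values are hypothetical. It assumes `vulkaninfo --summary` prints lines of the form `deviceName = ...`. Success and failure paths both return the same two-part tuple.

```python
import subprocess

# Hypothetical device/memory list (names and MB values are placeholders).
KNOWN_AMD_VRAM_MB = {
    "Example AMD GPU A": 16384,
    "Example AMD GPU B": 8192,
}


def parse_vulkan_summary(text, known=KNOWN_AMD_VRAM_MB):
    """Return (names, vram_mb), keeping only devices present in `known`."""
    names, vram = [], []
    for line in text.splitlines():
        key, sep, value = line.partition("=")
        if sep and key.strip() == "deviceName":
            name = value.strip()
            if name in known:
                names.append(name)
                vram.append(known[name])
    return names, vram


def amd_devices_from_vulkan(run=subprocess.run):
    """Query vulkaninfo; every path returns a two-part tuple."""
    try:
        result = run(
            ["vulkaninfo", "--summary"],
            capture_output=True, text=True, check=True, timeout=10,
        )
        return parse_vulkan_summary(result.stdout)
    except Exception:
        # Same shape as the success path, so the caller can always unpack it.
        return [], []
```

Because the shape never changes, a caller can write `names, vram = amd_devices_from_vulkan()` without special-casing errors.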

commit 8080c01780608f4779297bacb75dd56e6da3e8c5
Merge: 11aa5960 22c56f92
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Sat Nov 18 03:06:54 2023 -0600

    Merge remote-tracking branch 'upstream/concedo'

commit 11aa5960ae22e75f5af3604546e22d0cea57df19
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Sun Nov 12 20:02:41 2023 -0600

    remove nvidia print exception

commit fe137cbcab0f47dbc8f4abe01023c99b89890219
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Sun Nov 12 19:51:35 2023 -0600

    sep AMDGPU info func,+autoset_layer support in Win

    Moved the code that gets AMD GPU name and VRAM information to its own function and added a hardcoded VRAM list for Windows-supported AMD GPUs in order to add auto-layering support on Windows

commit 04afeb5e2ceba2dee3fed57261ad44fd8257bb7f
Merge: 1c690a2d a00a32e0
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Sun Nov 12 18:49:58 2023 -0600

    Merge remote-tracking branch 'upstream/concedo'

commit 1c690a2d86c250e5ee4d6415d6d6f2afa05abc07
Merge: a913ad98 21029421
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Thu Nov 9 21:43:10 2023 -0600

    Merge remote-tracking branch 'upstream/concedo'

commit a913ad98c668c97c6a873c8870bf71f75613d9ca
Merge: 46fa8454 93c4b2a9
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Mon Nov 6 00:49:56 2023 -0600

    Merge remote-tracking branch 'upstream/concedo'

commit 46fa8454ee0be99378341d83367188cd9a288e0d
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Mon Nov 6 00:20:53 2023 -0600

    big Update CMakeLists.txt

commit 27cb2ebb728d8ae12a8cdf83becd952cb93ac1e2
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Sun Nov 5 14:19:30 2023 -0600

    Update tests-rocm-windows.yml

commit 10dc997ad3aea3afe487afca6f0c37618511bcce
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Sun Nov 5 02:58:33 2023 -0600

    update version

commit de6fc03f3005d769f398de6e37f544a80a4a4383
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Sun Nov 5 01:11:57 2023 -0600

    adapt GUI changes and fetch AMD memory

commit e3cfe6bee0a106050c13fa15abfcf380eaa036c7
Merge: 06afb74b faae84ee
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Sat Nov 4 22:01:18 2023 -0500

    Merge remote-tracking branch 'upstream/concedo'

commit 06afb74b723f6b492134f85481d8c583e42e05d4
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Sat Nov 4 21:47:42 2023 -0500

    makefile updates

    separate hipBLAS compiler, update MMV_Y value, move the section that prints CXX and CC compiler name

commit acec91dd529ade581b92b069e05f224c6bb87a5e
Merge: 82675366 5f1f8a5a
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Sun Oct 22 18:31:27 2023 -0500

    Merge remote-tracking branch 'upstream/concedo'

commit 826753665082b3adcb0b5e3f155932c285fd7b10
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Sat Oct 21 02:17:34 2023 -0500

    Update cmake-rocm-windows.yml

    remove non-working 6700xt code

commit 39365067ccda10434120de6fb4090397f01804e2
Merge: 12055438 dd1d61ea
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Sat Oct 21 01:47:07 2023 -0500

    Merge remote-tracking branch 'upstream/concedo'

commit 12055438c37ea6dc471b015148d66ccfef977516
Merge: b01a449c 6e34d31c
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Tue Oct 17 17:14:37 2023 -0500

    Merge remote-tracking branch 'upstream/concedo'

commit b01a449c21018e58c549cf81f7cadef7381d1d5d
Author: one-lithe-rune <skapusniak@lithe-runes.com>
Date:   Sat Oct 7 12:48:26 2023 +0100

    Merge pull request #27 from one-lithe-rune/allow-sdk-dll-loading - Allow use of hip SDK (if installed) dlls on windows

    * If the rocm/hip sdk is installed on windows, then include the sdk
    as a potential location to load the hipBlas/rocBlas .dlls from. This
    allows running koboldcpp.py directly with python after building
    on windows, without having to build the .exe and run that or
    copy .dlls around.
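
A rough sketch of the idea, not the code from this pull request. It assumes the SDK installer sets a `HIP_PATH` environment variable and keeps its DLLs in a `bin` subfolder; both are assumptions, and the helper names are hypothetical. `os.add_dll_directory` exists only on Windows (Python 3.8+), so the sketch does nothing on other platforms.

```python
import os


def candidate_dll_dirs(app_dir, env=os.environ):
    """List directories to search for hipBLAS/rocBLAS DLLs, app folder first."""
    dirs = [app_dir]
    sdk_root = env.get("HIP_PATH")  # assumption: set by the HIP SDK installer
    if sdk_root:
        dirs.append(os.path.join(sdk_root, "bin"))  # assumption: DLLs live in bin
    return dirs


def register_dll_dirs(dirs):
    """Register existing directories with the Windows DLL loader."""
    add = getattr(os, "add_dll_directory", None)  # None on non-Windows platforms
    registered = []
    for d in dirs:
        path = os.path.abspath(d)
        if add is not None and os.path.isdir(path):
            add(path)
            registered.append(path)
    return registered
```

With the SDK folder registered as an extra search location, loading the backend library from Python can resolve its dependencies without copying the DLLs next to the script.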

commit 5934e9f29ca59f368d952d99fd101ec65e2fa69e
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Mon Oct 9 00:30:03 2023 -0500

    remove test code for PR Merge

commit 7eb8294969b952bdb694589f294d80479bc245f0
Merge: 04f5b6cf 80e53af2
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Sun Oct 8 16:59:07 2023 -0500

    Merge remote-tracking branch 'upstream/concedo'

commit 04f5b6cf26f851b30bc0c048265976da3ce7bc41
Merge: 778253e9 9d0dd7ab
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Fri Oct 6 20:47:03 2023 -0500

    Merge remote-tracking branch 'upstream/concedo'

commit 778253e9876b31b76526ffec616f6c1b3e5497d4
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Wed Oct 4 20:08:32 2023 -0500

    Update README.md exe link

commit 482e507a1f342f0a09b9c67467185bd5368fc841
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Wed Oct 4 20:06:03 2023 -0500

    Update README.md for ROCm exe and build info

commit 0c8518aa36beda12c6225a63c0dfe2d9a5fc1e6d
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Tue Oct 3 22:12:49 2023 -0500

    fix for folder misspelling

commit 9206be98454f4d882d07fc6dadcb80c28a904807
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Tue Oct 3 21:10:14 2023 -0500

    Update tests-rocm-windows.yml

commit e0f292b5a166365228184c44d806d514be8d8807
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Tue Oct 3 20:55:50 2023 -0500

    Rename tests-rocm-windows to tests-rocm-windows.yml

commit acfa1b16e54d46ddf8f14c66673fe290fd1ebd45
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Tue Oct 3 20:54:51 2023 -0500

    Create tests-rocm-windows

commit 1e10c24c58ffdfa583faa36a9dd5dbd8f208cef2
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Tue Oct 3 00:41:16 2023 -0500

    Update cmake-rocm-windows.yml

commit 424525b5a5a2ffb3c62663104e2757da7d042ef6
Merge: 1c4e4a6e 23b9d3af
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Mon Oct 2 20:07:00 2023 -0500

    Merge remote-tracking branch 'upstream/concedo'

commit 1c4e4a6e3d0b5f7f41d4a33fd0669b847dd2c137
Merge: 354fa72c bc841ec3
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Sun Oct 1 02:48:50 2023 -0500

    Merge remote-tracking branch 'upstream/concedo'

commit 354fa72c549ad4ce34baa3bde527a31877453e2d
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Fri Sep 29 10:57:37 2023 -0500

    Psutil workflow add

commit 16271e027d65539261a352083a05e7cfe77d523c
Merge: 6823315f bfc696fc
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Tue Sep 26 21:21:48 2023 -0500

    Merge remote-tracking branch 'upstream/concedo'

commit 6823315f5ee53e55dc3062ee1671e03ca6e1f745
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Mon Sep 25 04:46:12 2023 -0500

    Update cmake-rocm-windows.yml

commit ff5abe60ff291e8363aa4165bc338d274e240e58
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Mon Sep 25 04:30:45 2023 -0500

    Delete .github/workflows/c-cpp.yml

commit 11fc774d4a96bbed3be0f7573e800a8b2dc623fb
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Mon Sep 25 04:28:44 2023 -0500

    Update cmake-rocm-windows.yml

commit 9fdbb35eb289e39cf63ff93e9c2957e57046e3a0
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Mon Sep 25 03:40:05 2023 -0500

    Update code-coverage.yml

commit c9386757402a4192025e326186ff1811fb17e146
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Mon Sep 25 03:37:57 2023 -0500

    Update cmake-rocm-windows.yml

commit e63896058b85372e5301d032895f9cd748137ca1
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Mon Sep 25 02:47:18 2023 -0500

    Update CMakeLists.txt

commit b067ef7c8ccba2f3d16c3ed01b9973d475d5c8b1
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Mon Sep 25 02:32:15 2023 -0500

    Update cmake-rocm-windows.yml

commit 7df43a1e9af6a807dd7bd0ebf4bc95cc90749784
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Mon Sep 25 02:02:45 2023 -0500

    Update cmake-rocm-windows.yml

commit da825d8b41031879f6271c8a2b194e10320d78ed
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Sun Sep 24 22:57:27 2023 -0500

    Update cmake-rocm-windows.yml

commit 887814d9c0af3271e79da31a934a712b26106757
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Sun Sep 24 21:12:12 2023 -0500

    Update cmake-rocm-windows.yml

commit 491b8626720e14c91ebee871f955e6aba1751d5c
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Sun Sep 24 20:37:03 2023 -0500

    Update cmake-rocm-windows.yml

commit 38fce3030afabd009f64ced3c33a46b80d482fcc
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Sun Sep 24 20:21:09 2023 -0500

    Update cmake-rocm-windows.yml

commit 3e1db6900e1a080daac5c6fc7750972be04aeaa8
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Sun Sep 24 19:55:10 2023 -0500

    Update cmake-rocm-windows.yml

commit 81063b3c56708243cc47063b48a381b9c1e7a88e
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Sun Sep 24 19:48:15 2023 -0500

    Update cmake-rocm-windows.yml

commit 32092c0055611c3dc7bfd9fd7a0636f4ad30c7b6
Merge: a45ed4f9 14295922
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Sat Sep 23 14:07:04 2023 -0500

    Merge remote-tracking branch 'upstream/concedo'

commit a45ed4f9920c23503a60f0e0d66366138b8206bd
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Thu Sep 21 15:48:53 2023 -0500

    Update cmake-rocm-windows.yml

commit 2c8269239da521c6cfce21aca6b7cf7bb1941d47
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Thu Sep 21 14:48:27 2023 -0500

    Update cmake-rocm-windows.yml

commit a89ddd3edd14258278e805cfade55af3a23be864
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Thu Sep 21 14:08:28 2023 -0500

    Update cmake-rocm-windows.yml

commit 13924e71102c467fb56c4802305ab9d3c53e431c
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Thu Sep 21 13:18:05 2023 -0500

    Create cmake-rocm-windows.yml

commit 0623f597f33d2fcfc55bc7134294b443e63bde76
Merge: efdd0fc3 712b8423
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Wed Sep 20 13:51:31 2023 -0500

    Merge remote-tracking branch 'upstream/concedo'

commit efdd0fc391b747ce023fc92aaf9c7324959217cd
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Sat Sep 16 03:22:39 2023 -0500

    update version & fix pyinstaller files

commit 7ceec2faaf62f49928685d26b3ef3407e20e467d
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Sat Sep 16 01:18:05 2023 -0500

    fixes & add rocm_only exe build file

commit afc6f1008e77084e2c14bd94f44f0d768ca9e84a
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Fri Sep 15 19:47:10 2023 -0500

    Separate CuBLAS/hipBLAS

commit 44e952dd01791dfff9df259e4f084a546683d5bf
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Thu Sep 14 15:26:51 2023 -0500

    Update easy_KCPP-ROCm_install.sh

    Remove openblas and CLBlast from this file

commit 262b1c6f2d3fd177070887abc2db900069471429
Author: Johannes Gäßler <johannesg@5d6.de>
Date:   Wed Sep 13 11:20:24 2023 +0200

    CUDA: mul_mat_q RDNA2 tunings (#2910)

    * CUDA: mul_mat_q RDNA2 tunings

    * Update ggml-cuda.cu

    Co-authored-by: Henri Vasserman <henv@hot.ee>

    ---------

    Co-authored-by: Henri Vasserman <henv@hot.ee>

commit e442209c5edf4ed815b78d2289a92e40241affe1
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Wed Sep 13 05:23:56 2023 -0500

    Update koboldcpp.py

commit b5fb4ad6eb043bc5459b57326b657d98526f818a
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Wed Sep 13 04:00:28 2023 -0500

    Update .gitignore

commit 29d2cca0b88640d09a4db823026708c51fff79f0
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Wed Sep 13 03:07:27 2023 -0500

    changes for windows

commit 5dc22a3b685d03d3a963e7ed3423b52b423b284b
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Wed Sep 13 01:30:16 2023 -0500

    cmake update

commit 2a8ffecfccf2add9374767abc7e41224668b43bb
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Wed Sep 13 00:15:45 2023 -0500

    Windows ROCm Support

commit 3d9a25bf37e43997ccf75186bfc722b00472ed08
Merge: 0242966f 2dc96687
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Thu Sep 7 20:45:34 2023 -0500

    Merge remote-tracking branch 'upstream/concedo'

commit 0242966f8667c34dc0d631093f7f251f67f66921
Author: kalomaze <66376113+kalomaze@users.noreply.github.com>
Date:   Tue Sep 5 16:46:30 2023 -0500

    Proposed streaming improvements

commit 7aca9b4f4f8ad76141d7651e14110b2f8a7a30a1
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Fri Sep 1 03:23:09 2023 -0500

    better dmmv values

commit ffe2ad3628e7e418e664e937884ab3fcfe496124
Merge: da824e9a b6914ebd
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Thu Aug 31 01:00:42 2023 -0500

    Merge remote-tracking branch 'upstream/concedo'

commit da824e9a9226c8137d0233d9a0a0bb818d3b09c3
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Tue Aug 29 17:26:29 2023 -0500

    Update Makefile for misc amd gpu targeting

    adds the hipBlas gpu_target ``$(shell $(ROCM_PATH)/llvm/bin/amdgpu-arch)`` back to the gpu_target line, possibly allowing misc GPU archs such as gfx1031 or gfx1032 to be built

    works in conjunction with the preset targets

commit 48e2d0dce7532cdd80b53093e88d19fe7af022bf
Merge: 0d17408d cf5d9180
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Mon Aug 28 17:52:43 2023 -0500

    Merge branch 'concedo_experimental'

commit 0d17408d4128d996d58517aced644fab0d0a2098
Merge: 3416c986 9d5b4238
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Mon Aug 28 17:46:31 2023 -0500

    Merge remote-tracking branch 'upstream/concedo'

commit 3416c986d9d9a31c3cdefd7e7bd4d9438d72ba35
Merge: 5eb17f02 4c4e4358
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Fri Aug 25 13:46:56 2023 -0500

    Merge remote-tracking branch 'upstream/concedo'

commit 5eb17f02c8638e003bb91bddf95ccf54d2ad0c12
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Fri Aug 25 13:38:21 2023 -0500

    ROCm Port update

    * use hipblas based on cublas
    * Update Makefile for the Cuda kernels
    * Expand arch list and make it overrideable
    * Fix multi GPU on multiple amd architectures with rocblas_initialize() (#5)
    * add hipBLAS to README
    * new build arg LLAMA_CUDA_MMQ_Y
    * fix half2 decomposition
    * Add intrinsics polyfills for AMD
    * AMD assembly optimized __dp4a
    * Allow overriding CC_TURING
    * use "ROCm" instead of "CUDA"
    * ignore all build dirs
    * Add Dockerfiles
    * fix llama-bench
    * fix -nommq help for non CUDA/HIP

    ---------

    Co-Authored-By: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
    Co-Authored-By: ardfork <134447697+ardfork@users.noreply.github.com>
    Co-Authored-By: funnbot <22226942+funnbot@users.noreply.github.com>
    Co-Authored-By: Engininja2 <139037756+Engininja2@users.noreply.github.com>
    Co-Authored-By: Kerfuffle <44031344+KerfuffleV2@users.noreply.github.com>
    Co-Authored-By: jammm <2500920+jammm@users.noreply.github.com>
    Co-Authored-By: jdecourval <7315817+jdecourval@users.noreply.github.com>

commit b34f4bd2724733e188ec4f6074042f66a5ed28c9
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Sat Aug 19 17:12:52 2023 -0500

    Update README.md

commit 7d1196108ad330b32845546fb3472c2172a0b6b8
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Mon Aug 14 23:03:12 2023 -0500

    remove force DMMV

commit cd61aa0d9e16627935c7978adf488a679ddfa745
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Sat Aug 12 17:24:31 2023 -0500

    restore main_gpu parameter

commit 4a042f326830271a4c31104051b7b08e08ac234e
Author: Henri Vasserman <henv@hot.ee>
Date:   Sat Aug 12 10:51:46 2023 +0300

    gfx1100 support

    ---------

    Co-authored-by: ardfork <134447697+ardfork@users.noreply.github.com>
    Co-authored-by: jammm <2500920+jammm@users.noreply.github.com>
    Co-authored-by: jdecourval <7315817+jdecourval@users.noreply.github.com>

commit 8913bc6fea97d3cb860937b0461f455c6abe3ea1
Author: Henri Vasserman <henv@hot.ee>
Date:   Fri Aug 11 10:16:02 2023 +0300

    Allow overriding CC_TURING

commit e77a4c37a756c002e97173f4122e088fb304e18a
Author: Henri Vasserman <henv@hot.ee>
Date:   Fri Aug 11 10:00:07 2023 +0300

    Merge 'origin/master' into hipblas

commit cc4c4e355cd553b1557d5fba2562e824db93f9b4
Author: Engininja2 <139037756+Engininja2@users.noreply.github.com>
Date:   Fri Aug 11 09:43:14 2023 +0300

    New __dp4a assembly

    Now compatible with gfx900 and faster as well.

commit 1a03b709848ce68d5bf5966237756167e2cac540
Author: Henri Vasserman <henv@hot.ee>
Date:   Fri Aug 11 09:30:28 2023 +0300

    Undo mess

    ---------

    Co-authored-by: ardfork <134447697+ardfork@users.noreply.github.com>

commit 4366ff9ba1b1f12e494118ef9b5198479022fcc5
Author: DannyDaemonic <DannyDaemonic@gmail.com>
Date:   Thu Aug 10 13:11:36 2023 -0700

    Handle `ENABLE_VIRTUAL_TERMINAL_PROCESSING` more gracefully on earlier versions of Windows.

commit 811ff855a24323cafddc95c1b8aca711fef05f76
Author: Christian Demsar <crasm@git.vczf.us>
Date:   Thu Aug 10 10:28:27 2023 -0400

    Add --n-predict -2 for stopping generation on full context (#2565)

commit 37c9717aaa6815b6a5be21aaab970212f20fe6bf
Author: Martin Krasser <krasserm@googlemail.com>
Date:   Thu Aug 10 12:16:38 2023 +0200

    Fix grammar-based sampling issue in server (#2566)

commit d18ecd5b9e5dde58ae08a3eef1637406159ddaca
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Thu Aug 10 13:19:41 2023 -0500

    make mmq gen faster for amd

commit 243894a952147a4fac5b6aee748861a0df6cc2c6
Author: Henri Vasserman <henv@hot.ee>
Date:   Thu Aug 10 12:14:40 2023 +0300

    ws fix

commit ac2f14da445ea87d73539adbd29d19ff2c9eba58
Author: Engininja2 <139037756+Engininja2@users.noreply.github.com>
Date:   Thu Aug 10 12:11:27 2023 +0300

    AMD assembly optimized __dp4a

    Doesn't seem to work for gfx900, so commented out.

commit 9dba0c985f140ddded8cbb671f139e81fff82eed
Author: Henri Vasserman <henv@hot.ee>
Date:   Thu Aug 10 12:09:28 2023 +0300

    Fix merge

    ---------

    Co-authored-by: ardfork <134447697+ardfork@users.noreply.github.com>
    Co-authored-by: Kerfuffle <44031344+KerfuffleV2@users.noreply.github.com>

commit f570b5cb1070591527a82d94bba408927b37778d
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Wed Aug 9 22:11:20 2023 -0500

    Revert "revert cuda changes as they are bugggy"

    This reverts commit 1541bf879772aeeed8ff646bfc52185c2a88b79b.

commit 1541bf879772aeeed8ff646bfc52185c2a88b79b
Author: Concedo <39025047+LostRuins@users.noreply.github.com>
Date:   Wed Aug 9 22:36:41 2023 +0800

    revert cuda changes as they are bugggy

commit bacc20203efb1839aa313858a04d75255bb4b7f4
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Wed Aug 9 20:37:17 2023 -0500

    Merge remote-tracking branch 'upstream/concedo'

commit b7cb4cfd109986bd66e8fd382d1e2516eaddfebb
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Wed Aug 9 20:00:52 2023 -0500

    additional fixes

commit fadae727baa3735ad3e0667384d6e05ca056b3ef
Merge: 518eb2af 8f8ab6c4
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Wed Aug 9 18:45:50 2023 -0500

    Merge branch 'hipblas' into develop4Main

commit 518eb2af9225f8300a108c4244c7eb0a2217c3bc
Merge: bda0215b cae6a847
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Wed Aug 9 18:32:10 2023 -0500

    Merge remote-tracking branch 'upstream/concedo' into develop2Main

commit bda0215b413bafc49890aa23fc35f96a191fb3e0
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Wed Aug 9 18:17:54 2023 -0500

    update makefile to multisystem path

commit 8f8ab6c4c049df501e9a5ed8fef3aa0fc0691421
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Wed Aug 9 18:05:03 2023 -0500

    hipLDFLAG Path change Unix to multisystem in Makefile

    changed the hardcoded linux distro hipblas LD path from -L/opt/rocm/lib to use the defined ROCM_PATH variable to be flexible with ROCm on non-Linux OS

commit 610ba4cfc460ed65c4adc32d3365a216690384d5
Merge: 4024f91a 25d43e0e
Author: Henri Vasserman <henv@hot.ee>
Date:   Wed Aug 9 23:54:58 2023 +0300

    Merge 'origin/master' into hipblas

commit 4024f91a665d83b6de8658d45ec9d004c5d90c79
Author: Henri Vasserman <henv@hot.ee>
Date:   Wed Aug 9 01:56:44 2023 +0300

    Add intrinsics polyfills for AMD

    ---------

    Co-authored-by: ardfork <134447697+ardfork@users.noreply.github.com>
    Co-authored-by: funnbot <22226942+funnbot@users.noreply.github.com>
    Co-authored-by: Engininja2 <139037756+Engininja2@users.noreply.github.com>

commit ab6212864ce8e9af200bcedb3e0126ee49aa8d0a
Merge: d91456aa f5bfea05
Author: Henri Vasserman <henv@hot.ee>
Date:   Wed Aug 9 00:37:01 2023 +0300

    Merge 'origin/master' into hipblas

commit ee9fa2aca4f2e6645b99702935b34a5f8ec8f05d
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Wed Aug 2 01:53:58 2023 -0500

    Update Makefile

commit d91456aaf138566fa0aa3d507964049c8a09499b
Author: ardfork <134447697+ardfork@users.noreply.github.com>
Date:   Mon Jul 31 20:35:00 2023 +0300

    fix half2 decomposition

commit c1cb70d64d307d3fd9b7b9f61bb574e36520499a
Author: Henri Vasserman <henv@hot.ee>
Date:   Mon Jul 31 19:56:44 2023 +0300

    new build arg LLAMA_CUDA_MMQ_Y

commit c1664a00ae98059df863a88cbcb13eeca3025742
Merge: 4336231a 0728c5a8
Author: Henri Vasserman <henv@hot.ee>
Date:   Mon Jul 31 19:32:27 2023 +0300

    Merge 'origin/master' into hipblas

commit 848558d7d95a5036ac057efdefa9b2a2e6fb61b7
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Sun Jul 30 20:02:52 2023 -0500

    import vars logic fix

commit b650b849d52aac65364558521f76e75ded7ea590
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Sun Jul 30 00:21:36 2023 -0500

    Update easy_KCPP-ROCm_install.sh

commit 8573a67a29e813d82e7f032912a8c221cd199505
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Sat Jul 29 21:31:12 2023 -0500

    remove duplicate code and fix typo

    remove duplicate tooltip

commit 430986e3f68f599fd7a11ea4b2b8e45ef33da643
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Sat Jul 29 21:07:34 2023 -0500

    hide "missing" if all are built

    move tooltip functions to helper functions section. hides the string "Missing: ..." from showing if all backends are available
    " if len(runopts)==6 else + "

commit dd0db7265dbc0b0699ca861291006808b662b0e4
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Sat Jul 29 20:52:31 2023 -0500

    hide "missing" if all are built

    move tooltip functions to helper functions section. hides the string "Missing: ..." from showing if all backends are available

commit 43fffb66d8a30cbd776c3682f8a104c3644206b1
Merge: 0ed65a44 b40550cf
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Sat Jul 29 19:13:15 2023 -0500

    Merge branch 'concedo'

commit 0ed65a44a5fdb529611730f276a4b910cbf70ae0
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Sat Jul 29 18:34:21 2023 -0500

    Hide unavailable backends & Add tooltip over backend count

    Hides unavailable backends from the user. If the program is launched without any backends built, it shows an error message stating that no backends were found and that they should be built using the 'make' command

    Add tooltip when hovering over backend count label

    hovering over the new label that shows the backend count will explain what the numbers are, and show the users which backends are not available or built
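
The behaviour described in this commit might look roughly like the sketch below. It is not the GUI code itself: the function names, the label wording and the library file names in the usage example are hypothetical. A backend counts as available when its library file exists.

```python
import os


def available_backends(backend_libs, exists=os.path.isfile):
    """Split backends into (found, missing) by checking for their library files."""
    found = [name for name, lib in backend_libs.items() if exists(lib)]
    missing = [name for name in backend_libs if name not in found]
    return found, missing


def backend_label(found, missing):
    """Build the count label; the 'Missing' part is omitted when nothing is missing."""
    if not found:
        return "No backends found. Build them with the 'make' command."
    label = f"{len(found)}/{len(found) + len(missing)} backends available"
    if missing:
        label += " (Missing: " + ", ".join(missing) + ")"
    return label
```

For example, with a hypothetical mapping `{"OpenBLAS": "a.so", "hipBLAS": "b.so"}` where only `a.so` exists, the label lists hipBLAS as missing. When every library is present, the "Missing" suffix is dropped.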

commit 2a263983ab35024a95c411995963182ada06ed6f
Merge: cee2e9d7 31486ebc
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Sat Jul 29 15:16:33 2023 -0500

    Merge remote-tracking branch 'upstream/concedo'

commit 4336231a32a0c6168da5d79801752289622e9e58
Author: Henri Vasserman <henv@hot.ee>
Date:   Sat Jul 29 18:35:56 2023 +0300

    add hipBLAS to README

    ---------

    Co-authored-by: ardfork <134447697+ardfork@users.noreply.github.com>

commit f8e3fc6c746b37d69656fb5ae6af8e411d85dbca
Author: Henri Vasserman <henv@hot.ee>
Date:   Sat Jul 29 14:16:46 2023 +0300

    rocblas init stuff

commit d2ade639f4339e786311effb3eafca8bfc360d56
Merge: cde52d6a 8a88e585
Author: Henri Vasserman <henv@hot.ee>
Date:   Sat Jul 29 12:59:48 2023 +0300

    Merge 'origin/master' into hipblas

commit cee2e9d76740fd8e8f50b612078f3e7658460f29
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Wed Jul 26 23:36:55 2023 -0500

    Only Show Available Backends in GUI

    Hides unavailable backends from the user. If the program is launched without any backends built, it shows an error message stating that no backends were found and that they should be built using the 'make' command

commit 78636109fc2ded79ee3e9a44d2e3c2d63a8de70e
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Wed Jul 26 13:27:22 2023 -0500

    Update easy_KCPP-ROCm_install.sh

commit 731cd6e2ab9bb722e211142bb633e7018ccdb31b
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Tue Jul 25 22:39:50 2023 -0500

    Create easy_rocm_install.sh

commit f154685bbdc79b5ace752fbc179e32f2f7806bdb
Merge: cbdc1f3f 94e0a06d
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Tue Jul 25 22:25:10 2023 -0500

    Merge branch 'concedo_experimentalMAIN'

commit cbdc1f3fb91969e79bc8640e0cebfc3247e200df
Merge: 5b838d47 9731682a
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Mon Jul 24 16:53:21 2023 -0500

    Merge remote-tracking branch 'upstream/concedo'

commit cde52d6a63f13f46d6403cc2957f4b4c34ddf4e2
Merge: 8e8054ad 84e09a7d
Author: Henri Vasserman <henv@hot.ee>
Date:   Mon Jul 24 12:22:58 2023 +0300

    Merge 'origin/master' into hipblas

commit 8e8054ad83e794b261914ad4f337d43e2c76882d
Author: Henri Vasserman <henv@hot.ee>
Date:   Mon Jul 24 12:20:49 2023 +0300

    Add rocblas to build files

commit 1f6294dc4473701b5be791d47e4b3733f95dbc0a
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Mon Jul 24 03:52:01 2023 -0500

    Fix multi GPU on multiple amd architectures with rocblas_initialize() (#5)

    * initialize rocblas

commit 5b838d47874536ebffc2f6cb25877e0476a9402d
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Mon Jul 24 03:10:35 2023 -0500

    amd multigpu full layer offload w/o vram scratch

commit 9bfb2fdd68000670bda85c4e9748d72f5af09764
Merge: b379f9d6 66328fcd
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Mon Jul 24 03:07:44 2023 -0500

    Merge branch 'concedo_experimental'

commit b379f9d6fac570c220c928ff5f4ba4ed1ca7c051
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Mon Jul 24 03:07:00 2023 -0500

    Revert "amd multigpu full layer offload w/o vram scratch"

    This reverts commit 9adfc8e33f7116d6ae2e0992920733f783b70d08.

commit 9adfc8e33f7116d6ae2e0992920733f783b70d08
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Mon Jul 24 02:56:40 2023 -0500

    amd multigpu full layer offload w/o vram scratch

commit 05c792e622a1d9838f9343e04f79ddf2bb63ae96
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Mon Jul 24 00:18:48 2023 -0500

    initialize rocblas

commit ade68d09d7b63d3344e18b6193043b378671eb12
Merge: 521ad6b5 56995caa
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Sun Jul 23 20:25:05 2023 -0500

    Merge remote-tracking branch 'upstream/concedo'

commit 521ad6b5cb2a107ad7b972025aeb0f353e0cac67
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Thu Jul 20 21:42:33 2023 -0500

    lazy import_var error handling for saves

commit 9553e52e7e4eabe46312729f6c4effeef6390df7
Merge: cac66507 f0361091
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Thu Jul 20 19:59:41 2023 -0500

    Merge remote-tracking branch 'upstream/concedo'

commit cac6650754502208abfead61ba169fefc5ae84ac
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Mon Jul 17 23:05:02 2023 -0500

    Makefile fix! Allows hip/clblast build together

commit 3db70b5f0a1a4a1207041ddc5f2c5e25306bad4d
Merge: 2ec4466d 7568d1a2
Author: Henri Vasserman <henv@hot.ee>
Date:   Tue Jul 18 01:54:17 2023 +0300

    Merge 'origin/master' into hipblas

commit f208670ffb6cdbb1e225adfb2fd80a67a6dc5055
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Fri Jul 14 02:56:03 2023 -0500

    improve error handling with gpu names

commit 860e73845f61fe0afb6a26cc8054d8be1f9e3669
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Fri Jul 14 00:33:03 2023 -0500

    Show GPU names in GUI, Only show GPUs that exist

    changed the pre-set 1,2,3 and 1,2,3,all settings that the GPU selector had and replaced them with a function that grabs the GPU names and sets the names as the values for the selector boxes.

commit 2ec4466db54fd2f42f2ab7713cc1061e0cf59bf3
Author: Henri Vasserman <henv@hot.ee>
Date:   Thu Jul 13 13:44:02 2023 +0300

    Update build flags.

    GGML_CUDA_DMMV_Y is now GGML_CUDA_MMV_Y
    so update your build instructions.

    GGML_CUDA_FORCE_DMMV is always enabled.

    ---------

    Co-authored-by: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>

commit cd36b185ff6de91abbfd1b80366dd79a1303a878
Merge: afcb8fe0 1cbf5614
Author: Henri Vasserman <henv@hot.ee>
Date:   Thu Jul 13 13:03:01 2023 +0300

    Merge 'origin/master' into hipblas

commit ac7ebc3ac1deedfbc2940443b26774f1b4c85fae
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Wed Jul 12 18:32:18 2023 -0500

    add hipBLAS name scheme to GUI and update README

commit 7f85cc5ac30f2f300ca817a489ef209c995c634b
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Wed Jul 12 17:35:54 2023 -0500

    update makefile and ggml.c

commit 6ca3499275ba168320424f06ab3301ec329a6a83
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Wed Jul 12 15:43:45 2023 -0500

    ggml.c fix

commit 770e674aa5b2a1a9ffff2888a12e27b04ccfc7ef
Merge: 2b289cde 5941514e
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Wed Jul 12 15:24:36 2023 -0500

    Merge remote-tracking branch 'upstream/concedo'

commit 2b289cde558310c6c67dfc8d508c04e634595716
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Wed Jul 12 14:30:00 2023 -0500

    Update c-cpp.yml

commit 5dae95a9bb486c7f720789dffde1cfb470bffce0
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Wed Jul 12 14:28:51 2023 -0500

    Update c-cpp.yml

commit b37cd738c84debb53b149f5a9fb73de958f263fd
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Wed Jul 12 14:27:04 2023 -0500

    Create c-cpp.yml to test Actions

commit afcb8fe0c4f5e918422ea41d08824653d58575ed
Author: Henri Vasserman <henv@hot.ee>
Date:   Tue Jul 11 18:09:27 2023 +0300

    Add new config option

commit 8c2c4978a32d671253809d8f0f09d98af2dd18ab
Merge: e6104663 23474632
Author: Henri Vasserman <henv@hot.ee>
Date:   Tue Jul 11 17:53:54 2023 +0300

    Merge 'origin/master' into hipblas

commit e610466307abc8f8bae641682ab3f91dbc33930e
Author: Henri Vasserman <henv@hot.ee>
Date:   Tue Jul 11 17:53:14 2023 +0300

    Expand arch list and make it overrideable

commit 80e4e548bfbace2a966a58cb57dd1720ad7216b2
Merge: 7735c5a9 1d163099
Author: Henri Vasserman <henv@hot.ee>
Date:   Mon Jul 10 02:09:28 2023 +0300

    Merge 'origin/master' into hipblas

commit 8432e9d5dc8d080535243467f8d380271e8d9489
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Sun Jul 9 16:55:30 2023 -0500

    Update Makefile

commit b58c1893fa839c0f35df96f6a8b026a7f2576762
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Sun Jul 9 16:20:00 2023 -0500

    Add multi-gpu CuBLAS support to new GUI

commit 0c1c71b9927127b45030fe88283dfbdd23853d34
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Sat Jul 8 07:56:57 2023 -0500

    Update Makefile

commit f864f60cd8e563e2594cee5a7da7e9aebed494f9
Author: Johannes Gäßler <johannesg@5d6.de>
Date:   Sat Jul 8 00:25:15 2023 +0200

    CUDA: add __restrict__ to mul mat vec kernels (#2140)

commit 4539bc2761a7a23b588b5420b9d3fd1962ff63e5
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Sat Jul 8 01:36:14 2023 -0500

    update makefile for changes

commit 912e31ec523eac9ef308f0d28bc2d93aab7c3ecb
Merge: 74e2703a ddaa4f2a
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Fri Jul 7 23:15:37 2023 -0500

    Merge remote-tracking branch 'upstream/concedo'

commit 74e2703ac3b1557f107e540657d0919db115f913
Merge: cf65429c f9108ba4
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Wed Jul 5 15:16:49 2023 -0500

    Merge branch 'LostRuins:concedo' into main

commit 7735c5a9af58f6713b54fd5a4b6463f3b116d44d
Merge: c3e3733c 7ee76e45
Author: Henri Vasserman <henv@hot.ee>
Date:   Tue Jul 4 17:09:16 2023 +0300

    Merge 'origin/master' into hipblas

commit cf65429c3832d32a8c17c7ed5ab47066d7511fbe
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Mon Jul 3 16:56:40 2023 -0500

    print cuda or opencl based on what's used

commit 72c16d2310b2e4c44018e2084aeb79e68c0b8709
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Mon Jul 3 16:45:39 2023 -0500

    Revert "fix my mistake that broke other arches"

    This reverts commit 777aed5e69e240a54e7d3da962d8520855f072b9.

commit 777aed5e69e240a54e7d3da962d8520855f072b9
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Mon Jul 3 15:53:32 2023 -0500

    fix my mistake that broke other arches

commit 27780a987a8dabb18689038c0397e16f2f219c7e
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Sun Jul 2 16:03:27 2023 -0500

    rocm fixes

commit f52c7d439770c1ea0bebc1f895b74d6aeea5f0a6
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Sun Jul 2 16:02:58 2023 -0500

    Revert "rocm fixes"

    This reverts commit 2fe9927353a1e53353623f850d3d534da88f5154.

commit 2fe9927353a1e53353623f850d3d534da88f5154
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Sun Jul 2 15:58:21 2023 -0500

    rocm fixes

commit efe7560c83a497f5e750bbe27922babd4233bda9
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Sun Jul 2 15:55:43 2023 -0500

    Revert "move HIPBLAS definitions into ggml-cuda.h"

    This reverts commit bf49a93d63f833b7871ba6e60f8fe207562678ee.

commit 4fc0181e44685019dcd309d4bb345cac7a5fef87
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Sun Jul 2 15:55:36 2023 -0500

    Revert "move hipblas definitions to header files"

    This reverts commit 2741ffb70464a71fd138484de4b41da05622e027.

commit 89eb576f2771bd81a3a6274348b47535dfdd5f63
Merge: 2741ffb7 3d2907d2
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Sun Jul 2 14:44:13 2023 -0500

    Merge branch 'LostRuins:concedo' into main

commit c3e3733c61f7705ea00fd593ee94527da8c12f1b
Author: Henri Vasserman <henv@hot.ee>
Date:   Sun Jul 2 15:51:31 2023 +0300

    ROCm fixes

commit 15db19ae7b70d2a6350063e633b898a89ad78cbc
Merge: 04419f18 46088f72
Author: Henri Vasserman <henv@hot.ee>
Date:   Sun Jul 2 15:39:57 2023 +0300

    Merge 'origin/master' into hipblas

commit 2741ffb70464a71fd138484de4b41da05622e027
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Sat Jul 1 17:07:42 2023 -0500

    move hipblas definitions to header files

commit bf49a93d63f833b7871ba6e60f8fe207562678ee
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Sat Jul 1 16:38:50 2023 -0500

    move HIPBLAS definitions into ggml-cuda.h

commit 540f4e05f4e95378f46a83e2919d3962c0ef9eac
Merge: 2c3b46f8 eda663f1
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Sat Jul 1 14:58:32 2023 -0500

    Merge remote-tracking branch 'upstream/concedo'

commit 2c3b46f8a80ca9d94b2d3d06e1af6b6f7b791914
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Thu Jun 29 18:43:43 2023 -0500

    changes to fix build

commit c9e1103da0d72fd39a36391ac4b5d941a133598a
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Thu Jun 29 18:20:07 2023 -0500

    Update ggml_v2-cuda-legacy.cu for ROCM

commit b858fc5db80ed545a6fbeae3d551bddb47955598
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date:   Thu Jun 29 17:49:39 2023 -0500…