py : minor fixes #5668

ggerganov · 2024-02-22T17:23:52Z

close #5667

Also add comments for a seemingly wrong norm for Orion models

cebtenzzre · 2024-02-22T18:04:41Z

This does successfully convert persimmon. Two things I find strange about persimmon:

The BOS and EOS tokens are the same token, which maps to |ENDOFTEXT|, and add_bos_token is explicitly set to true in tokenizer_config.json.
Attempting to run it with the CUDA backend fails an assertion in ggml_cuda_rope: ggml_is_contiguous(src0).
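
For anyone wanting to check the first observation on their own copy of a model, here is a minimal sketch. It is not part of this PR or of the conversion script. The helper names `token_text` and `check_special_tokens` are hypothetical, and the example dict is made up to mirror the situation described above rather than copied from the real persimmon tokenizer_config.json. It assumes the special tokens are stored either as plain strings or as objects with a "content" field.

```python
import json

def token_text(value):
    """Return the token string for a plain string or a {"content": ...} object."""
    if isinstance(value, dict):
        return value.get("content")
    return value

def check_special_tokens(config):
    """Summarize BOS/EOS settings from a parsed tokenizer_config.json dict."""
    bos = token_text(config.get("bos_token"))
    eos = token_text(config.get("eos_token"))
    return {
        "bos": bos,
        "eos": eos,
        # True only when both tokens are present and identical
        "same_token": bos is not None and bos == eos,
        "add_bos_token": bool(config.get("add_bos_token", False)),
    }

# Hypothetical config contents, for illustration only
example = json.loads(
    '{"bos_token": "|ENDOFTEXT|", "eos_token": "|ENDOFTEXT|", "add_bos_token": true}'
)
report = check_special_tokens(example)
print(report)
```

To run it against a real model, replace `example` with the result of `json.load` on that model's tokenizer_config.json.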

ggerganov · 2024-02-22T18:13:51Z

The Orion graph needs to be revisited - I think it is doing a lot of unnecessary stuff

py : minor fixes

56c0471

ggerganov requested a review from cebtenzzre February 22, 2024 17:23

cebtenzzre approved these changes Feb 22, 2024


ggerganov merged commit 5a9e2f6 into master Feb 22, 2024
24 of 26 checks passed

cebtenzzre mentioned this pull request Mar 1, 2024

persimmon crashes with CUDA: assertion failure ggml_is_contiguous(src0) #5823

Open

jordankanter pushed a commit to jordankanter/llama.cpp that referenced this pull request Mar 13, 2024

py : minor fixes (ggerganov#5668)

437b419

hodlen pushed a commit to hodlen/llama.cpp that referenced this pull request Apr 1, 2024

py : minor fixes (ggerganov#5668)

4482803
