Requesting Qwen-7B Support #2528

Closed
aiaicode opened this issue Aug 5, 2023 · 11 comments

Comments

@aiaicode

aiaicode commented Aug 5, 2023

https://huggingface.co/Qwen/Qwen-7B

https://huggingface.co/Qwen/Qwen-7B-Chat

These two models are outperforming 13B models, and on C-Eval they beat ChatGPT.

Requesting model support in llama.cpp.

@vonjackustc

Qwen is similar to the Llama model.
You need to add bias parameters for QKV and change the tokenizer to tiktoken.

I wonder if we can convert the tiktoken vocab format to SentencePiece directly.
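For what it's worth, such a conversion would start from tiktoken's plain-text vocab table: one base64-encoded token per line followed by its integer rank. A minimal parsing sketch (the inline sample string is made up for illustration; a real converter would then have to re-express the byte-level BPE merges in the target format):

```python
import base64

def load_tiktoken_vocab(text):
    """Parse tiktoken's plain-text vocab format: one base64-encoded
    token per line, followed by its integer rank."""
    ranks = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        token_b64, rank = line.split()
        ranks[base64.b64decode(token_b64)] = int(rank)
    return ranks

# Tiny inline sample in the same shape as Qwen's qwen.tiktoken file.
sample = "IQ== 0\nIg== 1\nIw== 2"
vocab = load_tiktoken_vocab(sample)
print(vocab)  # {b'!': 0, b'"': 1, b'#': 2}
```

Turning this rank table into a SentencePiece model is the non-trivial part, since tiktoken stores byte-level merge ranks rather than piece scores.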

@vonjackustc

https://huggingface.co/JosephusCheung/Qwen-LLaMAfied-7B-Chat
This repo from JosephusCheung can be converted into ggml format. You still need to change the EOS token in the ggml source code (it is 2 for Llama but a different id for Qwen).

I think this model is still slightly different from the original Qwen, especially in QwenAttentionBlock: Qwen applies log(n) scaling to the query.
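The log(n) scaling mentioned above can be sketched in a few lines. This is a simplified, pure-Python illustration (`logn_scale_query` and the default `train_len` are hypothetical names/values; the real model applies the scaling per attention head on tensors):

```python
import math

def logn_scale_query(q, position, train_len=2048):
    # Sketch of Qwen-style log-n attention scaling: queries at positions
    # beyond the training context length are scaled by
    # log(position) / log(train_len), which helps keep attention entropy
    # roughly stable as the context grows past what was seen in training.
    if position <= train_len:
        return q
    scale = math.log(position) / math.log(train_len)
    return [x * scale for x in q]

print(logn_scale_query([1.0, 2.0], 1024))  # within train_len: unchanged
print(logn_scale_query([1.0, 2.0], 4096))  # scaled by log(4096)/log(2048)
```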

@wtarreau
Contributor

Sadly I couldn't convert it; I always get cryptic Python errors.

@arch-btw
Contributor

@aiaicode @vonjackustc @wtarreau

#3337

@KerfuffleV2
Collaborator

I found this script for converting a tiktoken vocab to HF format: https://gist.github.com/xenova/a452a6474428de0182b17605a98631ee (I didn't test it, but it looks reasonable and seems to be from an HF person.)

To actually use it, you'll also need to use #3743 since there wasn't already support for loading from merges.txt.

@TheBloke too since I assume you're looking to support these Qwen models.

@ggerganov
Owner

Is there a blocker to supporting these models in llama.cpp? It would be nice to support the new 1.8B and 72B versions.

@simveit

simveit commented Dec 1, 2023

https://github.com/QwenLM/qwen.cpp
Could this maybe be of any help in supporting Qwen in the future?

@choyakawa

choyakawa commented Dec 1, 2023

We should first support QKVO bias in Llama.
It's in the HF Llama config.json: "attention_bias": true
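Assuming the usual HF and GGUF tensor-naming conventions, a converter that honors `attention_bias` would have to map the extra bias tensors alongside the weights, along these lines (a sketch only; `map_tensor` and the exact name table are illustrative, so verify against the actual converter script):

```python
import re

# When "attention_bias" is true, the HF checkpoint carries bias tensors
# next to the usual Q/K/V/O projection weights. These must be mapped to
# the corresponding GGUF names as well.
HF_TO_GGUF = {
    "self_attn.q_proj.weight": "attn_q.weight",
    "self_attn.q_proj.bias":   "attn_q.bias",
    "self_attn.k_proj.weight": "attn_k.weight",
    "self_attn.k_proj.bias":   "attn_k.bias",
    "self_attn.v_proj.weight": "attn_v.weight",
    "self_attn.v_proj.bias":   "attn_v.bias",
    "self_attn.o_proj.weight": "attn_output.weight",
    "self_attn.o_proj.bias":   "attn_output.bias",
}

def map_tensor(hf_name):
    """Map an HF layer tensor name to its GGUF-style name."""
    m = re.match(r"model\.layers\.(\d+)\.(.+)", hf_name)
    layer, suffix = m.group(1), m.group(2)
    return f"blk.{layer}." + HF_TO_GGUF[suffix]

print(map_tensor("model.layers.0.self_attn.q_proj.bias"))  # blk.0.attn_q.bias
```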

@choyakawa

There was an implementation here, but it failed; can anyone figure out what's wrong with it?
#3743 (comment)
It seems correct to just add bias terms to each part.

@ggerganov
Owner

  • The bias tensors were not offloaded via a separate cb() call
  • Seems like special tokens (<|im_start|>, <|im_end|>) were not escaped in their test?
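For context on the escaping point: Qwen-Chat uses the ChatML prompt format, and the `<|im_start|>` / `<|im_end|>` markers only work if the tokenizer treats them as special tokens rather than plain text (otherwise they get split into many subword pieces). A minimal sketch of building such a prompt (`format_chatml` is a hypothetical helper):

```python
def format_chatml(messages):
    """Build a ChatML-style prompt of the kind Qwen-Chat expects.

    <|im_start|> and <|im_end|> must be registered as special tokens in
    the tokenizer; minimal sketch, omitting chat-template edge cases.
    """
    parts = [f"<|im_start|>{role}\n{content}<|im_end|>"
             for role, content in messages]
    parts.append("<|im_start|>assistant\n")  # cue the model to respond
    return "\n".join(parts)

prompt = format_chatml([("system", "You are helpful."), ("user", "Hi")])
print(prompt)
```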

@DavidGOrtega

Should this be closed by #5037?
