[model] Add deepseek model. #274
Conversation
Force-pushed from 3b6cdca to 24f96d9
Force-pushed from 24f96d9 to d5d5e6c
Force-pushed from d5d5e6c to 4bfe3d9
  config["llama"]["layernorm_eps"] = str(hf_config.get("rms_norm_eps", 1e-6))
  config["llama"]["layernorm_type"] = "pre_layernorm"
- config["llama"]["activation_type"] = "silu"
+ config["llama"]["activation_type"] = str(hf_config["hidden_act"])
Separate the LLaMa and deepseek code. Try not to touch the llama code; deepseek-specific changes can go into new files, which should reuse the llama code where possible.
Same issue as before: deepseek's model architecture simply reuses llama's "LlamaForCausalLM", so I'd recommend reusing the llama code and adding support for llama's other RoPE types.
  private:
-     static bool initialized;
+     bool initialized = false;
This needs to be static. As written, each instance gets its own identical copy of the sin and cos tables, but one model only needs a single set of sin and cos.
The memory that sin / cos point to is only initialized the first time. These buffers are maintained by the ctx context's memory pool.

emb_cos = ctx->getBuffer(emb_cos_str, max_position_embeddings * inv_freq_size);
emb_sin = ctx->getBuffer(emb_sin_str, max_position_embeddings * inv_freq_size);
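To make the shared-buffer argument concrete, here is a minimal C++ sketch of the pattern this reply describes. RopeContext, the firstUse out-parameter, and the fill loop are illustrative assumptions, not xFasterTransformer's actual API; the point is that every rope instance resolves the same named buffer from the context's pool, so the sin/cos tables exist once per model and are filled only by the call that first allocates them.

#include <cmath>
#include <cstddef>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for the ctx memory pool from the reply above: the
// first getBuffer() call for a given name allocates storage, later calls
// return the same pointer, and firstUse tells the caller which case it hit.
struct RopeContext {
    std::unordered_map<std::string, std::vector<float>> pool;
    float *getBuffer(const std::string &name, size_t size, bool &firstUse) {
        auto it = pool.find(name);
        firstUse = (it == pool.end());
        if (firstUse) it = pool.emplace(name, std::vector<float>(size)).first;
        return it->second.data();
    }
};

struct RotaryEmbedding {
    float *emb_cos = nullptr;
    float *emb_sin = nullptr;

    // Many RotaryEmbedding instances (one per layer) may call init(), but the
    // tables are computed once: only the allocating call runs the fill loop.
    void init(RopeContext *ctx, const std::vector<float> &inv_freq,
              size_t max_position_embeddings) {
        size_t inv_freq_size = inv_freq.size();
        bool firstCos = false, firstSin = false;
        emb_cos = ctx->getBuffer("emb_cos", max_position_embeddings * inv_freq_size, firstCos);
        emb_sin = ctx->getBuffer("emb_sin", max_position_embeddings * inv_freq_size, firstSin);
        if (!firstCos && !firstSin) return;  // already filled by another instance
        for (size_t i = 0; i < max_position_embeddings; i++) {
            for (size_t j = 0; j < inv_freq_size; j++) {
                float tmp = i * inv_freq[j];
                emb_cos[i * inv_freq_size + j] = std::cos(tmp);
                emb_sin[i * inv_freq_size + j] = std::sin(tmp);
            }
        }
    }
};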
  private:
-     static bool initialized;
+     bool initialized = false;
This needs to be static. As written, each instance gets its own identical copy of the sin and cos tables, but one model only needs a single set of sin and cos.
Same as above.
  for (size_t j = 0; j < inv_freq_size; j++) {
-     float tmp = i * inv_freq[j];
+     float tmp = i * inv_freq[j] / this->scaling_factor;
Separate the LLaMa and deepseek code. Try not to touch the llama code; deepseek-specific changes can go into new files that reuse the llama code. Here you could create a new deepseek rope file.
deepseek takes the same approach as Yi and directly reuses the llama model architecture; LinearScaling RoPE is likewise implemented inside the llama model.
config.json: https://huggingface.co/deepseek-ai/deepseek-coder-33b-instruct/blob/main/config.json#L3
LinearScaling rope: https://github.com/huggingface/transformers/blob/main/src/transformers/models/llama/modeling_llama.py#L148-L155
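As a reference for what the diff above does, here is a small self-contained C++ sketch of linear-scaling RoPE table construction; the function and variable names are illustrative, not the project's actual code. Each position index is divided by scaling_factor before the angle is computed, which is the only difference from vanilla RoPE and mirrors the linked transformers implementation.

#include <cmath>
#include <cstddef>
#include <vector>

// Build rotary sin/cos tables with linear position scaling. With
// scaling_factor == 1.0f this reduces to vanilla RoPE; with factor s, a
// model trained on N positions can address roughly N * s positions.
void buildLinearScalingRopeTables(const std::vector<float> &inv_freq,
                                  size_t max_position_embeddings,
                                  float scaling_factor,
                                  std::vector<float> &emb_cos,
                                  std::vector<float> &emb_sin) {
    size_t inv_freq_size = inv_freq.size();
    emb_cos.resize(max_position_embeddings * inv_freq_size);
    emb_sin.resize(max_position_embeddings * inv_freq_size);
    for (size_t i = 0; i < max_position_embeddings; i++) {
        for (size_t j = 0; j < inv_freq_size; j++) {
            // Same line as the diff: the position is down-scaled.
            float tmp = i * inv_freq[j] / scaling_factor;
            emb_cos[i * inv_freq_size + j] = std::cos(tmp);
            emb_sin[i * inv_freq_size + j] = std::sin(tmp);
        }
    }
}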
Done. This scaling_factor is a LLaMa param.
benchmark/benchmark.py
Outdated
| if "chatglm3" in args.model_name.lower(): | ||
| model_prompt = prompt_pool["chatglm3"] | ||
| if "llama" in args.model_name.lower(): | ||
| if "llama" in args.model_name.lower() or "deepseek" in args.model_name.lower(): |
Use a separate if for deepseek.
deepseek is identical to llama at the architecture level; I'd suggest reusing the llama path.
Added the if.
Force-pushed from 3c30778 to 4730f04
README.md
Outdated
  Supported model convert list:
  - LlamaConvert
+ - DeepseekConvert
Move it further down the list.
src/xfastertransformer/__init__.py
Outdated
| "automodel": ["AutoModel"], | ||
| "tools": [ | ||
| "LlamaConvert", | ||
| "DeepseekConvert", |
Move it further down the list.
No description provided.