support cpm on xpu #76
@@ -0,0 +1,19 @@
from config_common import *

dist_backend = "xccl"
If the 1*1 configuration was not actually run and the run-results column has no data for it, this config does not need to be provided.
@@ -0,0 +1,21 @@
from config_common import *
Same as above.
@@ -0,0 +1,19 @@
from config_common import *
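Same as above. If the 1*4 configuration was not actually run, it does not need to be included here. The main concern is that an unverified config can easily behave differently from what is expected.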
    return model


def remap_attn_parameters(model_dict):
Is this function still needed?
dist_backend = "xccl"

use_env = True
target_embedding_average = 0.92
target_embedding_average and gradient_accumulation_steps have the same values as in _base.py, so they can be removed.
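One way to spot this kind of redundancy is to compare an override config against the base defaults. The sketch below is only an illustration: the helper `redundant_overrides`, the two dictionaries, and the `gradient_accumulation_steps` value are hypothetical stand-ins, not the actual contents of _base.py or of this PR's config files.

```python
from typing import Any, Dict, List


def redundant_overrides(base: Dict[str, Any], override: Dict[str, Any]) -> List[str]:
    """Return the names in `override` whose value equals the base default."""
    return sorted(
        name
        for name, value in override.items()
        if name in base and base[name] == value
    )


# Hypothetical stand-in for the defaults in _base.py (values are assumed).
base_config = {
    "target_embedding_average": 0.92,
    "gradient_accumulation_steps": 1,
}

# Hypothetical stand-in for a vendor config that restates both defaults.
vendor_config = {
    "dist_backend": "xccl",
    "use_env": True,
    "target_embedding_average": 0.92,
    "gradient_accumulation_steps": 1,
}

# Names listed here only repeat the base value and could be dropped.
print(redundant_overrides(base_config, vendor_config))
```

Settings that exist only in the vendor config, such as dist_backend, are not reported, because there is no base value to compare them with.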
warmup = 0.2
learning_rate = 0.0005

beta_1: float = 0.9
The values of beta_1, beta_2, and eps are unchanged, so they can be removed.
No description provided.