Merge pull request #76 from ScoThunder/cpm
support cpm on xpu
Showing 8 changed files with 113 additions and 14 deletions.
@@ -0,0 +1,33 @@
### Model checkpoint download
[Model checkpoint download](../../benchmarks/cpm/README.md#模型checkpoint)
### Test dataset download
[Test dataset download](../../benchmarks/cpm/README.md#数据集)

### Kunlunxin XPU configuration and run information
#### Environment configuration
- ##### Hardware environment
  - Machine model: Kunlunxin AI accelerator group R480-X8
  - Accelerator card model: Kunlunxin AI accelerator card R300
  - Multi-node network type and bandwidth: InfiniBand, 200 Gb/s

- ##### Software environment
  - OS version: Ubuntu 20.04
  - OS kernel version: 5.4.0-26-generic
  - Accelerator card driver version: 4.0.25
  - Docker image and version: pytorch1.12.1-cpu-ubuntu18.04:v0.04
  - Training framework version: xmlir+e70db8f6
  - Dependency versions: pytorch-1.12.1+cpu


### Run results
| Training resources | Config file     | Run time (s) | Target accuracy | Converged accuracy | Steps | Throughput (samples/s) |
| ------------------ | --------------- | ------------ | --------------- | ------------------ | ----- | ---------------------- |
| 1 node, 1 card     | config_R300x1x1 |              |                 |                    |       |                        |
| 1 node, 2 cards    | config_R300x1x2 |              |                 |                    |       |                        |
| 1 node, 4 cards    | config_R300x1x4 |              |                 |                    |       |                        |
| 1 node, 8 cards    | config_R300x1x8 |              | 0.92            | 0.9235             | 632   |                        |
| 2 nodes, 8 cards   | config_R300x2x8 |              |                 |                    |       |                        |

### License

Apache 2.0 license.
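The config file names in the README table look like they encode the run topology. The sketch below parses them under the assumption that the pattern is `config_<accelerator>x<nodes>x<cards_per_node>`; that reading is inferred from the table rows, and `RunConfig` and `parse_config_name` are hypothetical helpers, not part of this commit.

```python
import re
from typing import NamedTuple


class RunConfig(NamedTuple):
    accelerator: str     # e.g. "R300"
    nodes: int           # number of machines
    cards_per_node: int  # accelerator cards used on each machine


# Assumed naming pattern: config_<accelerator>x<nodes>x<cards_per_node>
_CONFIG_NAME = re.compile(r"config_([A-Za-z0-9]+?)x(\d+)x(\d+)")


def parse_config_name(name: str) -> RunConfig:
    """Split a config file name into its assumed topology fields."""
    match = _CONFIG_NAME.fullmatch(name)
    if match is None:
        raise ValueError(f"unrecognized config name: {name!r}")
    accelerator, nodes, cards = match.groups()
    return RunConfig(accelerator, int(nodes), int(cards))


if __name__ == "__main__":
    cfg = parse_config_name("config_R300x1x8")
    print(cfg, "total cards:", cfg.nodes * cfg.cards_per_node)
```

If the assumption holds, `config_R300x1x8` is the single-node, 8-card run, the only row with reported accuracy and step count.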
@@ -0,0 +1,13 @@
from config_common import *

dist_backend = "xccl"

train_batch_size = 32
eval_batch_size = train_batch_size
max_steps = 4000
max_samples_termination = 4391260

warmup = 0.2
learning_rate = 0.0005

seed = 23333
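To make the interaction between these limits concrete, here is a rough sketch of the arithmetic. It rests on two assumptions that the diff does not confirm: that `warmup` is a fraction of `max_steps`, and that `train_batch_size` is a per-device batch size. The `Schedule` class is hypothetical and exists only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Schedule:
    # Values copied from the config file in this commit.
    train_batch_size: int = 32
    max_steps: int = 4000
    max_samples_termination: int = 4391260
    warmup: float = 0.2

    def warmup_steps(self) -> int:
        # Assumption: `warmup` is a fraction of `max_steps`.
        return round(self.warmup * self.max_steps)

    def samples_per_step(self, world_size: int) -> int:
        # Assumption: `train_batch_size` is per device.
        return self.train_batch_size * world_size

    def binding_limit(self, world_size: int) -> str:
        """Name the limit that would be reached first."""
        samples_at_max_steps = self.max_steps * self.samples_per_step(world_size)
        if samples_at_max_steps <= self.max_samples_termination:
            return "max_steps"
        return "max_samples_termination"


if __name__ == "__main__":
    schedule = Schedule()
    print(schedule.warmup_steps())
    print(schedule.samples_per_step(world_size=8))
    print(schedule.binding_limit(world_size=8))
```

Under those assumptions, an 8-card run would reach `max_steps` well before `max_samples_termination`.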
@@ -0,0 +1,7 @@
# DDP type: 'apex' or 'native'.
ddp_type: str = "native"

# disable fp16
fp16 = False

vendor = 'kunlunxin'
21 changes: 21 additions & 0 deletions
training/kunlunxin/cpm-pytorch/config/environment_variables.sh
@@ -0,0 +1,21 @@
# =================================================
# Export variables
# =================================================
set -x

export XMLIR_F_XPU_ENABLED_BOOL=true
export XMLIR_TORCH_XCCL_ENABLED=true

##===----------------------------------------------------------------------===##
## R480 config
##===----------------------------------------------------------------------===##

# BKCL
topo_file=${WORKSPACE-"."}/topo.txt
touch "$topo_file"
export XPUSIM_TOPOLOGY_FILE=$(readlink -f "$topo_file")

## workaround due to ccix bug
export BKCL_CCIX_RING="1"
export ALLREDUCE_ASYNC="0"
export ALLREDUCE_FUSION="0"
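For readers checking what the topology-file lines are meant to do, the following is a Python sketch of the apparent intent: default the workspace to `.` when `WORKSPACE` is unset, make sure `topo.txt` exists there, and produce an absolute path for `XPUSIM_TOPOLOGY_FILE`. The function `resolve_topology_file` is a hypothetical name, and `Path.resolve()` is used only as a rough equivalent of `readlink -f`.

```python
import os
from pathlib import Path


def resolve_topology_file(env=None) -> Path:
    """Create topo.txt under the workspace and return its absolute path."""
    env = os.environ if env is None else env
    # ${WORKSPACE-"."} substitutes "." only when WORKSPACE is unset,
    # which matches dict.get with a default.
    workspace = env.get("WORKSPACE", ".")
    topo_file = Path(f"{workspace}/topo.txt")
    topo_file.touch(exist_ok=True)  # like `touch`: create if missing
    return topo_file.resolve()      # roughly `readlink -f`


if __name__ == "__main__":
    print(resolve_topology_file())
```

Passing a plain dict instead of `os.environ` makes the lookup easy to exercise without changing the real environment.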
@@ -0,0 +1,25 @@
from model.models import gpt2_get_params_for_weight_decay_optimization

from torch import nn
from torch.optim import Optimizer
from typing import Tuple
from driver.dist_pytorch import main_proc_print


def convert_model(config, model: nn.Module) -> nn.Module:
    return model


def create_optimizer(config, model):
    param_groups = gpt2_get_params_for_weight_decay_optimization(model)
    from torch.optim import Adam
    optimizer = Adam(param_groups,
                     lr=config.learning_rate,
                     weight_decay=config.weight_decay_rate)

    return optimizer


def model_to_fp16(config, model: nn.Module,
                  optimizer: Optimizer) -> Tuple[nn.Module, Optimizer]:
    return model, optimizer