add Bert model && case readme #47

Merged on Apr 20, 2023 (2 commits)
Changes from 1 commit
update links
zhouyu committed Mar 30, 2023
commit 564f1c7b90706361500117c4fa28f1bea0bfa9e2
11 changes: 4 additions & 7 deletions training/benchmarks/bert/README.md
@@ -4,13 +4,13 @@
BERT stands for Bidirectional Encoder Representations from Transformers. Unlike recent language representation models, BERT is designed to pre-train deep bidirectional representations from unlabeled text by jointly conditioning on both left and right context in all layers. As a result, the pre-trained BERT model can be fine-tuned with just one additional output layer to create state-of-the-art models for a wide range of tasks, such as question answering and language inference, without substantial task-specific architecture modifications.
BERT is conceptually simple and empirically powerful. It obtains new state-of-the-art results on eleven natural language processing tasks, including pushing the GLUE score to 80.5% (7.7% point absolute improvement), MultiNLI accuracy to 86.7% (4.6% absolute improvement), SQuAD v1.1 question answering Test F1 to 93.2 (1.5 point absolute improvement) and SQuAD v2.0 Test F1 to 83.1 (5.1 point absolute improvement).

Please refer to this paper for a detailed description of BERT:
[BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding](https://arxiv.org/abs/1810.04805)
Please refer to this paper for a detailed description of BERT:
[BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding](https://arxiv.org/abs/1810.04805)


### Model Code Source
[Bert MLPerf](https://github.com/mlcommons/training_results_v1.0/tree/master/NVIDIA/benchmarks/bert/implementations)


### Model Checkpoint Download

@@ -87,7 +87,4 @@ python3 pick_eval_samples.py \
### Framework and Chip Support
| | Pytorch |Paddle|TensorFlow2|
| ---- | ---- | ---- | ---- |
| Nvidia GPU | N/A |✅ |N/A|



| Nvidia GPU | N/A |[✅](../../nvidia/bert-paddle/README.md) |N/A|
11 changes: 5 additions & 6 deletions training/nvidia/bert-paddle/README.md
@@ -42,18 +42,18 @@ export LOCAL_RANK=0
In this directory:

```
python run_pretraining.py
python run_pretraining.py
--data_dir data_path
--extern_config_dir config_path
--extern_config_file config_file.py
```

example:
```
python run_pretraining.py
--data_dir /ssd2/yangjie40/data_config
--extern_config_dir /ssd2/yangjie40/flagperf/training/nvidia/bert-pytorch/config
--extern_config_file config_A100x1x2.py
python run_pretraining.py
--data_dir /ssd2/yangjie40/data_config
--extern_config_dir /ssd2/yangjie40/flagperf/training/nvidia/bert-pytorch/config
--extern_config_file config_A100x1x2.py
```
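
Note: the example above shows the command split across several lines; when pasted into a shell it needs explicit line continuations (or must be joined onto one line). A minimal sketch with backslash continuations, reusing the paths and config file from the example above (assumptions: adjust the data and config locations to your environment):

```
# Sketch only: the same invocation as the example above, with explicit
# line continuations so the shell treats it as a single command.
# Replace the data/config paths with your own locations.
python run_pretraining.py \
    --data_dir /ssd2/yangjie40/data_config \
    --extern_config_dir /ssd2/yangjie40/flagperf/training/nvidia/bert-pytorch/config \
    --extern_config_file config_A100x1x2.py
```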


@@ -79,7 +79,6 @@ python run_pretraining.py
| 1 node × 2 GPUs | config_A100x1x2 | N/A | 0.67 | N/A | N/A | N/A |
| 1 node × 4 GPUs | config_A100x1x4 | 1715.28 | 0.67 | 0.6809 | 6250 | 180.07 |
| 1 node × 8 GPUs | config_A100x1x8 | 1315.42 | 0.67 | 0.6818 | 4689 | 355.63 |
| 2 nodes × 8 GPUs | config_A100x2x8 | N/A | 0.67 | N/A | N/A | N/A |

### License
