PaddlePaddle Hackathon 56 submission #1088

JunnYu · 2021-09-24T10:27:12Z

Task: #1074

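The weights and related files are uploaded to Baidu Cloud. Link: https://pan.baidu.com/s/1dyaOShLEgnL_RJW4sbs40g, extraction code: kayr.
Some layernorm layers in the model did not have eps=1e-5 set.
Removed the name attribute from paddle.ParamAttr in GPTEmbeddings. With it set, an error reports that the same name is used more than once, so the unit tests fail and the model cannot be initialized repeatedly in a Jupyter notebook.
Added compare.py to compare the predictions of the converted models. (The file's location will be changed after the review passes.)
Added the conversion script convert.py. (The file's location will be changed after the review passes.)
The error after converting microsoft-DialoGPT-small is somewhat large; the cause is probably similar to the earlier one. The other two models show normal error after conversion. (The same conversion code was used for all of them, so the conversion code itself is not at fault.)
Added two classes, GPTForTokenClassification and GPTForSequenceClassification, with comments.
Added unit tests: TestElectraForSequenceClassification, TestGPTForTokenClassification.
The conversion code does not add lm_head.weight, because it is tied to the word embedding and therefore does not need to be converted. Modify the script yourself if you need it.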
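
The point about lm_head.weight can be illustrated with a small NumPy sketch. It is not the PR's convert.py or compare.py: the parameter names, the helpers `convert_state_dict`, `lm_logits`, and `max_abs_diff`, and the toy sizes are all assumptions made for illustration. It shows that when the output head shares its matrix with the word embedding, skipping the head during conversion loses nothing, and that a maximum absolute difference is one simple way to compare outputs.

```python
import numpy as np

# Hypothetical parameter names and sizes, chosen only for this illustration.
rng = np.random.default_rng(0)
vocab_size, hidden_size = 10, 4
word_embedding = rng.normal(size=(vocab_size, hidden_size))

source_state = {
    "embeddings.word_embeddings.weight": word_embedding,
    "lm_head.weight": word_embedding,  # tied: same matrix as the embedding
}


def convert_state_dict(state, skip_keys=("lm_head.weight",)):
    """Copy every tensor except the tied keys listed in skip_keys."""
    return {
        name: np.array(value)
        for name, value in state.items()
        if name not in skip_keys
    }


def lm_logits(hidden, embedding):
    """Tied output head: project hidden states with the embedding matrix."""
    return hidden @ embedding.T


def max_abs_diff(a, b):
    """Largest element-wise absolute difference between two outputs."""
    return float(np.max(np.abs(a - b)))


converted = convert_state_dict(source_state)
hidden = rng.normal(size=(2, 3, hidden_size))

# Reference logits use the original head; rebuilt logits use only the
# converted embedding, which is all the converted checkpoint contains.
reference = lm_logits(hidden, source_state["lm_head.weight"])
rebuilt = lm_logits(hidden, converted["embeddings.word_embeddings.weight"])
error = max_abs_diff(reference, rebuilt)
print(sorted(converted), error)
```

In a real comparison the two sides would come from the original and the converted model, and the acceptable error threshold is a judgment call that this sketch does not settle.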

paddlenlp/transformers/gpt/modeling.py

yingyibiao · 2021-09-27T11:46:14Z

Please upload merges.txt via Baidu Cloud as well.

JunnYu · 2021-09-27T13:03:58Z

@yingyibiao The Baidu Netdisk share already includes merges.txt. All files under community/junnyu have now been deleted.

yingyibiao · 2021-10-13T11:01:56Z

The relevant files still need to be uploaded under community/junnyu. For details, see:
https://paddlenlp.readthedocs.io/zh/latest/community/contribute_models/contribute_awesome_pretrained_models.html

yingyibiao · 2021-10-18T04:39:49Z

The weights have been uploaded to BOS.

yingyibiao · 2021-10-18T04:41:20Z

Please fix the similar issues here, following the review comments on #1085.

community/junnyu/uer-gpt2-chinese-poem/README.md

community/junnyu/microsoft-DialoGPT-small/README.md

yingyibiao · 2021-10-18T12:13:58Z

While you are at it, please also import the DialoGPT-medium and DialoGPT-large weights.

yingyibiao · 2021-10-19T03:30:56Z

https://github.com/PaddlePaddle/PaddleNLP/blob/develop/docs/model_zoo/transformers.rst
This file also needs to be updated accordingly.

ZHUI

Very nice!
Could some practical usage examples for GPTForTokenClassification and GPTForSequenceClassification also be added to https://github.com/PaddlePaddle/PaddleNLP/blob/develop/examples/language_model/gpt/ ?

paddlenlp/transformers/gpt/modeling.py

ZHUI · 2021-10-19T08:45:11Z

paddlenlp/transformers/gpt/modeling.py

+        return logits
+
+
+class GPTForSequenceClassification(GPTPretrainedModel):


GPTForTokenClassification
GPTForSequenceClassification
Could an example for these two classes be added to https://github.com/PaddlePaddle/PaddleNLP/blob/develop/examples/language_model/gpt/ ?

GPTForTokenClassification is essentially the same as BertForTokenClassification, except that the model does not take token type ids as input.
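
To make the shape of that interface concrete, here is a toy NumPy sketch rather than the PaddleNLP implementation. The function `toy_token_classifier`, the embedding lookup standing in for the GPT backbone, and all sizes are assumptions for illustration. The only input is `input_ids`; there is no token type id argument, and every position gets its own class logits.

```python
import numpy as np

rng = np.random.default_rng(0)
vocab_size, hidden_size, num_classes = 50, 8, 7

# Toy stand-ins: an embedding lookup plays the role of the backbone,
# and a single linear layer plays the role of the classification head.
embedding = rng.normal(size=(vocab_size, hidden_size))
classifier_weight = rng.normal(size=(hidden_size, num_classes))
classifier_bias = np.zeros(num_classes)


def toy_token_classifier(input_ids):
    """Map token ids to per-token class logits; no token type ids needed."""
    hidden = embedding[input_ids]  # [batch, seq_len, hidden]
    return hidden @ classifier_weight + classifier_bias


input_ids = np.array([[1, 5, 9, 2], [3, 3, 7, 0]])
logits = toy_token_classifier(input_ids)  # [batch, seq_len, num_classes]
predictions = logits.argmax(axis=-1)      # one label id per token
print(logits.shape, predictions.shape)
```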

GPTForSequenceClassification is now done.

An NER example using GPTForTokenClassification has now been added; the results turn out to be poor.

JunnYu · 2021-10-22T11:14:49Z

JunnYu · 2021-10-22T11:35:20Z

@yingyibiao
The large and medium weights are here.
Link: https://pan.baidu.com/s/1cdqfjHVrE6CwBvYW5692rg
Extraction code: 83jh

docs/model_zoo/transformers.rst

yingyibiao · 2021-10-24T07:18:58Z

Uploaded.

yingyibiao · 2021-10-24T09:08:49Z

LGTM for community related files

ZHUI

LGTM

ZHUI · 2021-11-01T10:45:43Z

examples/language_model/gpt/README.md

+Precision                     | 0.484939    |
+Recall                        | 0.634716    |
+F1                            | 0.549810    |
+


These results seem somewhat low.
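
As a quick consistency check on the table in the diff, the reported F1 can be recomputed from the reported precision and recall with the standard harmonic-mean formula. The helper `f1_score` below is written for this note and is not taken from the example script in the PR.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall; 0.0 if both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Values copied from the README table in the diff above.
precision = 0.484939
recall = 0.634716
f1 = f1_score(precision, recall)
print(f"{f1:.6f}")
```

The recomputed value agrees with the F1 in the table to within rounding, so the three numbers are at least internally consistent.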

ZeyuChen · 2021-11-02T06:56:43Z

examples/language_model/gpt/README.md

+
+After fine-tuning `gpt-cpm-small-cn-distill` on the MSRA NER task, the results on the validation set are as follows:
+
+ Metric                       | Result      |


Is there any paper supporting the claim that GPT can handle token classification? @ZHUI

yingyibiao

LGTM

JunnYu added 2 commits September 24, 2021 18:17

update

cf2d21c

Merge branch 'develop' into update_gpt

00cf50a

JunnYu mentioned this pull request Sep 24, 2021

【PaddlePaddle Hackathon】任务总览 PaddlePaddle/Paddle#35940

Closed

JunnYu added 2 commits September 24, 2021 18:29

update

f4827d0

Merge branch 'update_gpt' of github.com:JunnYu/PaddleNLP into update_gpt

8e6bf70

ZHUI self-requested a review September 26, 2021 02:36

ZHUI requested changes Sep 26, 2021

paddlenlp/transformers/gpt/modeling.py Outdated

ZeyuChen added the Hackathon label Sep 26, 2021

remove community/junnyu

ae4b490

yingyibiao assigned ZHUI Sep 28, 2021

Update test_modeling.py

5b66fa2

JunnYu added 3 commits October 13, 2021 23:40

Merge branch 'develop' into update_gpt

4b23a50

suggestion from ZHUI

540dfe0

add community/junnyu

c156b0a

JunnYu requested a review from ZHUI October 13, 2021 16:25

rm gpt link

68b33cd

yingyibiao reviewed Oct 18, 2021

community/junnyu/uer-gpt2-chinese-poem/README.md Outdated

community/junnyu/microsoft-DialoGPT-small/README.md Outdated

Merge branch 'develop' into update_gpt

563aee0

ZHUI reviewed Oct 19, 2021

yingyibiao and others added 3 commits October 20, 2021 19:03

Merge branch 'develop' into update_gpt

bf8ea3d

Merge branch 'develop' into update_gpt

8f5c136

update

52b5d0c

JunnYu added 2 commits October 22, 2021 19:19

update readme

d2720ac

add large medium

aa6f747

JunnYu added 2 commits October 22, 2021 19:40

update the weight count

c9fcf6b

update gpt compare

94f45e9

yingyibiao reviewed Oct 24, 2021

docs/model_zoo/transformers.rst Outdated

docs/model_zoo/transformers.rst Outdated

update docs

23b5469

yingyibiao requested a review from ZHUI October 25, 2021 01:55

yingyibiao and others added 3 commits October 25, 2021 11:04

Merge branch 'develop' into update_gpt

fd8c51b

add msra ner example

3f74bb9

Merge branch 'develop' into update_gpt

2e430d3

ZHUI approved these changes Nov 1, 2021

yingyibiao and others added 2 commits November 1, 2021 18:56

Merge branch 'develop' into update_gpt

e8d23a1

Merge branch 'develop' into update_gpt

cbe10ac

ZeyuChen reviewed Nov 2, 2021

Merge branch 'develop' into update_gpt

5139361

yingyibiao approved these changes Nov 9, 2021

yingyibiao merged commit 844f24c into PaddlePaddle:develop Nov 9, 2021

JunnYu deleted the update_gpt branch November 9, 2021 08:32
