[Open Source Internship] Fine-tuning of the blenderbot_small model #1980
Merged
This PR implements fine-tuning experiments for the blenderbot_small model on the "stanfordnlp/coqa" and "google/Synthetic-Persona-Chat" datasets.
Because the results on the coqa dataset were not good, an additional fine-tuning run was done on Persona-Chat, which turned out fairly well; both are uploaded here for reference.
Task link: https://gitee.com/mindspore/community/issues/IAUPE8
The transformers + PyTorch + 3090 benchmark was written by myself; the repository is at https://github.com/outbreak-sen/blenderbot_small_finetuned
The changed code is located at llm/finetune/blenderbot_small and contains only the mindnlp + mindspore implementation.
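For reference, a minimal sketch of what the transformers + PyTorch baseline setup looks like is shown below. The toy data and hyperparameters are illustrative only, not the exact values used in the benchmark; the actual scripts are in the two repositories linked above, and the mindnlp version follows the same structure since mindnlp mirrors the transformers API.

```python
# Minimal seq2seq fine-tuning sketch for blenderbot_small-90M with
# transformers + PyTorch. Toy data and hyperparameters are illustrative only.
from datasets import Dataset
from transformers import (
    AutoTokenizer,
    BlenderbotSmallForConditionalGeneration,
    DataCollatorForSeq2Seq,
    Seq2SeqTrainer,
    Seq2SeqTrainingArguments,
)

model_name = "facebook/blenderbot_small-90M"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = BlenderbotSmallForConditionalGeneration.from_pretrained(model_name)

# Toy dialogue pairs; the real runs build these from coqa / Synthetic-Persona-Chat.
raw = Dataset.from_dict({
    "source": ["hello, how are you today?", "what do you like to do on weekends?"],
    "target": ["i am doing great, thanks for asking.", "i enjoy hiking and reading."],
})

def tokenize(batch):
    model_inputs = tokenizer(batch["source"], max_length=128, truncation=True)
    labels = tokenizer(text_target=batch["target"], max_length=128, truncation=True)
    model_inputs["labels"] = labels["input_ids"]
    return model_inputs

tokenized = raw.map(tokenize, batched=True, remove_columns=raw.column_names)

args = Seq2SeqTrainingArguments(
    output_dir="./blenderbot_small_finetuned",
    per_device_train_batch_size=8,
    learning_rate=5e-5,
    num_train_epochs=3,
    logging_steps=50,
)

trainer = Seq2SeqTrainer(
    model=model,
    args=args,
    train_dataset=tokenized,
    data_collator=DataCollatorForSeq2Seq(tokenizer, model=model),
)
trainer.train()
trainer.save_model()  # writes the fine-tuned weights to output_dir
```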
The experimental results are as follows.
Blenderbot_Small coqa fine-tuning
Hardware
Resource spec: NPU: 1*Ascend-D910B (memory: 64 GB), CPU: 24 cores, RAM: 192 GB
AI computing center: Wuhan AI Computing Center
Image: mindspore_2_5_py311_cann8
Torch training hardware: Nvidia 3090
Model and dataset
Model: "facebook/blenderbot_small-90M"
Dataset: "stanfordnlp/coqa"
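Each coqa row carries a story plus parallel lists of questions and answers, so it has to be flattened into (passage + question → answer) pairs before seq2seq fine-tuning. The sketch below illustrates one way to do that; it is an illustration of the idea only, and the exact preprocessing is in llm/finetune/blenderbot_small.

```python
# Sketch: turning stanfordnlp/coqa into (story + question -> answer) pairs.
from datasets import load_dataset

coqa = load_dataset("stanfordnlp/coqa", split="train")

def to_pairs(batch):
    sources, targets = [], []
    for story, questions, answers in zip(
        batch["story"], batch["questions"], batch["answers"]
    ):
        # Each row is a whole conversation: one question list plus the
        # free-form answers in answers["input_text"].
        for question, answer in zip(questions, answers["input_text"]):
            sources.append(f"{story} {question}")
            targets.append(answer)
    return {"source": sources, "target": targets}

# batched=True lets one conversation row expand into many training pairs.
pairs = coqa.map(to_pairs, batched=True, remove_columns=coqa.column_names)
```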
Training loss
Evaluation loss
Blenderbot_Small Synthetic-Persona-Chat fine-tuning
Hardware
Resource spec: NPU: 1*Ascend-D910B (memory: 64 GB), CPU: 24 cores, RAM: 192 GB
AI computing center: Wuhan AI Computing Center
Image: mindspore_2_5_py311_cann8
Torch training hardware: Nvidia 3090
Model and dataset
Model: "facebook/blenderbot_small-90M"
Dataset: "google/Synthetic-Persona-Chat"
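Synthetic-Persona-Chat stores each sample as one whole conversation, so it is split into (dialogue history → next reply) pairs for fine-tuning. The sketch below assumes the column name "Best Generated Conversation" and the "User 1: / User 2:" turn format described on the dataset card; treat both as assumptions and adjust if the schema differs.

```python
# Sketch: flattening google/Synthetic-Persona-Chat into
# (dialogue history -> next response) pairs. Column name and turn format
# are assumptions taken from the dataset card.
from datasets import load_dataset

spc = load_dataset("google/Synthetic-Persona-Chat", split="train")

def to_pairs(batch):
    sources, targets = [], []
    for conversation in batch["Best Generated Conversation"]:
        # Turns are stored as "User 1: ..." / "User 2: ..." lines.
        turns = [
            line.split(":", 1)[1].strip()
            for line in conversation.splitlines()
            if ":" in line
        ]
        for i in range(1, len(turns)):
            # Use the last few turns as context, the next turn as the reply.
            history = " ".join(turns[max(0, i - 3):i])
            sources.append(history)
            targets.append(turns[i])
    return {"source": sources, "target": targets}

pairs = spc.map(to_pairs, batched=True, remove_columns=spc.column_names)
```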
Training loss
Evaluation loss
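After training, the fine-tuned checkpoint can be chatted with through the usual generate API. A minimal example is below; the checkpoint path is an assumption based on the output_dir used in the training sketch above.

```python
# Sketch: generating a reply with the fine-tuned blenderbot_small checkpoint.
from transformers import AutoTokenizer, BlenderbotSmallForConditionalGeneration

ckpt = "./blenderbot_small_finetuned"  # assumed output_dir from trainer.save_model()
tokenizer = AutoTokenizer.from_pretrained("facebook/blenderbot_small-90M")
model = BlenderbotSmallForConditionalGeneration.from_pretrained(ckpt)

inputs = tokenizer("do you have any hobbies?", return_tensors="pt")
reply_ids = model.generate(**inputs, max_new_tokens=60, num_beams=4)
print(tokenizer.decode(reply_ids[0], skip_special_tokens=True))
```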