
Fit sharding optimization for auto parallel llama #8021

Merged

Conversation

Collaborator

@From00 From00 commented Feb 26, 2024

PR types

Bug fixes

PR changes

Models

Description

Adapt the static semi-auto (静半) Llama network to the sharding optimization switches, including:

data_parallel_config: adds dp-related optimization switches. Currently the enable_allreduce_avg_in_gradinent_scale option is supported, which uses the allreduce_avg communication operator for dp gradient synchronization. Framework implementation PR: PaddlePaddle/Paddle#61622

sharding_parallel_config: adapts the enable_stage1_tensor_fusion, enable_stage1_overlap, and enable_stage2_overlap options. enable_stage1_tensor_fusion enables the sharding communication fusion optimization; enable_stage1_overlap and enable_stage2_overlap enable the sharding communication overlap optimization. In static semi-auto mode, both the fusion and overlap strategies currently only take effect under stage 2; there is no corresponding implementation for stage 1.
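As a usage sketch, the switches above would be passed to the trainer as space-separated option strings. The flag names are taken from this PR's description; the launch entry point and the other arguments shown are hypothetical placeholders, not part of this PR:

```shell
# Hypothetical launch sketch: turn on the dp allreduce_avg optimization and
# the sharding fusion/overlap optimizations described above.
# run_pretrain.py and the remaining arguments are placeholders.
python -u -m paddle.distributed.launch \
    run_pretrain.py \
    --sharding "stage2" \
    --data_parallel_config "enable_allreduce_avg_in_gradinent_scale" \
    --sharding_parallel_config "enable_stage1_tensor_fusion enable_stage1_overlap enable_stage2_overlap"
```

Note that per the description, the overlap and fusion options only take effect when sharding stage 2 is active in static semi-auto mode.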


paddle-bot bot commented Feb 26, 2024

Thanks for your contribution!


codecov bot commented Feb 26, 2024

Codecov Report

Attention: Patch coverage is 7.69231%, with 12 lines in your changes missing coverage. Please review.

Project coverage is 56.54%. Comparing base (b7bfb26) to head (e89375b).
Report is 3 commits behind head on develop.

Files Patch % Lines
paddlenlp/trainer/training_args.py 7.69% 12 Missing ⚠️
Additional details and impacted files
@@             Coverage Diff             @@
##           develop    #8021      +/-   ##
===========================================
- Coverage    56.55%   56.54%   -0.01%     
===========================================
  Files          592      592              
  Lines        91036    91067      +31     
===========================================
+ Hits         51484    51493       +9     
- Misses       39552    39574      +22     

☔ View full report in Codecov by Sentry.

Contributor

@JZ-LIANG JZ-LIANG left a comment


LGTM

Collaborator

@ZHUI ZHUI left a comment


When you have time, please also add Chinese documentation here: https://github.com/PaddlePaddle/PaddleNLP/blob/develop/docs/trainer.md?plain=1#L540-L553. Approving first.

Collaborator

@zhiqiu zhiqiu left a comment


LGTM

@wawltor wawltor merged commit e3cb5d2 into PaddlePaddle:develop Mar 5, 2024
7 of 10 checks passed