
[pir save] Modify llama model file export in PIR mode #8689

Merged

Merged 6 commits into PaddlePaddle:develop on Jul 2, 2024

Conversation

xiaoguoguo626807 (Contributor) commented Jul 1, 2024

PR types

Others

PR changes

Others

Description

pcard-67164
Modified code in several places to support exporting the llama-2-7b model in PIR mode.

  1. Under dynamic-to-static conversion, a dynamic shape prevents export, so the check on attn_weights.shape in paddlenlp/transformers/llama/modeling.py must be skipped during dynamic-to-static. In dynamic-graph mode this check can intercept errors, and skipping it under dynamic-to-static causes no problems.
  2. When pad_token_id = None, PIR does not allow passing value=None to full_like, so the existing logic here was incomplete. The generate function now checks whether pad_token_id is missing and, if so, falls back to eos_token_id.
  3. PIR has no print op, and the op-related methods differ, so branch handling is required.
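The fallback in item 2 can be sketched roughly as follows. This is a minimal pure-Python sketch, not the actual PaddleNLP code: the helper name `resolve_pad_token_id` is hypothetical (the diff below uses `set_pad_token_id`), and the handling of a list-valued `eos_token_id` is an assumption.

```python
def resolve_pad_token_id(pad_token_id, eos_token_id):
    """Hypothetical sketch of the pad_token_id fallback described above.

    Under PIR, full_like cannot take value=None, so a concrete
    pad_token_id must exist before generation pads sequences.
    """
    if pad_token_id is None and eos_token_id is not None:
        # Assumption: eos_token_id may be a list of ids; use the first.
        if isinstance(eos_token_id, (list, tuple)):
            return eos_token_id[0]
        return eos_token_id
    return pad_token_id
```

With this guard in place, the value passed on to full_like is never None as long as an eos token exists.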


paddle-bot bot commented Jul 1, 2024

Thanks for your contribution!


codecov bot commented Jul 1, 2024

Codecov Report

Attention: Patch coverage is 60.00000% with 4 lines in your changes missing coverage. Please review.

Project coverage is 55.62%. Comparing base (be5bb14) to head (e86f5bc).
Report is 231 commits behind head on develop.

Files with missing lines Patch % Lines
paddlenlp/generation/utils.py 50.00% 4 Missing ⚠️
Additional details and impacted files
@@           Coverage Diff            @@
##           develop    #8689   +/-   ##
========================================
  Coverage    55.61%   55.62%           
========================================
  Files          620      620           
  Lines        96965    96991   +26     
========================================
+ Hits         53930    53949   +19     
- Misses       43035    43042    +7     


@@ -1038,6 +1042,7 @@ def greedy_search(
synced_gpus=False,
**model_kwargs
):
pad_token_id = self.set_pad_token_id(pad_token_id, eos_token_id)
Collaborator

generate has already called set_pad_token_id, so there is no need to set it again here.

Contributor Author
done

@@ -1143,6 +1148,7 @@ def sample(
synced_gpus=False,
**model_kwargs
):
pad_token_id = self.set_pad_token_id(pad_token_id, eos_token_id)
Collaborator

Same as above.

Contributor Author

done

Collaborator

@wawltor wawltor left a comment

LGTM

@wawltor wawltor merged commit d832282 into PaddlePaddle:develop Jul 2, 2024
8 of 11 checks passed