[LLM] Add tensor parallel for chatglmv2 #9014
Conversation
Thanks for your contribution!
Codecov Report
Attention: Patch coverage is …

Additional details and impacted files:

@@            Coverage Diff             @@
##           develop    #9014      +/-   ##
===========================================
+ Coverage    53.34%   53.79%    +0.45%
===========================================
  Files          650      652        +2
  Lines       105414   104710      -704
===========================================
+ Hits         56230    56329       +99
+ Misses       49184    48381      -803

☔ View full report in Codecov by Sentry.
LGTM
__all__ = [
    "ChatGLMv2Model",
    "ChatGLMv2PretrainedModel",
    "ChatGLMv2ForCausalLM",
]
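Since the PR's goal is tensor parallelism, the pretrained model class presumably declares how each weight is split across ranks. A minimal sketch of the mapping pattern PaddleNLP models commonly use for this (split_or_merge_func is an existing PaddleNLP helper, but the ChatGLMv2 weight names below are illustrative assumptions, not this PR's diff):

    from functools import partial

    from paddlenlp.transformers.conversion_utils import split_or_merge_func

    @classmethod
    def _get_tensor_parallel_mappings(cls, config, is_split=True):
        # Bind a split/merge function to the current tensor-parallel degree and rank.
        fn = split_or_merge_func(
            is_split=is_split,
            tensor_parallel_degree=config.tensor_parallel_degree,
            tensor_parallel_rank=config.tensor_parallel_rank,
            num_attention_heads=config.num_attention_heads,
        )
        actions = {}
        for i in range(config.num_hidden_layers):
            # Column-parallel weights are split along the output dimension
            # (weight names here are assumptions for illustration).
            actions[f"encoder.layers.{i}.mlp.dense_h_to_4h.weight"] = partial(fn, is_column=True)
            # Row-parallel weights are split along the input dimension.
            actions[f"encoder.layers.{i}.mlp.dense_4h_to_h.weight"] = partial(fn, is_column=False)
        return actions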
def seed_guard_context(name=None):
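For context, seed_guard_context is the kind of helper used to keep dropout RNG state coherent under tensor parallelism. A minimal sketch of how such a guard can be written on top of Paddle's fleet RNG state tracker (the body below is an assumption, not necessarily the PR's exact implementation):

    from contextlib import contextmanager

    from paddle.distributed.fleet.meta_parallel import get_rng_state_tracker

    @contextmanager
    def seed_guard_context(name=None):
        # If a named RNG state (e.g. "local_seed") has been registered with the
        # tracker, run the block under that state so dropout masks are reproducible
        # and correctly decorrelated across tensor-parallel ranks.
        tracker = get_rng_state_tracker()
        if name is not None and name in tracker.states_:
            with tracker.rng_state(name):
                yield
        else:
            # No tracked state for this name: fall back to the default generator.
            yield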
Does this issue come up often?
* fix_chatglmv2_8k
* fix_chatglmv2_8k
PR types
Bug fixes
PR changes
Models
Description
fix chatglmv2 8k
In the 8k setting (zero_padding=true, max_sequence_length=8192, src_length=4096):
With sharding_parallel_degree=8, sharding=stage3, recompute=True: OOM at step 9.
With sharding_parallel_degree=8, sharding=stage3, recompute=False: OOM at step 1.
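For reference, the failing 8k setup maps onto a run configuration along these lines (a sketch only; the key names mirror the parameters listed above and are assumptions about the actual config schema):

    # Sketch of the reported 8k configuration; values come from this description.
    run_config = {
        "zero_padding": True,
        "max_sequence_length": 8192,
        "src_length": 4096,
        "sharding_parallel_degree": 8,
        "sharding": "stage3",
        "recompute": True,  # OOM reported at step 9; with False, OOM at step 1
    }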
Run 1: train_runtime: 293.3676, train_samples_per_second: 0.409, train_steps_per_second: 0.1023, train_loss: 9.233072916666666, progress_or_epoch: 0.0647
Effective_Tokens_per_second: 3322.318257026338, memory: 430G

Run 2: train_runtime: 141.7329, train_samples_per_second: 0.8467, train_steps_per_second: 0.2117, train_loss: 9.233072916666666, progress_or_epoch: 0.0647
Effective_Tokens_per_second: 6876.741628090584, memory: 320G

The second run roughly doubles effective throughput (6876.7 vs. 3322.3 tokens/s) at identical loss, while memory drops from 430G to 320G.