Support chatglm fine grained dybatch v1. #6798
Conversation
Codecov Report
@@            Coverage Diff             @@
##           develop    #6798      +/-   ##
===========================================
- Coverage    60.30%   60.06%   -0.24%
===========================================
  Files          544      546       +2
  Lines        80364    80680     +316
===========================================
  Hits         48460    48460
- Misses       31904    32220     +316
@@ -181,7 +209,7 @@ def update_model_kwargs_for_generation(cache, just_decoder, next_tokens, eos_tok
             model_kwargs["seq_len_decoder"],
             model_kwargs["seq_len_decoder"] + 1,
         )
-        return model_kwargs
+        return model_kwargs, next_tokens
Is it necessary to return next_tokens?
The set_multi_stops logic here should be movable into the sample function, so next_tokens would not need to be returned here.
Try to keep the inputs and outputs consistent with PaddleNLP's existing functions.
Done~
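A minimal sketch of the refactor being suggested, under stated assumptions: set_multi_stops is only named in this discussion, and its body plus the sample signature below are illustrative, not the actual PaddleNLP code.

```python
import paddle

def set_multi_stops(next_tokens, stop_flags, stop_token_ids):
    # Illustrative placeholder: flag sequences whose next token hits a stop id.
    for stop_id in stop_token_ids:
        stop_flags = paddle.logical_or(stop_flags, next_tokens == stop_id)
    return stop_flags

def sample(logits, stop_flags, stop_token_ids):
    # Greedy pick for illustration; the real code may use top-p/top-k sampling.
    next_tokens = paddle.argmax(logits, axis=-1, keepdim=True)
    # Stop handling lives inside sample(), so update_model_kwargs_for_generation
    # can keep returning only model_kwargs, matching the existing signature.
    stop_flags = set_multi_stops(next_tokens, stop_flags, stop_token_ids)
    return next_tokens, stop_flags
```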
self.tgt_generation_mask[i, 0, 0, :length] = paddle.ones(shape=[1, length], dtype="float16")

inputs["attention_mask"] = self.attention_mask
inputs["tgt_generation_mask"] = self.tgt_generation_mask
We discussed it; leave it like this for now. Once the tokenizer handling for chatglm is adjusted later, this branch of the code can be deleted.
config.tensor_parallel_degree = tensor_parallel_degree
config.tensor_parallel_rank = tensor_parallel_rank
model = LlamaForCausalLMInferenceModel.from_pretrained(
Check whether this can go through the AutoModelForCausalLM approach, dispatching internally based on the config.
Dispatch through AutoModelForCausalLM is not possible yet, so initialization can only be hardcoded for now.
However, which model it is can be determined from config.architectures, as mentioned above.
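A minimal sketch of the hardcoded dispatch described above, keyed on config.architectures. The registry, the ChatGLM class name, and the import paths are assumptions for illustration; only LlamaForCausalLMInferenceModel is named in this thread.

```python
from paddlenlp.transformers import AutoConfig

# Assumed import location and class names; adjust to the local codebase.
from paddlenlp.experimental.transformers import (
    ChatGLMForCausalLMInferenceModel,
    LlamaForCausalLMInferenceModel,
)

# Hypothetical registry: architecture name -> inference model class.
ARCH_TO_INFERENCE_MODEL = {
    "LlamaForCausalLM": LlamaForCausalLMInferenceModel,
    "ChatGLMForCausalLM": ChatGLMForCausalLMInferenceModel,
}

def load_inference_model(model_name_or_path, **kwargs):
    config = AutoConfig.from_pretrained(model_name_or_path)
    # config.architectures records the model class the checkpoint was saved with.
    model_cls = ARCH_TO_INFERENCE_MODEL[config.architectures[0]]
    return model_cls.from_pretrained(model_name_or_path, config=config, **kwargs)
```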
llm/predictor.py (outdated):
"你好", | ||
"你好啊,请问你叫什么名字", | ||
"你好啊,你在干什么", | ||
# "My name is?" |
This commented-out line can be deleted.
Done
@@ -158,19 +163,42 @@ def update_model_kwargs_for_generation(cache, just_decoder, next_tokens, eos_tok
         if cache is None:
             # encoder's generation
             model_kwargs["tgt_ids"] = paddle.where(just_decoder, model_kwargs["tgt_ids"], next_tokens)
             model_kwargs["tgt_pos"] = paddle.where(just_decoder, model_kwargs["tgt_pos"], model_kwargs["tgt_pos"] + 1)
+            # import pdb;pdb.set_trace()
The pdb code here should be deleted.
Done
zhengzekang seems not to be a GitHub user. You need a GitHub account to be able to sign the CLA. If you already have a GitHub account, please add the email address used for this commit to your account. You have signed the CLA already but the status is still pending? Let us recheck it.
@@ -12,5 +12,6 @@
 # See the License for the specific language governing permissions and
 # limitations under the License.

+from .chatglm import *
This import should be placed below from .fused_transformer_layers import *.
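For reference, the suggested ordering in the package's __init__.py would look like this (a sketch; the surrounding file contents are assumed):

```python
from .fused_transformer_layers import *
from .chatglm import *
```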
@@ -134,7 +139,7 @@ def generate(
         return ret

     @staticmethod
-    def update_model_kwargs_for_generation(cache, just_decoder, next_tokens, eos_token_id, model_kwargs):
+    def update_model_kwargs_for_generation(cache, just_decoder, next_tokens, eos_token_id, config, model_kwargs):
After discussing with @xiaoxiaohehe001, to give models enough room for control we decided to turn this into an instance method (dropping @staticmethod), so that derived models can read the corresponding configuration via self.config and can also override the function.
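A minimal sketch of that pattern, with illustrative class names and an illustrative config key (not the actual PaddleNLP classes):

```python
class BaseInferenceModel:
    def __init__(self, config):
        self.config = config

    # Previously a @staticmethod that took `config` as a parameter; as an
    # instance method it reads self.config directly, and derived models
    # can override it with model-specific behavior.
    def update_model_kwargs_for_generation(self, cache, just_decoder, next_tokens, eos_token_id, model_kwargs):
        model_kwargs["eos_token_id"] = eos_token_id
        return model_kwargs


class ChatGLMInferenceModel(BaseInferenceModel):
    def update_model_kwargs_for_generation(self, cache, just_decoder, next_tokens, eos_token_id, model_kwargs):
        # The derived model consults its own config without extra arguments;
        # "use_2d_position" is a hypothetical key used for illustration.
        model_kwargs["use_2d_position"] = getattr(self.config, "position_encoding_2d", False)
        return super().update_model_kwargs_for_generation(
            cache, just_decoder, next_tokens, eos_token_id, model_kwargs
        )
```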
LGTM
PR types
New features
PR changes
Models
Description
Support chatglm fine grained dybatch v1.