
MoRA Implementation #9562

Merged
merged 11 commits into from Dec 18, 2024
Conversation

@lcykww lcykww (Contributor) commented Dec 4, 2024

PR types

Others

PR changes

Others

Description

Modified paddlenlp/peft/lora/lora_layers.py to add the MoRA method implementation.
Modified paddlenlp/peft/lora/lora_model.py, mainly adding the parameter-freezing logic for MoRA.
Modified paddlenlp/peft/lora/lora_config.py to add the use_mora option.
Added the use_mora option to paddlenlp/trl/model_config.py.
Added use_mora to the lora_config used in llm/run_finetune.py.

Test results:
Model: facebook/llama-7b
Training set: commonsense_170k
All other parameters use PaddleNLP defaults.
[results screenshot]
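
For context, a minimal usage sketch of the new flag (not part of this PR's diff); the target-module patterns and rank below are placeholder assumptions:

# Sketch only: enabling MoRA through the use_mora option added by this PR.
# Module patterns and hyperparameters are placeholder assumptions.
from paddlenlp.peft import LoRAConfig, LoRAModel
from paddlenlp.transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("facebook/llama-7b")
lora_config = LoRAConfig(
    target_modules=[".*q_proj.*", ".*v_proj.*"],  # hypothetical pattern list
    r=8,
    use_mora=True,  # new option introduced in this PR
)
model = LoRAModel(model=model, lora_config=lora_config)
model.mark_only_lora_as_trainable()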

paddle-bot bot commented Dec 4, 2024

Thanks for your contribution!

codecov bot commented Dec 4, 2024

Codecov Report

Attention: Patch coverage is 90.32258% with 9 lines in your changes missing coverage. Please review.

Project coverage is 52.68%. Comparing base (407b3e6) to head (6181b2b).
Report is 5 commits behind head on develop.

Files with missing lines Patch % Lines
paddlenlp/peft/lora/lora_layers.py 89.88% 9 Missing ⚠️
Additional details and impacted files
@@             Coverage Diff             @@
##           develop    #9562      +/-   ##
===========================================
+ Coverage    52.66%   52.68%   +0.02%     
===========================================
  Files          712      712              
  Lines       111691   111762      +71     
===========================================
+ Hits         58818    58886      +68     
- Misses       52873    52876       +3     

☔ View full report in Codecov by Sentry.

elif "lora" in name:
weight.stop_gradient = False
if layer.use_mora:
if self.lora_config.trainable_bias in ["lora_A", "all"] and "bias" in name:
Contributor commented:

Why add lora_A here?

@@ -0,0 +1,113 @@
lora:
Contributor commented:

Please change lora to mora here.

self.disable_static()
paddle.set_default_dtype("float32")

lora_config = load_test_config(self.config_path, "lora", self.model_dir)
Contributor commented:

Change lora to mora.

if isinstance(layer, paddle.nn.Linear) or isinstance(layer, QuantizationLinear):
weight_process(name, quant_config, lora_config, model_state_dict, args.device)

# To be modified

@@ -65,11 +65,13 @@ def __init__(
lora_plus_scale: float = 1.0,
pissa: bool = False,
lora_use_mixer: bool = False,
use_mora: bool = False,


# create RoPE
if self.cos is None or self.sin is None:
inv_freq = 1.0 / (10000 ** (paddle.arange(0, r, 2, dtype=self._dtype) / r))
Contributor commented:

We recommend doing this computation in float32.


# apply RoPE rotation
rh_in_x = paddle.concat([-in_x[..., r // 2 :], in_x[..., : r // 2]], axis=-1)
# rh_in_x = paddle.cast(rh_in_x, dtype=paddle.paddle.bfloat16)
Contributor commented:

Delete this line.

rh_in_x = paddle.concat([-in_x[..., r // 2 :], in_x[..., : r // 2]], axis=-1)
# rh_in_x = paddle.cast(rh_in_x, dtype=paddle.paddle.bfloat16)
in_x = in_x * self.cos + rh_in_x * self.sin
in_x = paddle.cast(in_x, dtype=self._dtype)
Contributor commented:

Why is the cast still needed here?

rb2 = self.out_features // r if self.out_features % r == 0 else self.out_features // r + 1

# create RoPE
if self.cos is None or self.sin is None:
Contributor commented:

Could you write a helper that initializes cos and sin, call it once in __init__, and avoid repeating this in several places?
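
A sketch of the kind of helper suggested here, assuming the names already used in the diff (r, rb1, self.cos, self.sin); the exact shapes are an assumption, not this PR's code:

import paddle

def rope_init(self, r, rb1):
    # Build the rotary tables once, in float32 as recommended above.
    inv_freq = 1.0 / (10000 ** (paddle.arange(0, r, 2, dtype="float32") / r))
    t = paddle.arange(rb1, dtype="float32")
    freqs = paddle.outer(t, inv_freq)             # assumed shape (rb1, r // 2)
    emb = paddle.concat([freqs, freqs], axis=-1)  # assumed shape (rb1, r)
    self.cos = paddle.cos(emb)
    self.sin = paddle.sin(emb)

__init__ (and any method that currently rebuilds cos/sin) would then just call self.rope_init(r, rb1) once.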

@lugimzzz (Contributor) commented:
Please paste the reproduction results in the Description.

@@ -3,7 +3,7 @@
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# mat
Contributor Author replied:

Deleted.

is_bias=False,
default_initializer=nn.initializer.Constant(value=0.0),
)
self.cos = None
Contributor commented:

Just call self.RoPE_init(r, rb1) here to initialize it.

Contributor Author replied:

Fixed.

out = (weight + lora_A @ lora_AB @ lora_B * scaling).cpu()
else:
out = (weight + lora_A @ lora_B * scaling).cpu()
delta_weight = layer.get_delta_weight()
Contributor commented:

This is problematic: it ends up calling layer.lora_A rather than the lora_A above, and layer.lora_A is still on the CPU. Computing on the CPU is too slow, and the CPU does not support bfloat16. I suggest either rewriting a get_delta_weight here, or making get_delta_weight suitable for external callers, for example:

def get_delta_weight(self, lora_A=None,lora_B=None):
   if use_mora:
       lora_A = lora_A if lora_A is not None else self.lora_A
       ....
   else:
       lora_A = lora_A if lora_A is not None else self.lora_A
       lora_B = lora_B if lora_B is not None else self.lora_B
       ...
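
One possible shape for that refactor, filling in the outline above (a sketch under the assumption that the layer exposes use_mora, lora_A, lora_B and scaling; _mora_delta is a hypothetical helper for the MoRA branch, not code from this PR):

def get_delta_weight(self, lora_A=None, lora_B=None):
    # Accept optional tensors so callers can pass copies that already live on
    # the right device/dtype instead of re-reading self.lora_A from the CPU.
    if self.use_mora:
        lora_A = lora_A if lora_A is not None else self.lora_A
        return self._mora_delta(lora_A)  # hypothetical helper for the MoRA path
    lora_A = lora_A if lora_A is not None else self.lora_A
    lora_B = lora_B if lora_B is not None else self.lora_B
    return lora_A @ lora_B * self.scaling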

Contributor Author replied:

Fixed.

@@ -144,21 +157,102 @@ def pissa_init(self, rank):
weight = res.astype(dtype)
self.weight.set_value(weight)

def RoPE_init(self, r, rb1):
Contributor commented:

Method names are better all lowercase, e.g. def rope_init.

Contributor Author replied:

Fixed.

lugimzzz previously approved these changes Dec 18, 2024

@lugimzzz lugimzzz (Contributor) left a comment:
lgtm

@lugimzzz lugimzzz (Contributor) left a comment:
lgtm

@lugimzzz lugimzzz merged commit 90bc68e into PaddlePaddle:develop Dec 18, 2024
10 of 12 checks passed