Monkeypatch for Phi3 #76

tyler-romero · 2024-08-24T22:10:57Z

Summary

Add a new monkeypatch function to support patching Huggingface's Phi3 implementation with Liger Kernels.

Phi3 has its own MLP implementation (Phi3MLP) so a LigerPhi3SwiGLUMLP implementation that leverages LigerSiLUMulFunction is provided as well.
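For readers unfamiliar with why Phi3 needs its own MLP class, here is a minimal pure-Python sketch of a SwiGLU MLP with a fused gate/up projection. It is an illustration only, not the code in this PR: the names `silu`, `matvec`, and `phi3_style_swiglu` are hypothetical, the "first half is gate, second half is up" layout is an assumption about how Phi3MLP splits its fused projection, and the `silu(gate) * up` line stands in for what LigerSiLUMulFunction is taken to compute.

```python
import math


def silu(x):
    """SiLU (swish) activation: x * sigmoid(x)."""
    return x / (1.0 + math.exp(-x))


def matvec(weight, vec):
    """Multiply a row-major matrix (a list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, vec)) for row in weight]


def phi3_style_swiglu(x, gate_up_weight, down_weight):
    """Reference SwiGLU MLP with a fused gate/up projection (illustrative)."""
    # One projection produces 2 * intermediate values.
    gate_up = matvec(gate_up_weight, x)
    half = len(gate_up) // 2
    # Assumed layout: first half is the gate, second half is the up branch.
    gate, up = gate_up[:half], gate_up[half:]
    # The elementwise silu(gate) * up step is what a fused kernel would replace.
    hidden = [silu(g) * u for g, u in zip(gate, up)]
    return matvec(down_weight, hidden)


# Tiny hand-checkable example: both halves of the fused weight are identity.
x = [1.0, 2.0]
gate_up_weight = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0], [0.0, 1.0]]
down_weight = [[1.0, 0.0], [0.0, 1.0]]
out = phi3_style_swiglu(x, gate_up_weight, down_weight)
```

By contrast, the existing Llama-style module in swiglu.py calls two separate projections (`w1` and `w3`) before the fused activation, which is why a single shared class does not fit both models.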

Testing Done

run `make test` to ensure correctness
run `make checkstyle` to ensure code style
run `make test-convergence` to ensure convergence

Convergence test added (and passing on my 4090) for a mini model based on Phi3 patched with Liger kernels.
All tests passing.

Questions for Discussion

Apparently Phi3 was only added in transformers v4.41, but the lowest supported version of transformers in Liger-Kernel is 4.40.1. Additionally, only more recently has sdpa been supported in HF Phi3. Thoughts? Should I leave the transformers dependency version as-is?

ByronHsu

cc @shivam15s

tyler-romero · 2024-08-24T22:35:57Z

README.md

@@ -153,6 +153,8 @@ loss.backward()
 | Mixtral     | `liger_kernel.transformers.apply_liger_kernel_to_mixtral`  | RoPE, RMSNorm, SwiGLU, CrossEntropyLoss        |
 | Gemma2      | `liger_kernel.transformers.apply_liger_kernel_to_gemma`    | RoPE, RMSNorm, GeGLU, CrossEntropyLoss         |
 | Qwen2       | `liger_kernel.transformers.apply_liger_kernel_to_qwen2`    | RoPE, RMSNorm, SwiGLU, CrossEntropyLoss, FusedLinearCrossEntropy        |
+| Phi3        | `liger_kernel.transformers.apply_liger_kernel_to_phi3`    | RoPE, RMSNorm, SwiGLU, CrossEntropyLoss         |


Sorry for the whitespace modifications here; this is the only change in this file.

thanks a lot for helping on the housekeeping! <3

do you want to implement FusedLinearCrossEntropy support in the next PR and also create an issue to track it? Essentially, add a method patch like the one here:

Liger-Kernel/src/liger_kernel/transformers/model/llama.py

Line 25 in 8aab06a

def lce_forward(

Yeah happy to do that as well

setup.py

tyler-romero · 2024-08-24T22:37:09Z

src/liger_kernel/transformers/swiglu.py

@@ -38,3 +38,27 @@ def __init__(self, config):
    def forward(self, x):

        return self.w2(LigerSiLUMulFunction.apply(self.w1(x), self.w3(x)))
+
+
+class LigerPhi3SwiGLUMLP(nn.Module):


So Llama has its own MLP implementation here, but it's named very generally. I went for a model-specific name here, but I'm open to suggestions.

yeah. this is necessary

yundai424

Thanks for this hyper speed PR woo hoo

test/transformers/test_swiglu.py

Makefile

ByronHsu · 2024-08-26T16:46:13Z

Apparently Phi3 was only added in transformers v4.41, but the lowest supported version of transformers in Liger-Kernel is 4.40.1. Additionally, only more recently has sdpa been supported in HF Phi3 (huggingface/transformers#32457). Thoughts? Should I leave the transformers dependency version as-is?

yeah let's bump to 4.41. sdpa seems not a hard requirement since most users use flash attn
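As a rough illustration of what the 4.41 floor means in practice, the sketch below compares a version string against that minimum. The helper names `parse_version` and `supports_phi3` are hypothetical and not part of Liger-Kernel or transformers; a real project would more likely rely on the dependency pin in setup.py or on `packaging.version`.

```python
def parse_version(version):
    """Turn '4.41.0' into (4, 41, 0); stops at suffixes such as 'dev0'."""
    parts = []
    for piece in version.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


# Minimum transformers version that ships Phi3, per the discussion above.
MIN_PHI3_VERSION = (4, 41)


def supports_phi3(installed_version):
    """Return True if the given transformers version is at least 4.41."""
    return parse_version(installed_version) >= MIN_PHI3_VERSION


old_ok = supports_phi3("4.40.1")
new_ok = supports_phi3("4.41.0")
```

Here `old_ok` is False for the previous floor (4.40.1) and `new_ok` is True for 4.41.0, which is the reason for bumping the dependency rather than guarding the patch at runtime.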

ByronHsu · 2024-08-26T16:56:00Z

we can merge after transformers bump and fix conflict

tyler-romero · 2024-08-26T17:14:48Z

make test && make test-convergence
python -m pytest --disable-warnings test/ --ignore=test/convergence
=================================================================================================== test session starts ====================================================================================================
platform linux -- Python 3.10.13, pytest-8.3.2, pluggy-1.5.0
rootdir: /home/tromero/workspace/Liger-Kernel
plugins: devtools-0.12.2
collected 138 items                                                                                                                                                                                                        

test/transformers/test_cross_entropy.py ........................................................ss                                                                                                                   [ 42%]
test/transformers/test_fused_linear_cross_entropy.py ......                                                                                                                                                          [ 46%]
test/transformers/test_geglu.py ........                                                                                                                                                                             [ 52%]
test/transformers/test_rms_norm.py ................................                                                                                                                                                  [ 75%]
test/transformers/test_rope.py ............                                                                                                                                                                          [ 84%]
test/transformers/test_swiglu.py ................                                                                                                                                                                    [ 95%]
test/transformers/test_trainer_integration.py ...                                                                                                                                                                    [ 97%]
test/transformers/test_transformers_monkey_patch.py .                                                                                                                                                                [ 98%]
test/triton/test_triton_monkey_patch.py ..                                                                                                                                                                           [100%]

======================================================================================== 136 passed, 2 skipped in 77.79s (0:01:17) =========================================================================================
HF_DATASETS_OFFLINE=1 python -m pytest --disable-warnings test/convergence
=================================================================================================== test session starts ====================================================================================================
platform linux -- Python 3.10.13, pytest-8.3.2, pluggy-1.5.0
rootdir: /home/tromero/workspace/Liger-Kernel
plugins: devtools-0.12.2
collected 12 items                                                                                                                                                                                                         

test/convergence/test_mini_models.py F.........                                                                                                                                                                      [ 83%]
test/convergence/test_mini_models_no_logits.py ..                                                                                                                                                                    [100%]

========================================================================================================= FAILURES =========================================================================================================
_____________________________________________________________________ test_mini_model[mini_gemma-32-0.0001-dtype0-1e-08-1e-05-0.005-1e-05-0.005-1e-05] _____________________________________________________________________

All relevant tests are passing. The mini_gemma convergence test is now failing on my 4090, though, perhaps related to the recent PR (#85):

>           raise AssertionError("\n".join(mismatch_details))
E           AssertionError: Number of mismatched elements: 3
E           Mismatch at index (0, 29): tensor1[(0, 29)] = 0.3853762447834015, tensor2[(0, 29)] = 0.38537222146987915
E           Mismatch at index (0, 30): tensor1[(0, 30)] = 0.7438472509384155, tensor2[(0, 30)] = 0.7438388466835022
E           Mismatch at index (0, 31): tensor1[(0, 31)] = 0.5316240191459656, tensor2[(0, 31)] = 0.5316107869148254
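To see how small these mismatches are, the following sketch re-checks the three reported pairs with the usual closeness rule `|a - b| <= atol + rtol * |b|`. Two things here are assumptions rather than facts from the test suite: that the first tolerance pair in the test ID (`1e-08`, `1e-05`) is the absolute and relative tolerance applied to this tensor, and that the comparison follows this rule. The helper `is_close` is hypothetical.

```python
def is_close(a, b, atol, rtol):
    """Closeness rule assumed here: |a - b| <= atol + rtol * |b|."""
    return abs(a - b) <= atol + rtol * abs(b)


# Assumed tolerances, read off the parametrized test ID (not confirmed).
ATOL, RTOL = 1e-8, 1e-5

# Values copied from the assertion message above.
reported = {
    (0, 29): (0.3853762447834015, 0.38537222146987915),
    (0, 30): (0.7438472509384155, 0.7438388466835022),
    (0, 31): (0.5316240191459656, 0.5316107869148254),
}

mismatched = [
    index for index, (a, b) in reported.items()
    if not is_close(a, b, ATOL, RTOL)
]

for index, (a, b) in reported.items():
    rel = abs(a - b) / abs(b)
    print(index, f"relative difference = {rel:.2e}")
```

Under these assumptions all three pairs exceed the budget, with relative differences of roughly 1e-5 to 2.5e-5, so the failure is a narrow tolerance miss over the last few indices rather than a gross divergence.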

tyler-romero · 2024-08-26T19:22:13Z

I think I'll need a review from @yundai424 as well to bypass the "changes requested" flag

lancerts · 2024-08-26T19:31:18Z

@yundai424 can you take a look? ty

ByronHsu

lgtm and we can fix gemma in #92

tyler-romero added 4 commits August 24, 2024 15:05

Monkeypatch for Phi3

0055dc8

checkstyle

859b5d5

some cleanup

b80b319

Test for LigerPhi3SwiGLUMLP

853b3f5

ByronHsu reviewed Aug 24, 2024

Update Readme

cb6f109

tyler-romero commented Aug 24, 2024

setup.py (outdated, resolved)

tyler-romero commented Aug 24, 2024

tyler-romero marked this pull request as ready for review August 24, 2024 22:37

lancerts requested a review from shivam15s August 25, 2024 02:18

yundai424 requested changes Aug 25, 2024

test/transformers/test_swiglu.py (outdated, resolved)

tyler-romero added 4 commits August 25, 2024 13:42

Merge branch 'main' into tyler/monkeypatch-phi3

e11c2d6

Address PR nit

ae9e060

Checkstyle

95b0ca2

Correctly resolve test.utils dir for make test command

5baa8ee

tyler-romero commented Aug 25, 2024

Makefile (resolved)

tyler-romero requested a review from yundai424 August 25, 2024 21:33

ByronHsu mentioned this pull request Aug 26, 2024

[feat] FusedLinearCrossEntropy support for phi-3 #98

Closed

tyler-romero added 2 commits August 26, 2024 10:04

Merge branch 'main' into tyler/monkeypatch-phi3

2162490

Bump transformers version

f001ff5

tyler-romero requested a review from ByronHsu August 26, 2024 17:15

Bump transformers version in README

b617c77

tyler-romero mentioned this pull request Aug 26, 2024

Add FusedLinerCrossEntropy support for Phi3 #103

Merged

3 tasks

Merge branch 'main' into tyler/monkeypatch-phi3

c5210f7

ByronHsu approved these changes Aug 26, 2024

yundai424 approved these changes Aug 27, 2024

yundai424 merged commit ee2dacb into linkedin:main Aug 27, 2024
1 check passed

DocShotgun mentioned this pull request Aug 27, 2024

Update supported models for Liger Kernel axolotl-ai-cloud/axolotl#1875

Merged
