
[FA-2] Fix fa-2 issue when passing config to from_pretrained #28043

Merged: 12 commits merged into huggingface:main from fix-fa-2-from-config on Dec 15, 2023

Conversation

younesbelkada (Contributor)

What does this PR do?

Fixes: #28038

Some users pass a config object to from_pretrained in order to modify the model's hyperparameters and change the underlying architecture.

Note that in previous versions, before the attention refactor, it was possible to run:

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, LlamaForCausalLM, AutoConfig

model_id = "tiiuae/falcon-7b"
tokenizer = AutoTokenizer.from_pretrained(model_id)
config = AutoConfig.from_pretrained(model_id)

model = AutoModelForCausalLM.from_pretrained(
    model_id, 
    config=config,
    torch_dtype=torch.bfloat16, 
    use_flash_attention_2=True,
    low_cpu_mem_usage=True,
)

Now users hit an error when trying to run the snippet above, because the logic that handles the model's config for FA2 changed slightly.

I propose a simple fix to mitigate this issue: overwrite the `_attn_implementation` attribute of the config only if it has been passed by the user. With this fix I can confirm that the snippet:

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, LlamaForCausalLM, AutoConfig

model_id = "tiiuae/falcon-7b"
tokenizer = AutoTokenizer.from_pretrained(model_id)
config = AutoConfig.from_pretrained(model_id)

model = AutoModelForCausalLM.from_pretrained(
    model_id, 
    config=config,
    torch_dtype=torch.bfloat16, 
    attn_implementation="flash_attention_2",
    low_cpu_mem_usage=True,
)

works as expected, as in earlier versions of transformers.
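As a quick sanity check (a minimal sketch, not part of the PR; it assumes the loading snippet above has just been run), the private config attribute this PR adjusts can be inspected directly:

# Minimal sanity check (assumption: executed right after the loading snippet above).
# `_attn_implementation` is the private config attribute discussed in this PR.
assert model.config._attn_implementation == "flash_attention_2"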

cc @amyeroberts @fxmarty

younesbelkada (Contributor, Author)

I am not 100% sure this approach is correct. cc @fxmarty, does this look good to you (since you took care of the attention refactor)?

amyeroberts (Collaborator)

I'm a bit concerned about this: it's effectively a patch inside from_pretrained to add backwards compatibility that should already have been handled. The main question this raises for me is whether there are other FA parameters/behaviours we need to check.

Is it still possible to pass in both use_flash_attention_2 and config to from_pretrained? If not, it's not clear to me from the diff how this is addressed: use_flash_attention_2 isn't handled from the model kwargs.

I didn't do a final review on the recent refactor, so I might be missing something. It's also not clear to me from just this PR why passing in a config would change whether or not I can pass in attn_implementation.


fxmarty (Contributor) commented on Dec 14, 2023

Good catch

@amyeroberts Even without a fix, use_flash_attention_2=True along with a provided config works IMO, thanks to:

if use_flash_attention_2:
    logger.warning_once(
        'The model was loaded with use_flash_attention_2=True, which is deprecated and may be removed in a future release. Please use `attn_implementation="flash_attention_2"` instead.'
    )
    config._attn_implementation = "flash_attention_2"

Comment on lines 2960 to 2965

# In case one passes a config to `from_pretrained` + "attn_implementation"
# override the `_attn_implementation` attribute to `attn_implementation` of the kwargs
# Please see: https://github.com/huggingface/transformers/issues/28038
if config is not None:
    config._attn_implementation = model_kwargs.pop("attn_implementation", None)

Contributor:
Suggested change (drop the `if config is not None:` guard):

# In case one passes a config to `from_pretrained` + "attn_implementation"
# override the `_attn_implementation` attribute to `attn_implementation` of the kwargs
# Please see: https://github.com/huggingface/transformers/issues/28038
config._attn_implementation = model_kwargs.pop("attn_implementation", None)

config is a PretrainedConfig here.

@fxmarty (Contributor) left a review comment:

LGTM, some tests are failing though

Comment on lines 2967 to 2969

if kwargs.get("attn_implementation", None) is not None and getattr(
    config, "_attn_implementation", None
) != kwargs.get("attn_implementation", None):
younesbelkada (Contributor, Author):

This handles the case where users pass a config object to from_pretrained. Note that AutoModelXxx.from_pretrained pops attn_implementation from the kwargs if one does not pass a config, but does not pop it if a config is passed.

Therefore this handles that corner case as well (passing a config, so attn_implementation does not get popped, plus attn_implementation through the from_pretrained kwargs). In that case we should overwrite the config's attn_implementation with the one from the kwargs, assuming the user knows what they are doing.

https://github.com/huggingface/transformers/blob/main/src/transformers/models/auto/auto_factory.py#L516-L540
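A minimal sketch of the guard described above (simplified, following the fragment quoted earlier in this thread rather than necessarily the exact diff that landed):

# Sketch: only override the (possibly user-supplied) config when the user also
# explicitly requested an attention implementation through the kwargs, and that
# request differs from what the config already carries.
if kwargs.get("attn_implementation", None) is not None and getattr(
    config, "_attn_implementation", None
) != kwargs.get("attn_implementation", None):
    config._attn_implementation = kwargs.get("attn_implementation")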

younesbelkada (Contributor, Author)

cc @amyeroberts @fxmarty requesting another round of review!

@@ -1823,6 +1823,16 @@ def test_error_no_flash_available(self):

        self.assertTrue("does not support Flash Attention 2.0" in str(cm.exception))

    def test_error_no_flash_available_with_config(self):
fxmarty (Contributor) commented on Dec 14, 2023:

Can you add a test, e.g. llama + passing a config + attn_implementation="flash_attention_2", that checks the correct class is loaded?

younesbelkada (Contributor, Author):

You mean without AutoModel?

fxmarty (Contributor):

I mean a test for an architecture that does support FA2, passing both a config + attn_implementation="flash_attention_2".

younesbelkada (Contributor, Author):

Done!
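For reference, a rough sketch of what such a test could look like, assuming it lives in the existing FA2 test suite with the usual transformers test imports (the test name, decorators, and checkpoint below are illustrative assumptions, not necessarily what was committed):

    @require_flash_attn
    @require_torch_gpu
    def test_flash_attn_2_from_config(self):
        # Illustrative sketch: an FA2-capable architecture (Llama here) loaded with
        # both an explicit config and attn_implementation="flash_attention_2".
        model_id = "hf-internal-testing/tiny-random-LlamaForCausalLM"  # hypothetical choice
        config = AutoConfig.from_pretrained(model_id)
        model = AutoModelForCausalLM.from_pretrained(
            model_id,
            config=config,
            attn_implementation="flash_attention_2",
            torch_dtype=torch.bfloat16,
        ).to("cuda")
        # The user-supplied config should now carry the requested implementation.
        self.assertEqual(model.config._attn_implementation, "flash_attention_2")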

younesbelkada and others added 2 commits December 14, 2023 18:11
Co-authored-by: fxmarty <9808326+fxmarty@users.noreply.github.com>
@amyeroberts (Collaborator) left a review comment:

Thanks for fixing, and thanks @fxmarty for clarifying the case for use_flash_attention_2!

Just some nits.

src/transformers/modeling_utils.py (outdated; resolved)
# passes manually the config to `from_pretrained`.
config = copy.deepcopy(config)

if kwargs.get("attn_implementation", None) is not None and config._attn_implementation != kwargs.get(
amyeroberts (Collaborator):

Do we want to `get` here or `pop` from the kwargs?

younesbelkada (Contributor, Author):

Good catch, I think pop would work best here
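i.e. roughly (a sketch of the adjustment, not the exact final diff):

# Sketch: pop instead of get, so the kwarg is consumed here and does not
# leak further down `from_pretrained`.
requested = kwargs.pop("attn_implementation", None)
if requested is not None and config._attn_implementation != requested:
    config._attn_implementation = requested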

younesbelkada and others added 3 commits December 14, 2023 19:28
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
younesbelkada merged commit 1e20931 into huggingface:main on Dec 15, 2023 (21 checks passed)

younesbelkada deleted the fix-fa-2-from-config branch on December 15, 2023 at 10:08
iantbutler01 pushed a commit to BismuthCloud/transformers that referenced this pull request on Dec 16, 2023:

[FA-2] Fix fa-2 issue when passing config to from_pretrained (huggingface#28043)

* fix fa-2 issue

* fix test

* Update src/transformers/modeling_utils.py

Co-authored-by: fxmarty <9808326+fxmarty@users.noreply.github.com>

* clenaer fix

* up

* add more robust tests

* Update src/transformers/modeling_utils.py

Co-authored-by: fxmarty <9808326+fxmarty@users.noreply.github.com>

* fixup

* Update src/transformers/modeling_utils.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* pop

* add test

---------

Co-authored-by: fxmarty <9808326+fxmarty@users.noreply.github.com>
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
amyeroberts added a commit that referenced this pull request on Dec 18, 2023:

[FA-2] Fix fa-2 issue when passing config to from_pretrained (#28043)
staghado pushed a commit to staghado/transformers that referenced this pull request on Jan 15, 2024:

[FA-2] Fix fa-2 issue when passing config to from_pretrained (huggingface#28043)
Successfully merging this pull request may close these issues:

Cannot specify config and attn_implementation simultaneously (#28038)

4 participants