
Fix couples of issues from #36335 #36453


Merged

merged 18 commits into main from fix-meta-loading on Mar 1, 2025

Conversation


@SunMarc SunMarc commented Feb 27, 2025

What does this PR do?

This PR fixes a couple of issues seen after #36335. Here's a list:

  • We can't load .bin files anymore if the model is on the meta device -> also not compatible with quantization (see the repro sketch after this list)
  • Issue with fetching submodules (diffusers and peft issue)
  • Allocation issue (cc @gante)
  • Disk offload issue that we see in the bnb tests
  • Need to guard the torch import: DTensor support requires torch>=2.5.1 #36472 (see the guard sketch after this list)
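
For the first item, here is a minimal repro sketch. It is not taken from the PR: the local directory name and the use of safe_serialization=False to force a .bin checkpoint are assumptions, but it exercises loading a .bin file through the meta-device path that device_map uses:

from transformers import AutoModelForCausalLM

# Save a small model as a .bin checkpoint (safe_serialization=False skips safetensors)
model = AutoModelForCausalLM.from_pretrained("facebook/opt-125m")
model.save_pretrained("opt-125m-bin", safe_serialization=False)  # writes pytorch_model.bin

# Reloading with a device_map initializes the model on the meta device first,
# which is the code path that failed for .bin checkpoints
reloaded = AutoModelForCausalLM.from_pretrained("opt-125m-bin", device_map="auto")
print(reloaded.model.decoder.embed_tokens.weight.device)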

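And for the last item, a minimal sketch of the kind of version guard that change implies. The exact import path is an assumption (DTensor's public location differs across torch releases), not the code used in the PR:

from packaging import version
import torch

# DTensor support requires a recent torch, so guard the import instead of
# importing it unconditionally at module level
if version.parse(torch.__version__) >= version.parse("2.5.1"):
    from torch.distributed.tensor import DTensor
else:
    DTensor = None  # feature disabled on older torch versions
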
Issues remaining for follow-up PRs

  • Maybe check how we can deal with renamed keys better; it's a big mess right now (cc @Cyrilvallez with your refactor PR).
  • DeepSpeed issue
  • Probably more, since a lot of code was modified

To reproduce the errors coming from the peft CI:

from transformers import AutoModelForCausalLM

model_ids = [
    "facebook/opt-125m",
    "facebook/opt-350m",
    "facebook/opt-6.7b",
]
# None keeps everything on CPU, 0 puts the model on GPU 0, "auto" lets accelerate dispatch
device_maps = [None, 0, "auto"]

for device_map in device_maps:
    for model_id in model_ids:
        try:
            model = AutoModelForCausalLM.from_pretrained(model_id, device_map=device_map)
            # Accessing a nested submodule is what failed before the fix
            print(model.model.decoder.embed_tokens.weight)
            print(f"Model {model_id} with device_map {device_map} loaded successfully")
        except AttributeError as e:
            print(f"Model {model_id} with device_map {device_map} failed to load with error: {e}")

To reproduce the allocation issue:

from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

DEVICE = "cuda"
MODEL_ID = "deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B"

torch.cuda.synchronize()
max_memory = torch.cuda.max_memory_allocated(DEVICE) * 1e-6  # bytes -> MB
print("Before loading -- Max memory (MB): ", max_memory)
torch.cuda.reset_peak_memory_stats(DEVICE)

model = AutoModelForCausalLM.from_pretrained(MODEL_ID, device_map="auto", torch_dtype=torch.float16)
torch.cuda.synchronize()
max_memory = torch.cuda.max_memory_allocated(DEVICE) * 1e-6  # bytes -> MB
print("After loading -- Max memory (MB): ", max_memory)
torch.cuda.reset_peak_memory_stats(DEVICE)

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
inputs = tokenizer(["The quick brown"], return_tensors="pt").to(model.device)
_ = model(**inputs)
torch.cuda.synchronize()
max_memory = torch.cuda.max_memory_allocated(DEVICE) * 1e-6  # bytes -> MB
print("After forward -- Max memory (MB): ", max_memory)
torch.cuda.reset_peak_memory_stats(DEVICE)

@github-actions github-actions bot marked this pull request as draft February 27, 2025 14:28
@github-actions

Hi 👋, thank you for opening this pull request! The pull request is converted to draft by default. When it is ready for review, please click the Ready for review button (at the bottom of the PR page).

@SunMarc SunMarc changed the title from "Fix meta loading" to "Fix couples of issues from #36335" Feb 27, 2025
@SunMarc SunMarc marked this pull request as ready for review February 27, 2025 14:32
@SunMarc SunMarc requested a review from ArthurZucker February 27, 2025 14:32
@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.


gante commented Feb 28, 2025

@SunMarc if possible, add tests to prevent regressions 🙏


SunMarc commented Feb 28, 2025

@SunMarc if possible, add tests to prevent regressions 🙏

I think we already had a lot of failing tests due to that PR, not sure how we missed them @ydshieh @muellerzr. But happy to add more fast tests if that's what is missing.
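
As a sketch of what such a fast test could look like (not part of this PR; the tiny checkpoint name and the parametrization are assumptions):

import pytest

from transformers import AutoModelForCausalLM

@pytest.mark.parametrize("device_map", [None, "auto"])  # "auto" assumes accelerate is installed
def test_submodules_accessible_after_load(device_map):
    # A tiny random checkpoint keeps the test fast enough for the regular CI
    model = AutoModelForCausalLM.from_pretrained(
        "hf-internal-testing/tiny-random-OPTForCausalLM", device_map=device_map
    )
    # Regression check for the submodule-fetching issue: nested attributes must resolve
    assert model.model.decoder.embed_tokens.weight is not None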


SunMarc commented Feb 28, 2025

The failing tests are not related to this PR, but I found out that they were also due to #36335. Need to fix.


@ArthurZucker ArthurZucker left a comment


Thanks !


SunMarc commented Feb 28, 2025

I don't think the CI will pass, so can you merge it @ArthurZucker?

@ArthurZucker ArthurZucker merged commit a40f1ac into main Mar 1, 2025
20 of 24 checks passed
@ArthurZucker ArthurZucker deleted the fix-meta-loading branch March 1, 2025 06:12
garrett361 pushed a commit to garrett361/transformers that referenced this pull request Mar 4, 2025
* fix

* style

* better allocation

* fix

* fix

* style

* revert disk

* exit

* style

* return if nothing to cache

* dtensor guard

* fix regressiion

* fix regression

* fix

* fix