
Fix conversion mappings for vlms #45340

Merged
Cyrilvallez merged 5 commits into main from fix-mappings on Apr 9, 2026

Conversation

@Cyrilvallez (Member) commented on Apr 9, 2026

What does this PR do?

Supersedes #45314 with a better fix.
Fixes #45216, #45310 and #45313

cc @zucchini-nlp

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@Cyrilvallez Cyrilvallez merged commit 0a66862 into main Apr 9, 2026
29 checks passed
@Cyrilvallez Cyrilvallez deleted the fix-mappings branch April 9, 2026 13:17
Cyrilvallez added a commit that referenced this pull request Apr 9, 2026
* fix

* oupsi typo

* missing dot

* why did it have a mapping?... hub weights are correct already

* no kw arg for replace...
@BenjaminBossan (Member)

@Cyrilvallez This PR breaks the weight conversion with PEFT. Unfortunately, this was not caught by CI because the corresponding test is gated against PEFT v0.19.0, which is not out yet. However, when commenting out these two lines:

if version.parse(importlib.metadata.version("peft")) < version.parse("0.19.0"):
    self.skipTest("For this test to pass, PEFT 0.19 is required.")

and then running:

RUN_SLOW=1 pytest tests/peft_integration/test_peft_integration.py -k test_mixtral_lora_conversion

we get:

ValueError: Target module MixtralTopKRouter() is not supported. Currently, only the following modules are supported: torch.nn.Linear, torch.nn.Embedding, torch.nn.Conv1d, torch.nn.Conv2d, torch.nn.Conv3d, transformers.pytorch_utils.Conv1D, torch.nn.MultiheadAttention..

I tried understanding where that comes from based on the diff but I couldn't figure it out. Do you know what could be amiss here?

@githubnemo (Contributor)

Hey @Cyrilvallez :)

I think that removing "mixtral" from the conversion mapping is the culprit. While mixtral is the reference implementation and 'up-to-date', this doesn't mean that old checkpoints don't need conversion.

@BenjaminBossan already confirmed offline that re-introducing this to the mapping resolves the issue.

@Cyrilvallez (Member Author)

Hey @BenjaminBossan @githubnemo! Are you checking the internal variable _MODEL_TO_CONVERSION_PATTERN directly? 🤔🤔 That dict does not contain mappings for all models; you should check the full mapping, i.e. the output of _build_checkpoint_conversion_mapping()!
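
For illustration, a minimal sketch of the difference (the import path transformers.conversion_mapping is my assumption; adjust it to wherever these actually live):

from transformers.conversion_mapping import (
    _MODEL_TO_CONVERSION_PATTERN,          # internal dict: only models with an explicit pattern entry
    _build_checkpoint_conversion_mapping,  # builds the full mapping across all models
)

# mixtral may be absent from the internal dict after this PR,
# while the fully built mapping still covers its conversion
print("mixtral" in _MODEL_TO_CONVERSION_PATTERN)
print("mixtral" in _build_checkpoint_conversion_mapping())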

@BenjaminBossan (Member)

Yes, here:

https://github.com/huggingface/peft/blob/98465930f7c9666ff952f4c67893620a9ef1e2c3/src/peft/utils/transformers_weight_conversion.py#L351

So we should replace that with

base_model_type = _build_checkpoint_conversion_mapping().get(model_type, None)

?

@Cyrilvallez (Member Author)

@BenjaminBossan Looking at your code, I think what you wanted to do is find all models with a mapping similar to mixtral's? If so, you can keep checking _MODEL_TO_CONVERSION_PATTERN I guess, but then also add mixtral explicitly in your code - we don't want to awkwardly add mixtral: mixtral to our internal mapping 😅
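
A hedged sketch of what that PEFT-side check could look like (the helper name is hypothetical; only _MODEL_TO_CONVERSION_PATTERN comes from the discussion):

def model_needs_conversion(model_type: str) -> bool:
    # mixtral is the reference implementation, so it carries no entry in the
    # internal pattern dict, yet old mixtral checkpoints still need conversion
    if model_type == "mixtral":
        return True
    return model_type in _MODEL_TO_CONVERSION_PATTERN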

@BenjaminBossan (Member)

The intent is to identify those models that require a conversion, as the PEFT config must be updated accordingly (e.g. target module foo is now bar).
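
As a hedged illustration of that kind of update ("foo" and "bar" stand in for real module names, mirroring the example above):

from peft import LoraConfig

# config written against the old transformers module layout
old_config = LoraConfig(target_modules=["foo"])
# after the weight conversion renames the module, the config must follow
new_config = LoraConfig(target_modules=["bar"])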

"find all models with a mapping similar to mixtral's"

I'm not quite sure what "similar" means here. Do you mean the same model_type?

@Cyrilvallez (Member Author)

@BenjaminBossan if you want to find all conversions, simply check _build_checkpoint_conversion_mapping() - but that's not what the code you showed me is doing
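
For example, a minimal sketch of enumerating every registered conversion that way (the import path is my assumption):

from transformers.conversion_mapping import _build_checkpoint_conversion_mapping

full_mapping = _build_checkpoint_conversion_mapping()
# keys are the model types that have a checkpoint conversion registered
print(sorted(full_mapping))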

sirzechs66 pushed a commit to sirzechs66/transformers that referenced this pull request Apr 18, 2026
* fix

* oupsi typo

* missing dot

* why did it have a mapping?... hub weights are correct already

* no kw arg for replace...


Development

Successfully merging this pull request may close these issues.

[Regression] Qwen3.5 saved checkpoint is not correct with save_pretrained API since version 5.4.0

5 participants