
Fix conversion mappings for vlms #45340

Merged
Cyrilvallez merged 5 commits into main from fix-mappings on Apr 9, 2026

Conversation

@Cyrilvallez (Member) commented on Apr 9, 2026

What does this PR do?

Supersedes #45314 with a better fix.
Fixes #45216, #45310 and #45313

cc @zucchini-nlp

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@Cyrilvallez Cyrilvallez merged commit 0a66862 into main Apr 9, 2026
29 checks passed
@Cyrilvallez Cyrilvallez deleted the fix-mappings branch April 9, 2026 13:17
Cyrilvallez added a commit that referenced this pull request Apr 9, 2026
* fix

* oupsi typo

* missing dot

* why did it have a mapping?... hub weights are correct already

* no kw arg for replace...
@BenjaminBossan (Member)

@Cyrilvallez This PR breaks the weight conversion with PEFT. Unfortunately, this was not caught by CI because the corresponding test is gated against PEFT v0.19.0, which is not out yet. However, when commenting out these two lines:

if version.parse(importlib.metadata.version("peft")) < version.parse("0.19.0"):
    self.skipTest("For this test to pass, PEFT 0.19 is required.")

and then running:

RUN_SLOW=1 pytest tests/peft_integration/test_peft_integration.py -k test_mixtral_lora_conversion

we get:

ValueError: Target module MixtralTopKRouter() is not supported. Currently, only the following modules are supported: torch.nn.Linear, torch.nn.Embedding, torch.nn.Conv1d, torch.nn.Conv2d, torch.nn.Conv3d, transformers.pytorch_utils.Conv1D, torch.nn.MultiheadAttention..

I tried understanding where that comes from based on the diff but I couldn't figure it out. Do you know what could be amiss here?

@githubnemo (Contributor)

Hey @Cyrilvallez :)

I think that removing "mixtral" from the conversion mapping is the culprit. While mixtral is the reference implementation and 'up-to-date', this doesn't mean that old checkpoints don't need conversion.

@BenjaminBossan already confirmed offline that re-introducing this to the mapping resolves the issue.

@Cyrilvallez (Member Author)

Hey @BenjaminBossan @githubnemo! Are you checking the internal variable _MODEL_TO_CONVERSION_PATTERN directly? 🤔🤔 That dict does not contain mappings for all models; you should check the full mapping, i.e. the output of _build_checkpoint_conversion_mapping()!
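
For illustration, a minimal sketch of the difference (the import path transformers.conversion_mapping is my assumption; adjust it to wherever these actually live):

from transformers.conversion_mapping import (
    _MODEL_TO_CONVERSION_PATTERN,          # internal dict: only models with an explicit pattern entry
    _build_checkpoint_conversion_mapping,  # builds the full mapping across all models
)

# mixtral may be absent from the internal dict after this PR,
# while the fully built mapping still covers its conversion
print("mixtral" in _MODEL_TO_CONVERSION_PATTERN)
print("mixtral" in _build_checkpoint_conversion_mapping())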

@BenjaminBossan (Member)

Yes, here:

https://github.com/huggingface/peft/blob/98465930f7c9666ff952f4c67893620a9ef1e2c3/src/peft/utils/transformers_weight_conversion.py#L351

So we should replace that with

base_model_type = _build_checkpoint_conversion_mapping().get(model_type, None)

?

@Cyrilvallez (Member Author)

@BenjaminBossan Looking at your code, I think what you wanted to do is find all models with a mapping similar to mixtral's? If so, you can keep checking _MODEL_TO_CONVERSION_PATTERN I guess, but then also add mixtral explicitly in your code - we don't want to awkwardly add mixtral: mixtral to our internal mapping 😅
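
A hedged sketch of what that PEFT-side check could look like (the helper name is hypothetical; only _MODEL_TO_CONVERSION_PATTERN comes from the discussion):

def model_needs_conversion(model_type: str) -> bool:
    # mixtral is the reference implementation, so it carries no entry in the
    # internal pattern dict, yet old mixtral checkpoints still need conversion
    if model_type == "mixtral":
        return True
    return model_type in _MODEL_TO_CONVERSION_PATTERN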

@BenjaminBossan (Member)

The intent is to identify those models that require a conversion, as the PEFT config must be updated accordingly (e.g. target module foo is now bar).
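
As a hedged illustration of that kind of update ("foo" and "bar" stand in for real module names, mirroring the example above):

from peft import LoraConfig

# config written against the old transformers module layout
old_config = LoraConfig(target_modules=["foo"])
# after the weight conversion renames the module, the config must follow
new_config = LoraConfig(target_modules=["bar"])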

"find all models with a mapping similar to mixtral's"

I'm not quite sure what "similar" means here. Do you mean the same model_type?

@Cyrilvallez (Member Author)

@BenjaminBossan if you want to find all conversions, simply check _build_checkpoint_conversion_mapping() - but that's not what the code you showed me is doing
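
For example, a minimal sketch of enumerating every registered conversion that way (the import path is my assumption):

from transformers.conversion_mapping import _build_checkpoint_conversion_mapping

full_mapping = _build_checkpoint_conversion_mapping()
# keys are the model types that have a checkpoint conversion registered
print(sorted(full_mapping))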

sirzechs66 pushed a commit to sirzechs66/transformers that referenced this pull request Apr 18, 2026
* fix

* oupsi typo

* missing dot

* why did it have a mapping?... hub weights are correct already

* no kw arg for replace...


Development

Successfully merging this pull request may close these issues.

[Regression] Qwen3.5 saved checkpoint is not correct with save_pretrained API since version 5.4.0

5 participants