[core] remove GenerationMixin inheritance by default in PreTrainedModel
#37173
Conversation
```diff
     SPEECHT5_START_DOCSTRING,
 )
-class SpeechT5ForSpeechToText(SpeechT5PreTrainedModel):
+class SpeechT5ForSpeechToText(SpeechT5PreTrainedModel, GenerationMixin):
```
`SpeechT5ForSpeechToText` probably had some `generate` compatibility issues in recent versions, since its `prepare_inputs_for_generation` was not being updated (removing the global mixin surfaced test failures here).
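To illustrate the failure mode in a stand-alone way, here is a hedged sketch (the classes below are hypothetical stand-ins, not the actual `transformers` code): a model that carries its own stale override of `prepare_inputs_for_generation` stops receiving improvements made to the shared default.

```python
# Hedged sketch of why a stale prepare_inputs_for_generation matters.
# Stand-in classes, NOT the actual transformers implementation.

class GenerationMixin:
    def prepare_inputs_for_generation(self, input_ids, **kwargs):
        # The shared default evolves across releases (e.g. new kwargs,
        # cache handling); models that use it stay up to date for free.
        return {"input_ids": input_ids, **kwargs}

class SpeechToTextModel(GenerationMixin):
    # A model with its own older override no longer benefits from
    # fixes to the shared default above.
    def prepare_inputs_for_generation(self, input_ids, **kwargs):
        return {"input_ids": input_ids}

m = SpeechToTextModel()
print(m.prepare_inputs_for_generation([1, 2], attention_mask=[1, 1]))
# → {'input_ids': [1, 2]}  (attention_mask is silently dropped)
```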
ArthurZucker
left a comment
Have not fully followed but sounds good yep 🤗 remember we had some issues at the time, all stable now!
What does this PR do?
In v4.45, we set in motion the removal of `GenerationMixin` inheritance by default in `PreTrainedModel`, our base model class. You can recap the full list of reasons in the original PR, but the TL;DR is that it removes circular dependencies and makes non-generative models more lightweight.

This PR is the final step: it removes the `GenerationMixin` inheritance. Note that this change is NOT breaking in most contexts:

- ✅ Loading `generate`-capable `transformers` models (#33203 added direct inheritance, #36180 added a meta test to ensure we always add `generate` tests to `generate`-capable models)
- ✅ Loading non-`generate`-capable models
- ✅ Loading `generate`-capable Hub models with `AutoModelXXX` (#33203 added the logic to ensure we add `GenerationMixin` even if the original model is missing it, as well as corresponding tests)
- ❌ Loading `generate`-capable Hub models directly, i.e. not using `AutoModelXXX`, if and only if the model doesn't inherit from `GenerationMixin` (= old implementation). In this case, an informative warning is thrown, suggesting to load the model with `AutoModelXXX`. This warning has been present since v4.45.

Relevant tests:
```
py.test tests/models -k test_generation_tester_mixin_inheritance
py.test tests/models/auto/ -k test_custom_model_patched_generation_inheritance
py.test tests/utils/test_modeling_utils.py::ModelUtilsTest::test_can_generate
```

After merging, let's keep an eye on issues. Although I think I've got the only breaking case well documented, Hub code is always a wildcard.
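As a minimal, library-free illustration of what the change means for model authors (the classes below are hypothetical stand-ins, not the real `transformers` implementation): after this PR, a model opts in to `generate` support by inheriting `GenerationMixin` explicitly, and a `can_generate`-style check distinguishes the two cases.

```python
# Hedged sketch of the new opt-in pattern. These classes are
# stand-ins, NOT the actual transformers implementation.

class GenerationMixin:
    """Stand-in for transformers' GenerationMixin."""
    def generate(self):
        return "generated output"

class PreTrainedModel:
    """Stand-in base class: after this PR it no longer inherits GenerationMixin."""
    @classmethod
    def can_generate(cls):
        # transformers uses a check along these lines to decide whether
        # to warn when a Hub model is missing the explicit inheritance
        return issubclass(cls, GenerationMixin)

class NonGenerativeModel(PreTrainedModel):
    """e.g. an encoder-only classifier: stays lightweight, no generate()."""

class MyModelForCausalLM(PreTrainedModel, GenerationMixin):
    """generate-capable models now inherit the mixin explicitly."""

print(NonGenerativeModel.can_generate())  # → False
print(MyModelForCausalLM.can_generate())  # → True
```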