
Conversation


@gante gante commented Apr 1, 2025

What does this PR do?

In v4.45, we set in motion the removal of the default GenerationMixin inheritance in PreTrainedModel, our base model class. You can recap the full list of reasons in the original PR; the TL;DR is that it removes circular dependencies and makes non-generative models more lightweight.

This PR is the final step: it removes the GenerationMixin inheritance. Note that this change is NOT breaking in most contexts:
✅ Loading generate-capable transformers models (#33203 added direct inheritance, #36180 added a meta test to ensure we always add generate tests to generate-capable models)
✅ Loading non-generate-capable models
✅ Loading generate-capable Hub models with AutoModelXXX (#33203 added the logic to ensure we add GenerationMixin even if the original model is missing it, as well as corresponding tests)
❌ Loading generate-capable Hub models directly, i.e. not using AutoModelXXX, if and only if the model doesn't inherit from GenerationMixin (= old implementation). In this case, an informative warning is thrown, suggesting to load the model with AutoModelXXX. This warning has been present since v4.45.
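The non-breaking paths above hinge on a capability check: a model is only treated as generate-capable if it opts in via the mixin. A minimal, self-contained sketch of that mechanism, using plain-Python stand-in classes (not the actual transformers implementation; all class names here are illustrative):

```python
class GenerationMixin:
    """Stand-in for transformers' GenerationMixin (hypothetical minimal version)."""

    def generate(self):
        return "generated!"


class PreTrainedModel:
    """Stand-in base class: after this PR it no longer inherits GenerationMixin."""

    @classmethod
    def can_generate(cls) -> bool:
        # A model is generate-capable only if it explicitly opts in via the mixin.
        return issubclass(cls, GenerationMixin)


class ExampleForSequenceClassification(PreTrainedModel):
    pass  # non-generative: stays lightweight, no generate() in its MRO


class ExampleForCausalLM(PreTrainedModel, GenerationMixin):
    pass  # generate-capable: inherits the mixin explicitly


assert not ExampleForSequenceClassification.can_generate()
assert ExampleForCausalLM.can_generate()
```

Non-generative classes never pull in generation machinery, which is the "more lightweight" benefit the PR description mentions.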

Relevant tests:

  • py.test tests/models -k test_generation_tester_mixin_inheritance
  • py.test tests/models/auto/ -k test_custom_model_patched_generation_inheritance
  • py.test tests/utils/test_modeling_utils.py::ModelUtilsTest::test_can_generate

After merging, let's keep an eye on issues. Although I think I've got the only breaking case well documented, Hub code is always a wildcard.
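The one breaking case is avoided when loading through AutoModelXXX because the auto classes patch in the mixin for old Hub code that lacks it. A hedged sketch of what such patching logic could look like (hypothetical helper and class names, not the real transformers API):

```python
class GenerationMixin:
    """Stand-in for transformers' GenerationMixin (hypothetical minimal version)."""

    def generate(self):
        return "generated!"


class PreTrainedModel:
    """Stand-in base class without default GenerationMixin inheritance."""

    @classmethod
    def can_generate(cls) -> bool:
        return issubclass(cls, GenerationMixin)


class OldHubModelForCausalLM(PreTrainedModel):
    """A Hub model written against the old API: defines generation hooks
    but does not explicitly inherit from GenerationMixin."""

    def prepare_inputs_for_generation(self, *args, **kwargs):
        return {}


def patch_generation_inheritance(model_cls):
    """Hypothetical helper mimicking the AutoModelXXX behavior: if a class
    defines generation hooks but lacks the mixin, build a patched subclass
    that adds GenerationMixin."""
    has_hook = hasattr(model_cls, "prepare_inputs_for_generation")
    if has_hook and not issubclass(model_cls, GenerationMixin):
        return type(model_cls.__name__, (model_cls, GenerationMixin), {})
    return model_cls


patched_cls = patch_generation_inheritance(OldHubModelForCausalLM)
assert not OldHubModelForCausalLM.can_generate()
assert patched_cls.can_generate()
```

Loading the same class directly, without the auto-class patching, leaves `can_generate()` false, which is exactly the case where the informative warning fires.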


github-actions bot commented Apr 1, 2025

Hi 👋, thank you for opening this pull request! The pull request is converted to draft by default. The CI will be paused while the PR is in draft mode. When it is ready for review, please click the Ready for review button (at the bottom of the PR page). This will assign reviewers and trigger CI.

@github-actions github-actions bot marked this pull request as draft April 1, 2025 14:40
@gante gante marked this pull request as ready for review April 1, 2025 14:49
@gante gante changed the title [core] remove GenerationMixin inheritance by default [core] remove GenerationMixin inheritance by default in PreTrainedModel Apr 1, 2025
@gante gante removed the request for review from Rocketknight1 April 1, 2025 14:53
@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

  SPEECHT5_START_DOCSTRING,
  )
- class SpeechT5ForSpeechToText(SpeechT5PreTrainedModel):
+ class SpeechT5ForSpeechToText(SpeechT5PreTrainedModel, GenerationMixin):
@gante (Contributor, Author) commented on this diff:

SpeechT5ForSpeechToText probably had some generate-compatibility issues in recent versions, since its prepare_inputs_for_generation was not being updated.

(Removing the global mixin triggered test failures.)

Collaborator

@ArthurZucker ArthurZucker left a comment


Haven't fully followed everything, but sounds good, yep 🤗 I remember we had some issues at the time; all stable now!

@gante gante merged commit 4321b06 into huggingface:main Apr 8, 2025
20 checks passed
@gante gante deleted the rm_generation_mixin_inheritance_by_default branch April 8, 2025 15:42
cyr0930 pushed a commit to cyr0930/transformers that referenced this pull request Apr 18, 2025
zucchini-nlp pushed a commit to zucchini-nlp/transformers that referenced this pull request May 14, 2025
