[core] remove GenerationMixin inheritance by default in PreTrainedModel
#37173
Conversation
```diff
     SPEECHT5_START_DOCSTRING,
 )
-class SpeechT5ForSpeechToText(SpeechT5PreTrainedModel):
+class SpeechT5ForSpeechToText(SpeechT5PreTrainedModel, GenerationMixin):
```
`SpeechT5ForSpeechToText` probably had some `generate` compatibility issues in recent versions, since its `prepare_inputs_for_generation` was not being updated (removing the global mixin surfaced test failures here).
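To illustrate the failure mode in a stand-alone way, here is a hedged sketch (the classes below are hypothetical stand-ins, not the actual `transformers` code): a model that carries its own stale override of `prepare_inputs_for_generation` stops receiving improvements made to the shared default.

```python
# Hedged sketch of why a stale prepare_inputs_for_generation matters.
# Stand-in classes, NOT the actual transformers implementation.

class GenerationMixin:
    def prepare_inputs_for_generation(self, input_ids, **kwargs):
        # The shared default evolves across releases (e.g. new kwargs,
        # cache handling); models that use it stay up to date for free.
        return {"input_ids": input_ids, **kwargs}

class SpeechToTextModel(GenerationMixin):
    # A model with its own older override no longer benefits from
    # fixes to the shared default above.
    def prepare_inputs_for_generation(self, input_ids, **kwargs):
        return {"input_ids": input_ids}

m = SpeechToTextModel()
print(m.prepare_inputs_for_generation([1, 2], attention_mask=[1, 1]))
# → {'input_ids': [1, 2]}  (attention_mask is silently dropped)
```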
ArthurZucker
left a comment
Have not fully followed but sounds good yep 🤗 remember we had some issues at the time, all stable now!
What does this PR do?
In v4.45, we set in motion the removal of `GenerationMixin` inheritance by default in `PreTrainedModel`, our base model class. You can recap the full list of reasons in the original PR, but the TL;DR is that it removes circular dependencies and makes non-generative models more lightweight.

This PR is the final step: it removes the `GenerationMixin` inheritance. Note that this change is NOT breaking in most contexts:

- ✅ Loading `generate`-capable `transformers` models (#33203 added direct inheritance, #36180 added a meta test to ensure we always add `generate` tests to `generate`-capable models)
- ✅ Loading non-`generate`-capable models
- ✅ Loading `generate`-capable Hub models with `AutoModelXXX` (#33203 added the logic to ensure we add `GenerationMixin` even if the original model is missing it, as well as corresponding tests)
- ❌ Loading `generate`-capable Hub models directly, i.e. not using `AutoModelXXX`, if and only if the model doesn't inherit from `GenerationMixin` (= old implementation). In this case, an informative warning is thrown, suggesting to load the model with `AutoModelXXX`. This warning has been present since v4.45.

Relevant tests:
```
py.test tests/models -k test_generation_tester_mixin_inheritance
py.test tests/models/auto/ -k test_custom_model_patched_generation_inheritance
py.test tests/utils/test_modeling_utils.py::ModelUtilsTest::test_can_generate
```

After merging, let's keep an eye on issues. Although I think I've got the only breaking case well documented, Hub code is always a wildcard.
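As a minimal, library-free illustration of what the change means for model authors (the classes below are hypothetical stand-ins, not the real `transformers` implementation): after this PR, a model opts in to `generate` support by inheriting `GenerationMixin` explicitly, and a `can_generate`-style check distinguishes the two cases.

```python
# Hedged sketch of the new opt-in pattern. These classes are
# stand-ins, NOT the actual transformers implementation.

class GenerationMixin:
    """Stand-in for transformers' GenerationMixin."""
    def generate(self):
        return "generated output"

class PreTrainedModel:
    """Stand-in base class: after this PR it no longer inherits GenerationMixin."""
    @classmethod
    def can_generate(cls):
        # transformers uses a check along these lines to decide whether
        # to warn when a Hub model is missing the explicit inheritance
        return issubclass(cls, GenerationMixin)

class NonGenerativeModel(PreTrainedModel):
    """e.g. an encoder-only classifier: stays lightweight, no generate()."""

class MyModelForCausalLM(PreTrainedModel, GenerationMixin):
    """generate-capable models now inherit the mixin explicitly."""

print(NonGenerativeModel.can_generate())  # → False
print(MyModelForCausalLM.can_generate())  # → True
```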