Refactor Transformers backend to use mixins #26906
Conversation
Code Review
This pull request is a significant and well-executed refactoring of the Transformers backend. It modularizes the backend by introducing mixin classes for different functionalities like multi-modality, Mixture-of-Experts (MoE), and pooling. This greatly improves code organization, maintainability, and extensibility. The new structure, with a dedicated transformers package, is much cleaner. I've found one potential high-severity issue related to the torch.compile configuration for multimodal models that could lead to incorrect behavior or errors.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Thanks for the cleanup, just one comment.
Still running local testing, so no need to enable CI just yet.
This pull request has merge conflicts that must be resolved before it can be merged.
#24172 broke Qwen VL models for the Transformers backend because it relied on […]. We will forward-fix this.
It ended up not being too hard; I've fixed mrope for the Transformers backend in this PR.
I've switched to the same Gemma 3 checkpoint used in the multimodal classifier test from your PR, @DarkLight1337.
This PR refactors the Transformers backend to use mixin classes that add model-specific functionality such as multi-modality, mixture-of-experts (MoE), or pooling.
This enables us to compose these functionalities to create new Transformers backend classes that combine any of these features. The two new classes added by this PR are for multimodal embedding and multimodal sequence classification.
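A rough illustration of the composition pattern described above; the mixin and class names here are hypothetical and may not match the ones actually introduced in the PR:

```python
# Hypothetical sketch of composing backend classes from mixins.
# Class and mixin names are illustrative, not the PR's actual names.

class TransformersBase:
    """Base Transformers backend: wraps a Hugging Face model for vLLM."""


class MultiModalMixin:
    """Adds multimodal input handling (e.g. merging image embeddings)."""


class MoEMixin:
    """Adds Mixture-of-Experts support (expert weight loading and routing)."""


class PoolingMixin:
    """Adds pooling heads for embedding and sequence-classification tasks."""


# New backend classes are composed from just the pieces they need, for example
# the two classes this PR adds: multimodal embedding and multimodal classification.
class TransformersMultiModalEmbeddingModel(MultiModalMixin, PoolingMixin, TransformersBase):
    pass


class TransformersMultiModalForSequenceClassification(
    MultiModalMixin, PoolingMixin, TransformersBase
):
    pass
```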
It also adds `__getattr__` to `__init__.py` so that if a user requests a new combination of these functionalities, they are prompted to create an issue asking us to add it.
This PR enables #26715 by adding support for multimodal embedding.
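A minimal sketch of the module-level `__getattr__` guard described above, using PEP 562 module attribute lookup; the exported names and error message are assumptions, not the PR's actual values:

```python
# transformers/__init__.py (illustrative sketch; names and message are assumptions)

_EXPORTED = {
    "TransformersForCausalLM",
    "TransformersMultiModalForCausalLM",
    # ... other supported feature combinations ...
}


def __getattr__(name: str):
    # Module-level __getattr__ (PEP 562): called only when `name` is not
    # already defined in this module.
    if name.startswith("Transformers") and name not in _EXPORTED:
        raise AttributeError(
            f"{name} is not implemented yet. Please open an issue on vLLM "
            "requesting this combination of Transformers backend features."
        )
    raise AttributeError(f"module {__name__!r} has no attribute {name!r}")
```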
This PR also: