[Model]: get Aria to work with the latest transformers impl #12207
Conversation
Signed-off-by: xffxff <1247714429@qq.com>
👋 Hi! Thank you for contributing to the vLLM project. Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging. To run CI, PR reviewers can do one of these:
@Isotr0py Could you please take a look?
Can you update the example and test files as in #12203?
That PR lets CI pass but the model isn't working correctly, so I prefer merging yours if it works. Thanks for updating this!
@DarkLight1337 Sorry, I didn't notice that you had already worked on this. The model works in my local environment. I can update the tests and examples in this PR; I will work on it tonight or tomorrow.
I have tested your PR, and it seems that your model produces similar outputs to mine.
This isn't the expected output. (Note: I'm using TP=4 locally.) Taking Phi3V as an example, the expected output should be:
To avoid blocking CI, I'm going to merge #12203 first. Meanwhile we can use your PR to fix further issues with the model.
This pull request has merge conflicts that must be resolved before it can be merged.
Created an issue #12241 to track it.
Transformers 4.48 has integrated Aria (see huggingface/transformers#34157). We need to make some changes in vLLM to ensure compatibility, because the transformers implementation of Aria changes the weight names in checkpoints. For example, `multi_modal_projector.cross_attn.ln_kv.weight` has been renamed to `multi_modal_projector.cross_attn.layer_norm_kv.weight`.
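As a minimal sketch of what such a compatibility fix could look like, the old checkpoint keys could be rewritten to the new transformers 4.48 names while weights are streamed into the loader. The helper name `remap_aria_weights` and the substitution table are illustrative assumptions; only the `ln_kv` → `layer_norm_kv` pair is taken from this PR, and a real fix would more likely go through vLLM's weight-mapping utilities rather than a standalone function.

```python
from typing import Dict, Iterable, Iterator, Tuple

import torch

# Hypothetical substring renames. Only the ln_kv -> layer_norm_kv pair is
# confirmed by the PR description; any additional renames would need to be
# checked against the transformers 4.48 Aria implementation.
_ARIA_NAME_SUBSTITUTIONS: Dict[str, str] = {
    "multi_modal_projector.cross_attn.ln_kv.":
        "multi_modal_projector.cross_attn.layer_norm_kv.",
}


def remap_aria_weights(
    weights: Iterable[Tuple[str, torch.Tensor]],
) -> Iterator[Tuple[str, torch.Tensor]]:
    """Rename old-style Aria checkpoint keys to the transformers 4.48 names."""
    for name, tensor in weights:
        for old, new in _ARIA_NAME_SUBSTITUTIONS.items():
            if old in name:
                name = name.replace(old, new)
        yield name, tensor
```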
Also, we can remove some Aria-related configuration files from vLLM, since we can use the corresponding classes directly from transformers.
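For illustration, assuming transformers >= 4.48 exports the integrated Aria classes (the exact class names below are taken from the upstream integration and should be verified against the installed version), the vLLM-side copies could be replaced with direct imports:

```python
# Requires transformers >= 4.48, which ships the integrated Aria implementation
# from huggingface/transformers#34157. The model id is shown for illustration.
from transformers import AriaConfig, AriaProcessor

config = AriaConfig.from_pretrained("rhymes-ai/Aria")
processor = AriaProcessor.from_pretrained("rhymes-ai/Aria")
```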