[Perf] Further optimization for Qwen3-VL fast_pos_embed_interpolate
#25347
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
Code Review
This pull request introduces a minor but effective optimization for the fast_pos_embed_interpolate function in Qwen3-VL's vision transformer. The changes involve refactoring the weight and index calculations for bilinear interpolation. By using torch.meshgrid, the code becomes more readable and vectorized. Furthermore, an algebraic simplification in the weight calculation reduces the number of multiplications, leading to a small but measurable performance improvement as shown in the provided benchmarks. The changes are correct and well-implemented. I have no major concerns.
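The algebraic simplification mentioned in the review can be illustrated with a scalar sketch. The function names below are hypothetical and the code is not taken from the PR; it only shows how the four bilinear corner weights can be derived from a single product instead of four.

```python
import math


def naive_weights(dh: float, dw: float):
    """Four bilinear corner weights, each computed with its own multiply."""
    w00 = (1 - dh) * (1 - dw)
    w01 = (1 - dh) * dw
    w10 = dh * (1 - dw)
    w11 = dh * dw
    return w00, w01, w10, w11


def simplified_weights(dh: float, dw: float):
    """Same weights using one multiply and a few subtractions."""
    w11 = dh * dw          # the only multiplication
    w10 = dh - w11         # dh * (1 - dw)
    w01 = dw - w11         # (1 - dh) * dw
    w00 = 1 - dh - w01     # (1 - dh) * (1 - dw)
    return w00, w01, w10, w11


# Compare both variants at one fractional offset.
print(naive_weights(0.25, 0.75))
print(simplified_weights(0.25, 0.75))
```

On tensors, the same rewrite would replace three elementwise multiplies with subtractions, which is presumably where the small speedup reported in the review comes from.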
…vllm-project#25347) Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
…vllm-project#25347) Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn> Signed-off-by: charlifu <charlifu@amd.com>
…#25347) Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn> Signed-off-by: yewentao256 <zhyanwentao@126.com>
…vllm-project#25347) Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn> Signed-off-by: xuebwang-amd <xuebwang@amd.com>
Purpose
- Follow-up to #25337 (fast_pos_embed_interpolate).
- Further optimize fast_pos_embed_interpolate by reducing duplicated multiplies.
- Compute weights and indices with torch.meshgrid vectorization.
Test Plan
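One way to sanity-check a meshgrid-based computation of this kind is sketched below. It is not the PR's actual test or implementation: the function name, argument names, and the row-major `(n * n,)` table layout are assumptions made for illustration. The sketch builds the four corner indices and weights with `torch.meshgrid`, then interpolates a linear function, which bilinear interpolation should reproduce exactly.

```python
import torch


def grid_indices_and_weights(n: int, h: int, w: int):
    """Bilinear corner indices and weights for resizing an n x n grid to h x w.

    Hypothetical helper: returns two tensors of shape (4, h * w).
    """
    # Fractional source coordinates for each target row and column.
    h_idxs = torch.linspace(0, n - 1, h)
    w_idxs = torch.linspace(0, n - 1, w)

    h_floor = h_idxs.to(torch.long)
    w_floor = w_idxs.to(torch.long)
    h_ceil = torch.clamp(h_floor + 1, max=n - 1)
    w_ceil = torch.clamp(w_floor + 1, max=n - 1)

    dh = h_idxs - h_floor
    dw = w_idxs - w_floor

    # Expand the 1-D row/column quantities to (h, w) grids.
    dh_grid, dw_grid = torch.meshgrid(dh, dw, indexing="ij")
    hf, wf = torch.meshgrid(h_floor, w_floor, indexing="ij")
    hc, wc = torch.meshgrid(h_ceil, w_ceil, indexing="ij")

    # Flattened row-major indices of the four neighbouring grid points.
    indices = torch.stack(
        [hf * n + wf, hf * n + wc, hc * n + wf, hc * n + wc]
    ).reshape(4, -1)

    # Corner weights derived from a single elementwise product.
    w11 = dh_grid * dw_grid
    w10 = dh_grid - w11
    w01 = dw_grid - w11
    w00 = 1 - dh_grid - w01
    weights = torch.stack([w00, w01, w10, w11]).reshape(4, -1)
    return indices, weights


# Usage: interpolate f(r, c) = r * n + c stored in a flat table.
n, h, w = 4, 3, 5
indices, weights = grid_indices_and_weights(n, h, w)
table = torch.arange(n * n, dtype=torch.float32)
interpolated = (table[indices] * weights).sum(dim=0).reshape(h, w)
print(interpolated)
```

Because the table holds a linear function of the grid coordinates, the interpolated values should match `h_idxs[:, None] * n + w_idxs[None, :]` up to floating-point error, which makes for a cheap equivalence check.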
Test Result
Main branch
PR