For VLLM_USE_PRECOMPILED, only compiled .so files should be extracted #21964
Code Review
This pull request refactors the logic for handling pre-compiled wheels to extract only the compiled .so files, preventing Python file pollution, while still making the original wheel available for subsequent steps in a Docker build. The changes look good and align with the stated objectives. The code is now cleaner and more robust in how it extracts files from the wheel. There is one point of feedback regarding a hardcoded architecture tag, which could impact future maintainability and portability.
setup.py
👋 Hi! Thank you for contributing to the vLLM project. 💬 Join our developer Slack at https://slack.vllm.ai to discuss your PR in #pr-reviews, coordinate on features in #feat- channels, or join special interest groups in #sig- channels. Just a reminder: PRs would not trigger full CI run by default. Instead, it would only run Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging. To run CI, PR reviewers can either: Add 🚀 |
9c85337 to 1223a5d
…tracted

Fixes up use of precompiled wheels so only compiled CUDA kernel `.so` files are extracted and used during the build. Prevents `.py` file pollution from the wheel, while still enabling downstream layers in the Dockerfile to install the wheel (for deps like `torch`).

Follow up to PR vllm-project#20943

Re-used prior extraction logic for `.so`s, and does so in such a manner that:
- Extracts `.so` files to `/workspace/dist/` in Docker context
- Copies `.whl` to `/workspace/dist/` for `uv pip install`
- Keeps behavior consistent for local development (extracts only `.so`s to `./vllm`)

Big thanks to Cyrus @DarkLight1337 for identifying this.
1223a5d to 3c7a2eb
This refactors the logic for handling `VLLM_USE_PRECOMPILED` to ensure that Docker builds extract only the required `.so` files and properly modify the `package_data` before `setup()`.

- Removes errant precompiled wheel copy that was copying old code, e.g. not code from the current checkout.
- Moves precompiled wheel extraction logic into a utility class
- Applies `package_data` patch before calling `setup()`
- Now skips `build_ext` when precompiled is enabled
- Supports fallback to nightly wheel if latest commit isn't available

Follow-up to PR vllm-project#21964 as part of improving build times for CI.

Signed-off-by: dougbtv <dosmith@redhat.com>
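The "applies `package_data` patch before calling `setup()`" step can be pictured with a minimal sketch. The helper name `add_extracted_files` and its arguments are hypothetical and are not taken from the actual `setup.py`; it only shows the general idea of registering extracted files under a package key before the dictionary is handed to `setup()`.

```python
def add_extracted_files(package_data, package, relative_paths):
    """Return a copy of package_data with relative_paths listed under package.

    Hypothetical helper for illustration; not the function used in vLLM.
    """
    # Copy so the caller's dictionary and lists are left untouched.
    patched = {name: list(patterns) for name, patterns in package_data.items()}
    entries = patched.setdefault(package, [])
    for path in relative_paths:
        if path not in entries:  # avoid duplicate entries
            entries.append(path)
    return patched
```

The patched dictionary would then be passed as `package_data=` to `setup()`, so the extracted binaries are picked up as package files rather than rebuilt.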
…xtracted (vllm-project#21964)" This reverts commit b9b753e.
…vllm-project#21964) Signed-off-by: x22x22 <wadeking@qq.com>
…vllm-project#21964) Signed-off-by: Jinzhen Lin <linjinzhen@hotmail.com>
…vllm-project#21964) Signed-off-by: Noam Gat <noamgat@gmail.com>
…vllm-project#21964) Signed-off-by: Paul Pak <paulpak58@gmail.com>
…vllm-project#21964) Signed-off-by: Diego-Castan <diego.castan@ibm.com>
Fixes up use of precompiled wheels so only compiled CUDA kernel `.so` files are extracted and used during the build. Prevents `.py` file pollution from the wheel, while still enabling downstream layers in the Dockerfile to install the wheel (for deps like `torch`).

Follow up to PR #20943

Re-used prior extraction logic for `.so`s, and does so in such a manner that:
- Extracts `.so` files to `/workspace/dist/` in Docker context
- Copies `.whl` to `/workspace/dist/` for `uv pip install`
- Keeps behavior consistent for local development (extracts only `.so`s to `./vllm`)

Big thanks to Cyrus @DarkLight1337 for identifying this.
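For readers unfamiliar with the approach, here is a minimal sketch of extracting only `.so` members from a wheel. It assumes a wheel is an ordinary zip archive and uses only the standard library; the function name `extract_shared_objects` is hypothetical and this is not the code in this PR's `setup.py`.

```python
import zipfile
from pathlib import Path


def extract_shared_objects(wheel_path, dest_dir):
    """Extract only compiled .so members from a wheel, skipping everything else.

    Hypothetical helper for illustration; returns the extracted member names.
    """
    dest = Path(dest_dir)
    extracted = []
    with zipfile.ZipFile(wheel_path) as wheel:
        for member in wheel.namelist():
            # Skip .py files, metadata, and anything that is not a shared object.
            if not member.endswith(".so"):
                continue
            wheel.extract(member, dest)
            extracted.append(member)
    return sorted(extracted)
```

Because `.py` members are never written to disk, Python sources from the wheel cannot overwrite the sources of the current checkout.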
Purpose
Fix how CI builds that use pre-compiled wheels work; without this change they are not reliable.
Example, from Cyrus:
Test Plan
Test Result
(Optional) Documentation Update