[CI][Installation] Avoid uploading CUDA 11.8 wheel #10535
👋 Hi! Thank you for contributing to the vLLM project. Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging. To run CI, PR reviewers can do one of these:
🚀 |
Good catch!
Please don't merge yet. |
The cu11 wheel should have the right suffix in the wheel name. Something must have changed... |
Signed-off-by: simon-mo <simon.mo@hey.com>
testing my changes here https://buildkite.com/vllm/release/builds/1929 |
seems to work.
https://buildkite.com/vllm/release/builds/1929#0193506f-2078-4ddc-954d-ec3a9df8134c/120-3878 |
Just out of curiosity, is there a public link to access the cu11.8 wheel? |
Signed-off-by: simon-mo <simon.mo@hey.com> Co-authored-by: simon-mo <simon.mo@hey.com> Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com>
Signed-off-by: simon-mo <simon.mo@hey.com> Co-authored-by: simon-mo <simon.mo@hey.com> Signed-off-by: Maxime Fournioux <55544262+mfournioux@users.noreply.github.com>
@simon-mo How can we get the wheel for CUDA 11.8 now? |
Signed-off-by: simon-mo <simon.mo@hey.com> Co-authored-by: simon-mo <simon.mo@hey.com>
I found that the latest wheel at
https://vllm-wheels.s3.us-west-2.amazonaws.com/nightly/vllm-1.0.0.dev-cp38-abi3-manylinux1_x86_64.whl
is built with CUDA 11.8, which is inconsistent with the documentation: https://docs.vllm.ai/en/latest/getting_started/installation.html#install-the-latest-code
I think this is due to https://github.com/vllm-project/vllm/blob/main/.buildkite/upload-wheels.sh#L36-L38. Both the CUDA 11.8 and CUDA 12.1 release pipelines use this script to upload the wheel to S3, so the pipeline that completes later overwrites the previous upload. Whether the published wheel is built with CUDA 11.8 or CUDA 12.1 is therefore effectively random, depending on which build finishes last.
Fix it by not uploading the CUDA 11.8 wheel.
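The overwrite and the proposed guard can be illustrated with a small simulation. This is only a sketch, not the contents of `upload-wheels.sh`: the dictionary stands in for the S3 bucket, the function names and the two wheel file names are hypothetical, and the `cu118` marker in the file name is an assumption based on the comment above that the cu11 wheel carries a suffix in its name.

```python
# Fixed destination key that both release pipelines write to
# (path taken from the nightly wheel URL quoted above).
NIGHTLY_KEY = "nightly/vllm-1.0.0.dev-cp38-abi3-manylinux1_x86_64.whl"

# Hypothetical wheel file names; the "+cu118" suffix is an assumption.
CU121_WHEEL = "vllm-0.0.0-cp38-abi3-manylinux1_x86_64.whl"
CU118_WHEEL = "vllm-0.0.0+cu118-cp38-abi3-manylinux1_x86_64.whl"


def upload_nightly(bucket: dict, wheel_name: str, skip_cu118: bool) -> bool:
    """Copy a wheel to the shared nightly key; return True if it was uploaded."""
    if skip_cu118 and "cu118" in wheel_name:
        # Guard: the CUDA 11.8 build never touches the shared key.
        return False
    # Same key for every build, so a later upload replaces an earlier one.
    bucket[NIGHTLY_KEY] = wheel_name
    return True


def simulate(finish_order: list, skip_cu118: bool):
    """Run the uploads in the order the pipelines finish; return the surviving wheel."""
    bucket = {}
    for wheel in finish_order:
        upload_nightly(bucket, wheel, skip_cu118)
    return bucket.get(NIGHTLY_KEY)


if __name__ == "__main__":
    for order in ([CU118_WHEEL, CU121_WHEEL], [CU121_WHEEL, CU118_WHEEL]):
        print("no guard:", simulate(order, skip_cu118=False))
        print("guarded: ", simulate(order, skip_cu118=True))
```

Without the guard, the surviving wheel is whichever build finished last. With the guard, the nightly key always holds the CUDA 12.1 wheel regardless of finish order.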
cc @simon-mo