Allow any vLLM engine args as env vars, Update vLLM, refactor #82

Merged
merged 9 commits into main from any-arg-and-refactor on Aug 1, 2024

Conversation

alpayariyak
Contributor

This simplifies the update process and makes the code cleaner: instead of hardcoding engine arguments, their defaults, and env vars, env vars are now matched directly to the keys of vLLM's AsyncEngineArgs.
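For illustration, a minimal sketch of that idea, assuming vLLM's `AsyncEngineArgs` dataclass from `vllm.engine.arg_utils`; the helper name `engine_args_from_env` and the naive type casting are hypothetical, not this PR's actual code:

```python
import os
from dataclasses import MISSING, fields

from vllm.engine.arg_utils import AsyncEngineArgs


def engine_args_from_env() -> AsyncEngineArgs:
    """Build AsyncEngineArgs by matching env vars to field names.

    Hypothetical sketch: an env var such as TENSOR_PARALLEL_SIZE=2 maps
    onto the tensor_parallel_size field; anything left unset keeps
    vLLM's own default.
    """
    kwargs = {}
    for f in fields(AsyncEngineArgs):
        raw = os.getenv(f.name.upper())
        if raw is None:
            continue
        # Naive casting based on each field's default value; real code
        # would need extra handling for Optional, list, and dict fields.
        default = f.default if f.default is not MISSING else None
        if isinstance(default, bool):  # bool before int: bool is an int subclass
            kwargs[f.name] = raw.lower() in ("1", "true", "yes")
        elif isinstance(default, int):
            kwargs[f.name] = int(raw)
        elif isinstance(default, float):
            kwargs[f.name] = float(raw)
        else:
            kwargs[f.name] = raw
    return AsyncEngineArgs(**kwargs)
```

With a mapping along these lines, setting e.g. `TENSOR_PARALLEL_SIZE=2` or `MAX_MODEL_LEN=8192` on the worker flows straight through to the corresponding engine arg, and new vLLM releases expose their new args without any worker changes.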

@nerdylive123

Yeah, nice! Using a mapping is much cleaner. Also, can you update the vLLM base image? They have updated it 😊

@alpayariyak changed the title from "Allow any vLLM engine args as env vars, refactor" to "Allow any vLLM engine args as env vars, Update vLLM, refactor" on Jul 25, 2024
@alpayariyak
Contributor Author

alpayariyak commented Jul 25, 2024

I've started a new position, so @pandyamarut will be taking over. Left to do:

  • update vLLM 0.5.1 -> 0.5.3 and ensure everything works
  • finalize tensorizer usage (small bugs left)
  • update the documentation to include the 40+ newly available args
  • update the UI input form

@alpayariyak marked this pull request as draft on July 25, 2024 19:45
Signed-off-by: pandyamarut <pandyamarut@gmail.com>
@TimPietrusky

@alpayariyak thank you!

@pandyamarut welcome!

Signed-off-by: pandyamarut <pandyamarut@gmail.com>
Signed-off-by: pandyamarut <pandyamarut@gmail.com>
@TimPietrusky

TimPietrusky commented Jul 29, 2024

@pandyamarut Update: it worked after simply re-running the command, so you can disregard my message below :D

Previous message

I tried to build this image locally based on this branch, but got this error:

884.7 Collecting flashinfer
885.7   Downloading https://github.com/flashinfer-ai/flashinfer/releases/download/v0.1.1/flashinfer-0.1.1%2Bcu121torch2.3-cp310-cp310-linux_x86_64.whl (1262.5 MB)
1034.9      ━━━━━━━━━━━━━━━━━━━━━╸                   0.7/1.3 GB 4.1 MB/s eta 0:02:23
1035.5 ERROR: THESE PACKAGES DO NOT MATCH THE HASHES FROM THE REQUIREMENTS FILE. If you have updated the package versions, please update the hashes. Otherwise, examine the package contents carefully; someone may have tampered with them.
1035.5     flashinfer from https://github.com/flashinfer-ai/flashinfer/releases/download/v0.1.1/flashinfer-0.1.1%2Bcu121torch2.3-cp310-cp310-linux_x86_64.whl#sha256=90da45996eefaf82ff77a53c8dcb813415cddbcfc9981e20fcbc33660e019429:
1035.5         Expected sha256 90da45996eefaf82ff77a53c8dcb813415cddbcfc9981e20fcbc33660e019429
1035.5              Got        8e74fee1baf4e9e896c479f85844d31c2ee251fa4cf0d5b778a7d9aac9fca8c5
1035.5
------
Dockerfile:15
--------------------
  14 |     # Install vLLM (switching back to pip installs since issues that required building fork are fixed and space optimization is not as important since caching) and FlashInfer
  15 | >>> RUN python3 -m pip install vllm==0.5.3.post1 && \
  16 | >>>     python3 -m pip install flashinfer -i https://flashinfer.ai/whl/cu121/torch2.3
  17 |
--------------------
ERROR: failed to solve: process "/bin/sh -c python3 -m pip install vllm==0.5.3.post1 &&     python3 -m pip install flashinfer -i https://flashinfer.ai/whl/cu121/torch2.3" did not complete successfully: exit code: 1

I was using this command:

docker build -t runpod/worker-vllm:dev  --platform linux/amd64 .

I'm running this on Windows, but I'm not sure if this should affect anything.
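(For anyone who hits the same hash mismatch: on a wheel this large it usually indicates a truncated download rather than actual tampering, which is consistent with the retry succeeding. If a plain retry doesn't help, a cache-busting rebuild is a reasonable next step; `--no-cache` is a standard `docker build` flag, nothing specific to this repo:

```bash
# Rebuild without reusing cached layers, forcing pip to re-download the wheel
docker build --no-cache -t runpod/worker-vllm:dev --platform linux/amd64 .
```
)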

Signed-off-by: pandyamarut <pandyamarut@gmail.com>
@pandyamarut marked this pull request as ready for review on August 1, 2024 22:23
@pandyamarut merged commit 8a010c3 into main on Aug 1, 2024
2 checks passed
@pandyamarut deleted the any-arg-and-refactor branch on August 1, 2024 22:28