Allow any vLLM engine args as env vars, Update vLLM, refactor #82

Merged
merged 9 commits into main from any-arg-and-refactor on Aug 1, 2024

Conversation

alpayariyak
Contributor

This simplifies the update process and makes the code cleaner: instead of hardcoding engine arguments, their defaults, and env vars, env vars are now matched directly to the keys of vLLM's AsyncEngineArgs.
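For illustration, a minimal sketch of that idea, assuming vLLM's `AsyncEngineArgs` dataclass from `vllm.engine.arg_utils`; the helper name `engine_args_from_env` and the naive type casting are hypothetical, not this PR's actual code:

```python
import os
from dataclasses import MISSING, fields

from vllm.engine.arg_utils import AsyncEngineArgs


def engine_args_from_env() -> AsyncEngineArgs:
    """Build AsyncEngineArgs by matching env vars to field names.

    Hypothetical sketch: an env var such as TENSOR_PARALLEL_SIZE=2 maps
    onto the tensor_parallel_size field; anything left unset keeps
    vLLM's own default.
    """
    kwargs = {}
    for f in fields(AsyncEngineArgs):
        raw = os.getenv(f.name.upper())
        if raw is None:
            continue
        # Naive casting based on each field's default value; real code
        # would need extra handling for Optional, list, and dict fields.
        default = f.default if f.default is not MISSING else None
        if isinstance(default, bool):  # bool before int: bool is an int subclass
            kwargs[f.name] = raw.lower() in ("1", "true", "yes")
        elif isinstance(default, int):
            kwargs[f.name] = int(raw)
        elif isinstance(default, float):
            kwargs[f.name] = float(raw)
        else:
            kwargs[f.name] = raw
    return AsyncEngineArgs(**kwargs)
```

With a mapping along these lines, setting e.g. `TENSOR_PARALLEL_SIZE=2` or `MAX_MODEL_LEN=8192` on the worker flows straight through to the corresponding engine arg, and new vLLM releases expose their new args without any worker changes.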

@nerdylive123

Yeah, nice! Using a mapping is much cleaner. Also, can you update the vLLM base image? They have updated it 😊

@alpayariyak changed the title from "Allow any vLLM engine args as env vars, refactor" to "Allow any vLLM engine args as env vars, Update vLLM, refactor" on Jul 25, 2024
@alpayariyak
Contributor Author

alpayariyak commented Jul 25, 2024

I've started a new position, so @pandyamarut will be taking over. Left to do:

  • update vLLM 0.5.1 -> 0.5.3 and ensure everything works
  • finalize tensorizer usage (small bugs left)
  • update the documentation to include the 40+ newly available args
  • update the UI input form

@alpayariyak marked this pull request as draft on July 25, 2024 19:45
Signed-off-by: pandyamarut <pandyamarut@gmail.com>
@TimPietrusky

@alpayariyak thank you!

@pandyamarut welcome!

Signed-off-by: pandyamarut <pandyamarut@gmail.com>
Signed-off-by: pandyamarut <pandyamarut@gmail.com>
@TimPietrusky

TimPietrusky commented Jul 29, 2024

@pandyamarut Update: it worked after simply re-running the command, so you can disregard my message below :D

Previous message

I tried to build this image locally based on this branch, but got this error:

884.7 Collecting flashinfer
885.7   Downloading https://github.com/flashinfer-ai/flashinfer/releases/download/v0.1.1/flashinfer-0.1.1%2Bcu121torch2.3-cp310-cp310-linux_x86_64.whl (1262.5 MB)
1034.9      ━━━━━━━━━━━━━━━━━━━━━╸                   0.7/1.3 GB 4.1 MB/s eta 0:02:23
1035.5 ERROR: THESE PACKAGES DO NOT MATCH THE HASHES FROM THE REQUIREMENTS FILE. If you have updated the package versions, please update the hashes. Otherwise, examine the package contents carefully; someone may have tampered with them.
1035.5     flashinfer from https://github.com/flashinfer-ai/flashinfer/releases/download/v0.1.1/flashinfer-0.1.1%2Bcu121torch2.3-cp310-cp310-linux_x86_64.whl#sha256=90da45996eefaf82ff77a53c8dcb813415cddbcfc9981e20fcbc33660e019429:
1035.5         Expected sha256 90da45996eefaf82ff77a53c8dcb813415cddbcfc9981e20fcbc33660e019429
1035.5              Got        8e74fee1baf4e9e896c479f85844d31c2ee251fa4cf0d5b778a7d9aac9fca8c5
1035.5
------
Dockerfile:15
--------------------
  14 |     # Install vLLM (switching back to pip installs since issues that required building fork are fixed and space optimization is not as important since caching) and FlashInfer
  15 | >>> RUN python3 -m pip install vllm==0.5.3.post1 && \
  16 | >>>     python3 -m pip install flashinfer -i https://flashinfer.ai/whl/cu121/torch2.3
  17 |
--------------------
ERROR: failed to solve: process "/bin/sh -c python3 -m pip install vllm==0.5.3.post1 &&     python3 -m pip install flashinfer -i https://flashinfer.ai/whl/cu121/torch2.3" did not complete successfully: exit code: 1

I was using this command:

docker build -t runpod/worker-vllm:dev  --platform linux/amd64 .

I'm running this on Windows, but I'm not sure if this should affect anything.
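(For anyone who hits the same hash mismatch: on a wheel this large it usually indicates a truncated download rather than actual tampering, which is consistent with the retry succeeding. If a plain retry doesn't help, a cache-busting rebuild is a reasonable next step; `--no-cache` is a standard `docker build` flag, nothing specific to this repo:

```bash
# Rebuild without reusing cached layers, forcing pip to re-download the wheel
docker build --no-cache -t runpod/worker-vllm:dev --platform linux/amd64 .
```
)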

Signed-off-by: pandyamarut <pandyamarut@gmail.com>
@pandyamarut marked this pull request as ready for review on August 1, 2024 22:23
@pandyamarut merged commit 8a010c3 into main on Aug 1, 2024
2 checks passed
@pandyamarut deleted the any-arg-and-refactor branch on August 1, 2024 22:28