[Frontend] Reduce vLLM's import time #15128
Conversation
👋 Hi! Thank you for contributing to the vLLM project. 💬 Join our developer Slack at https://slack.vllm.ai to discuss your PR in #pr-reviews, coordinate on features in #feat- channels, or join special interest groups in #sig- channels. Just a reminder: PRs do not trigger a full CI run by default; instead, only a subset runs. Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging. To run CI, PR reviewers can either: Add 🚀
This pull request has merge conflicts that must be resolved before it can be merged.
@davidxia Could you please run the pre-commit hook? I noticed some lint issues that need to be fixed.
Force-pushed from 8396105 to 36a0bc1
I reran the benchmarks with the same setup. The main branch's times are here.

commit 36a0bc1e5bf5621ab42accca24ee454d9e18583e

| Command | Mean [s] | Min [s] | Max [s] | Relative |
|---|---|---|---|---|
| `python -c "import vllm"` | 3.504 ± 0.119 | 3.306 | 3.891 | 1.00 |

```
$ hyperfine 'vllm --version' --warmup 3 --runs 20 --export-markdown out.md
Benchmark 1: vllm --version
  Time (mean ± σ):      9.853 s ±  0.329 s    [User: 9.643 s, System: 0.973 s]
  Range (min … max):    9.515 s … 10.599 s    20 runs
```

| Command | Mean [s] | Min [s] | Max [s] | Relative |
|---|---|---|---|---|
| `vllm --version` | 9.853 ± 0.329 | 9.515 | 10.599 | 1.00 |
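For a quick in-process sanity check of cold-import cost (hyperfine, used above, is the more rigorous tool), something like the following sketch works; `json` here is just a cheap stand-in for a heavy dependency like torch. For a per-module breakdown, CPython's `python -X importtime -c "import vllm"` prints cumulative microseconds per imported module.

```python
# Rough cold-import timing sketch. Run the import in a fresh interpreter
# so the parent process's module cache doesn't hide the cost.
import subprocess
import sys
import time

start = time.perf_counter()
subprocess.run([sys.executable, "-c", "import json"], check=True)
elapsed = time.perf_counter() - start
print(f"cold import took {elapsed:.3f}s")
```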
Rebased and resolved conflicts.
Force-pushed from 2826010 to 5eec6ca
This change optimizes the import time of `import vllm` and contributes to vllm-project#14924. Most of the changes import expensive modules lazily instead of eagerly. This change shouldn't affect core functionality.

Co-authored-by: Chen-0210 <chenjincong11@gmail.com>
Co-authored-by: David Xia <david@davidxia.com>
Signed-off-by: Chen-0210 <chenjincong11@gmail.com>
Signed-off-by: David Xia <david@davidxia.com>
This pull request has merge conflicts that must be resolved before it can be merged.
Looks like there are conflicts again. I can help fix them, but I wanted to check with the maintainers whether this PR is still worth doing, and maybe get a cursory review, before we spend too much time on it.
lol, I just tested the latest code, since currently it loads more and more things..
I would suggest trying this at a smaller scale, iteratively. Let's start with the multimodal directory?
Sounds good. I'll make a separate PR soon. |
This pull request has been automatically marked as stale because it has not had any activity within 90 days. It will be automatically closed if no further activity occurs within 30 days. Leave a comment if you feel this pull request should remain open. Thank you!
This pull request has been automatically closed due to inactivity. Please feel free to reopen if you intend to continue working on it. Thank you!
This PR optimizes the import time of `from vllm import LLM` and fixes issue #14924. The majority of the changes only involve reordering the import statements, so I think this change will not affect core functionality while reducing import time.

Comparison

`time python3 -c "import vllm"`

Before:

After:

The time is mainly spent in two parts:

`import torch` takes 1.5–2s

`time vllm -v`

Before:

After:

`time vllm -v` can be optimized in `vllm/entrypoints/cli`; I want to implement that in a separate PR.
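The `vllm -v` idea mentioned above usually comes down to keeping heavy imports out of cheap CLI paths. This is a hedged sketch under that assumption, not vLLM's actual entrypoint code; the handler names and the `json`/`pip` stand-ins are hypothetical.

```python
# Keep cheap CLI subcommands cheap by importing heavy modules inside
# the handlers that actually need them.
import importlib.metadata

def cmd_version() -> str:
    """`vllm -v`-style path: needs only package metadata, no engine import."""
    try:
        return importlib.metadata.version("pip")  # stand-in package name
    except importlib.metadata.PackageNotFoundError:
        return "unknown"

def cmd_serve() -> str:
    """Heavy path: pull in the expensive module only when actually serving."""
    import json as engine  # stand-in for a heavy module like torch
    return engine.dumps({"status": "serving"})

print(cmd_version())
```

With this layout, merely printing the version never triggers the expensive import, which is exactly the cost `time vllm -v` measures.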