What's up with Pipeline Parallelism? #3314
Comments
Currently we observe that Tensor Parallelism performs better than Pipeline Parallelism. Due to a lack of bandwidth, we dropped it from the current roadmap. We still welcome contributions!
Thanks. I believe that Pipeline Parallelism may offer better throughput than Tensor Parallelism, albeit with a trade-off in latency. In certain situations, this approach could be more practical. I am also currently working on an asynchronous version of Pipeline Parallelism, and I can open a PR for it once it is complete.
Our internal work shows that PP actually helps improve the throughput of the prefill stage because of its low communication cost. I am excited to see the proposal!
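As a rough illustration of the communication-cost argument, here is a back-of-the-envelope sketch that counts bytes sent per GPU in one forward pass. It assumes Megatron-style tensor parallelism (two all-reduces per transformer layer, implemented as ring all-reduce) and one activation hand-off per pipeline stage boundary. The function names and the example model shape are hypothetical; this is not vllm code and not a measurement.

```python
def tp_bytes_per_gpu(tokens, hidden, layers, tp, dtype_bytes=2):
    """Bytes each GPU sends per forward pass under tensor parallelism.

    Assumption: 2 all-reduces per layer, each a ring all-reduce in which
    every GPU sends 2 * (tp - 1) / tp times the tensor size.
    """
    if tp == 1:
        return 0.0
    activation = tokens * hidden * dtype_bytes
    return layers * 2 * activation * 2 * (tp - 1) / tp


def pp_bytes_per_gpu(tokens, hidden, pp, dtype_bytes=2):
    """Bytes a non-final stage sends per forward pass under pipeline parallelism.

    Assumption: each stage forwards a single activation tensor to the next stage.
    """
    if pp == 1:
        return 0.0
    return float(tokens * hidden * dtype_bytes)


# Hypothetical shape: 2048 prefill tokens, hidden size 4096, 32 layers, 4 GPUs, fp16.
tp_cost = tp_bytes_per_gpu(tokens=2048, hidden=4096, layers=32, tp=4)
pp_cost = pp_bytes_per_gpu(tokens=2048, hidden=4096, pp=4)
print(f"TP: {tp_cost / 2**20:.0f} MiB per GPU, PP: {pp_cost / 2**20:.0f} MiB per GPU")
```

Under these assumptions the TP volume scales with the number of layers while the PP volume does not. The model ignores pipeline bubbles, interconnect speed, and latency, so it only speaks to communication volume.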
This issue has been automatically marked as stale because it has not had any activity within 90 days. It will be automatically closed if no further activity occurs within 30 days. Leave a comment if you feel this issue should remain open. Thank you!
Hey vllm team,
Hope you're all doing great! I'm focusing on pipeline parallel inference, and I hope it can be supported in vllm.
I noticed that pipeline parallelism was on the old roadmap (#244), but it's not on the new roadmap (#2681). Just curious, was there a specific reason you decided to skip it for now? Challenges with the implementation, or maybe it just didn't fit into the grand scheme of things at the moment?
Would love to get any insights or thoughts you have on this. I'm really looking forward to seeing where you take vllm next!