CTranslate2 #211
Comments
Thanks for bringing this up. We will investigate the CTranslate2 library and evaluate the difficulty and the potential benefit of adding it to vLLM.
Would love to see this; ct2 would be a great integration! It would give us easy access to fast 8-bit inference, and it plays nicely with HF Transformers. Thank you for the library so far!
Hi, any news regarding this integration? CTranslate2 has already proven its speed within the TitanML framework for local LLM serving.
Hi, any news on this?
+1
+11
@zhuohan123 do you see any benefit in adding this to vLLM?
Closing because if there were a significant benefit, it would have been discussed more or even implemented by now.
Hello,
Thanks for the great framework for deploying LLMs.
Would it be possible to use an LLM compiled with the CTranslate2 library?