vLLM Optimization: CUDA Graph Explained - Zhang #200
Replies: 2 comments
Nice work, keep it up!
The link is dead. I found the current one: https://www.armcvai.cn/2024-12-05/vllm-cuda-graph.html
vLLM Optimization: CUDA Graph Explained - Zhang
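Works on LLM inference deployment, vision algorithm development, model compression and deployment, and algorithm SDK development; a committed lifelong learner. LLM_Infer
CUDA Graphs address every potential source of CPU overhead: user-written logic, PyTorch dispatch logic, memory allocation overhead, and GPU driver/kernel overhead (the advantage of a static graph).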
https://www.armcvai.cn/2024-11-09/vllm-cuda-graph.html
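For readers who want to see the capture-and-replay idea from the summary in practice, here is a minimal sketch using PyTorch's `torch.cuda.CUDAGraph` API. It is not taken from the linked article or from vLLM's source; the function name `cuda_graph_demo`, the small `Linear` model, and the tensor shapes are assumptions chosen only for illustration. The work is captured once into a static graph, and later steps copy new data into the captured input buffer and replay the graph, so the per-step Python, dispatch, and allocation work is skipped.

```python
try:
    import torch
except ImportError:  # PyTorch not installed; the demo is skipped
    torch = None


def cuda_graph_demo():
    """Capture one forward pass into a CUDA graph and replay it on new input.

    Returns True if the replayed output matches eager execution,
    or None when PyTorch or a CUDA device is unavailable.
    """
    if torch is None or not torch.cuda.is_available():
        return None

    # Hypothetical toy model and shapes, chosen only for illustration.
    model = torch.nn.Linear(64, 64).cuda().eval()
    static_input = torch.zeros(8, 64, device="cuda")

    # Warm up on a side stream before capture, as the PyTorch docs recommend.
    side = torch.cuda.Stream()
    side.wait_stream(torch.cuda.current_stream())
    with torch.cuda.stream(side), torch.no_grad():
        for _ in range(3):
            model(static_input)
    torch.cuda.current_stream().wait_stream(side)

    # Capture: kernels are recorded rather than run; the output is a static buffer.
    graph = torch.cuda.CUDAGraph()
    with torch.cuda.graph(graph), torch.no_grad():
        static_output = model(static_input)

    # Replay: copy new data into the captured input buffer, then launch the graph.
    new_input = torch.randn(8, 64, device="cuda")
    static_input.copy_(new_input)
    graph.replay()

    # Compare against a normal eager forward pass on the same data.
    with torch.no_grad():
        expected = model(new_input)
    return torch.allclose(static_output, expected, atol=1e-4)


result = cuda_graph_demo()
```

Because a captured graph replays fixed kernels on fixed memory addresses, input shapes and buffers have to stay the same between replays, which is why new data is copied into `static_input` instead of passing a fresh tensor.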