Commit 9d38d2b

dsikka and ElizaWszola authored and committed
[Kernel] Expand MoE weight loading + Add Fused Marlin MoE Kernel (vllm-project#7766)
Co-authored-by: ElizaWszola <eliza@neuralmagic.com>
Signed-off-by: LeiWang1999 <leiwang1999@outlook.com>
1 parent df3ac9d commit 9d38d2b

16 files changed: +2382 -85 lines changed

CMakeLists.txt

Lines changed: 5 additions & 0 deletions
@@ -296,6 +296,11 @@ set(VLLM_MOE_EXT_SRC
   "csrc/moe/torch_bindings.cpp"
   "csrc/moe/topk_softmax_kernels.cu")
 
+if(VLLM_GPU_LANG STREQUAL "CUDA")
+  list(APPEND VLLM_MOE_EXT_SRC
+       "csrc/moe/marlin_moe_ops.cu")
+endif()
+
 define_gpu_extension_target(
   _moe_C
   DESTINATION vllm
