Commit 7b81659

ElizaWszola, dsikka authored and committed
[Kernel] Zero point support in fused MarlinMoE kernel + AWQ Fused MoE (vllm-project#8973)
Co-authored-by: Dipika <dipikasikka1@gmail.com>
Co-authored-by: Dipika Sikka <ds3822@columbia.edu>
Signed-off-by: Amit Garg <mitgarg17495@gmail.com>
1 parent bbe5488 commit 7b81659

23 files changed: +969 -223 lines changed

CMakeLists.txt (+2)
@@ -433,6 +433,8 @@ if(VLLM_GPU_LANG STREQUAL "CUDA")
     "csrc/moe/marlin_kernels/marlin_moe_kernel_ku4b8.cu"
     "csrc/moe/marlin_kernels/marlin_moe_kernel_ku8b128.h"
     "csrc/moe/marlin_kernels/marlin_moe_kernel_ku8b128.cu"
+    "csrc/moe/marlin_kernels/marlin_moe_kernel_ku4.h"
+    "csrc/moe/marlin_kernels/marlin_moe_kernel_ku4.cu"
     "csrc/moe/marlin_moe_ops.cu")
 
   set_gencode_flags_for_srcs(
