Commit 0885ee7

comaniac authored and jimpang committed
[Kernel] Raise an exception in MoE kernel if the batch size is larger than 65k (vllm-project#5939)
1 parent 0fd7504 commit 0885ee7

File tree

1 file changed: +5 additions, -0 deletions

vllm/model_executor/layers/fused_moe/fused_moe.py (5 additions, 0 deletions)

@@ -423,6 +423,11 @@ def fused_experts(hidden_states: torch.Tensor,
     M, _ = hidden_states.shape
     E, N, _ = w1.shape

+    if M > 65536:
+        # https://github.com/vllm-project/vllm/issues/5938
+        raise ValueError("MoE kernel does not support more than 65536 tokens, "
+                         f"but got {M}")
+
     if override_config:
         config = override_config
     else:
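The change above is a simple guard clause: fail fast with a descriptive error before the kernel is launched with an unsupported token count. A minimal stand-alone sketch of that pattern is below; `check_moe_batch_size` is a hypothetical helper for illustration, not part of vLLM's API, though the 65536 limit and the error message mirror the diff.

```python
# Sketch of the batch-size guard added in this commit.
# The helper name is hypothetical; the limit mirrors the diff above.
MAX_MOE_TOKENS = 65536


def check_moe_batch_size(num_tokens: int) -> None:
    """Raise early if the token count exceeds what the MoE kernel supports."""
    if num_tokens > MAX_MOE_TOKENS:
        # Context: https://github.com/vllm-project/vllm/issues/5938
        raise ValueError(
            f"MoE kernel does not support more than {MAX_MOE_TOKENS} tokens, "
            f"but got {num_tokens}")


check_moe_batch_size(1024)  # within the limit, no exception

try:
    check_moe_batch_size(70000)
except ValueError as e:
    print(e)
```

Raising `ValueError` here turns a silent kernel failure (or wrong results past the grid-size limit) into an immediate, actionable error for the caller.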

0 commit comments