1 parent 5ec1086, commit 2428aa1
vllm/model_executor/layers/fused_moe/modular_kernel.py
@@ -510,7 +510,7 @@ def workspace_shapes(
 
 Inputs:
 - M_chunk: current number of tokens due to chunking, otherwise same as
-  M_full.
+  M_full, generally used for intermediate workspace shapes.
 - M_full: full number of tokens, generally used to compute output shape.
 - N: Row (or column) dimension of expert weights.
 - K: hidden dimension
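
The distinction the updated docstring draws between M_chunk and M_full can be sketched as below. This is only an illustration under stated assumptions: `sketch_workspace_shapes` and `chunk_sizes` are hypothetical helpers, not functions from vLLM, and the exact shapes (a single `(M_chunk, N)` intermediate buffer and an `(M_full, K)` output) are simplifications chosen for the example rather than what `workspace_shapes` actually returns.

```python
from typing import List, Tuple


def chunk_sizes(M_full: int, chunk_size: int) -> List[int]:
    """Hypothetical helper: token count (M_chunk) of each chunk of M_full tokens."""
    return [min(chunk_size, M_full - start) for start in range(0, M_full, chunk_size)]


def sketch_workspace_shapes(
    M_chunk: int, M_full: int, N: int, K: int
) -> Tuple[Tuple[int, int], Tuple[int, int]]:
    """Hypothetical sketch, NOT vLLM's workspace_shapes.

    Assumption: intermediate buffers only need to hold one chunk of
    tokens, while the output must hold every token.
    """
    intermediate = (M_chunk, N)  # sized by the current chunk
    output = (M_full, K)         # sized by the full token count
    return intermediate, output


if __name__ == "__main__":
    M_full, N, K = 10, 128, 64
    for M_chunk in chunk_sizes(M_full, chunk_size=4):
        print(M_chunk, sketch_workspace_shapes(M_chunk, M_full, N, K))
```

When no chunking happens, `chunk_sizes` yields a single chunk equal to M_full, which matches the docstring's "otherwise same as M_full".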