[Kernel] change benchmark script so that result can be directly used; tune moe kernel in A100/H100 with tp=2,4,8 #3389

Merged · 11 commits · Mar 14, 2024
expose vllm.model_executor.layers.fused_moe.get_config_file_name
youkaichao committed Mar 14, 2024
commit b4a60f588ca0b07d9fa6cee9023e8f035c55bc76
2 changes: 1 addition & 1 deletion benchmarks/kernels/benchmark_mixtral_moe.py
@@ -4,7 +4,7 @@

 os.environ['CUDA_VISIBLE_DEVICES'] = '0'

-from vllm.model_executor.layers.fused_moe.fused_moe import fused_moe, get_config_file_name
+from vllm.model_executor.layers.fused_moe import fused_moe, get_config_file_name
 import torch
 import torch.nn.functional as F
 import triton
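For context, a minimal sketch of why the benchmark script wants this helper: after autotuning, it can name its output exactly the way vLLM names kernel-config files at load time, so the tuned result can be used directly (the goal stated in the PR title). The signature get_config_file_name(E, N) and the batch-size-keyed JSON layout are assumptions based on the fused_moe package of this period, not facts confirmed by this diff; the values below are hypothetical.

import json

from vllm.model_executor.layers.fused_moe import get_config_file_name

# Hypothetical best kernel parameters for one batch size, as a Triton
# autotuning sweep in the benchmark script might produce.
best_config = {
    "BLOCK_SIZE_M": 64,
    "BLOCK_SIZE_N": 64,
    "BLOCK_SIZE_K": 32,
    "GROUP_SIZE_M": 8,
    "num_warps": 4,
    "num_stages": 4,
}

# Assumed signature: E = number of experts, N = per-shard intermediate
# size; returns a device-specific name such as
# "E=8,N=7168,device_name=NVIDIA_A100-SXM4-80GB.json".
filename = get_config_file_name(8, 7168)

# vLLM's fused_moe loads configs from its package-local configs/
# directory; the file written here (keyed by batch size) can be copied
# there unchanged.
with open(filename, "w") as f:
    json.dump({"1": best_config}, f, indent=4)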
3 changes: 2 additions & 1 deletion vllm/model_executor/layers/fused_moe/__init__.py
@@ -1,5 +1,6 @@
-from vllm.model_executor.layers.fused_moe.fused_moe import fused_moe
+from vllm.model_executor.layers.fused_moe.fused_moe import fused_moe, get_config_file_name

 __all__ = [
     "fused_moe",
+    "get_config_file_name",
 ]
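With the re-export in place, callers import through the package rather than reaching into the fused_moe.fused_moe submodule, which is exactly what the benchmark script above now does:

from vllm.model_executor.layers.fused_moe import fused_moe, get_config_file_name

Listing get_config_file_name in __all__ marks it as part of the package's public surface, so tuning scripts can rely on a stable import path even if the internal module layout changes.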