Motivation.
We are making incremental changes to enable Blackwell support in vLLM. This issue tracks all planned items.
Planned or In-Progress Features
The following items are either planned or currently in progress to enable vLLM support on Blackwell.
- Enable NVFP4 support
  - (NVIDIA) Add functional support for NVFP4 kernels for linear layers
  - (NVIDIA) Add functional support for NVFP4 MoE kernels
  - (NVIDIA) Add model integration for nvidia/*-FP4 models (see the usage sketch after this list)
  - Finetune GEMM configurations for Blackwell
  - (NVIDIA) Optimize MoE for latency
  - (NVIDIA) Optimize MoE for throughput (FI: PR !1113)
  - (NVIDIA) MoE all-reduce fusion (FI: PR !1108)
- Optimize communication overlap ops
  - (NVIDIA) Enable NCCL's symmetric memory
  - (NVIDIA) Add support for GEMM + comm overlap
- Blackwell attention kernels
  - (NVIDIA) Integrate Cutlass MLA kernels ([NVIDIA] Add Cutlass MLA backend #17625)
  - (NVIDIA) Integrate vLLM v1-compatible Blackwell prefill and decode GQA kernels (FI: PR !1051)
- FP8 blockscale GEMM and MoE
  - (NVIDIA) FP8 blockscale GEMM
  - (NVIDIA) FP8 blockscale GEMM optimizations (Sm100 blockwise fp8 swap ab #18564)
  - (NVIDIA) FP8 blockscale MoE
  - (NVIDIA) Latency and throughput optimizations
- MTP (multi-token prediction) support
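
Once the NVFP4 kernels and model integration above land, a checkpoint published under nvidia/*-FP4 should be exercisable through vLLM's standard offline entry point. The sketch below is illustrative only (it is not part of this RFC): the model name is a placeholder, and it assumes the quantization scheme is picked up automatically from the checkpoint's quantization config.

```python
# Minimal smoke test for an NVFP4 checkpoint (illustrative, not from the RFC).
from vllm import LLM, SamplingParams

# Hypothetical NVFP4 checkpoint name; quantization assumed to be auto-detected
# from the checkpoint's quantization config.
llm = LLM(model="nvidia/Llama-3.1-8B-Instruct-FP4")

params = SamplingParams(temperature=0.0, max_tokens=64)
outputs = llm.generate(["What is the capital of France?"], params)
for out in outputs:
    print(out.outputs[0].text)
```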
Feedback Period.
No response
CC List.
Any Other Things.
No response
Before submitting a new issue...
- Make sure you already searched for relevant issues, and asked the chatbot living at the bottom right corner of the documentation page, which can answer lots of frequently asked questions.