Pull requests: vllm-project/vllm
[WIP][Bugfix] Fix MLA attention crash with AWQ/GPTQ quantized models
bug
Something isn't working
#34695
opened Feb 17, 2026 by
haosdent
[Bugfix] Fix mypy errors for StructuredOutputsParams by using stdlib dataclass
bug
Something isn't working
#34693
opened Feb 17, 2026 by
hyeongyun0916
3 of 5 tasks
[ROCm] Enable DeepEP ROCm as all2allbackend for AMD GPUs.
rocm
Related to AMD ROCm
#34692
opened Feb 17, 2026 by
lcskrishna
Draft
5 tasks
[WIP][Bugfix] Fix xgrammar nanobind leaked objects at shutdown
bug
Something isn't working
structured-output
v1
#34690
opened Feb 17, 2026 by
haosdent
[ROCm] Enable bitsandbytes quantization support on ROCm
ci/build
documentation
Improvements or additions to documentation
ready
ONLY add when PR is ready to merge/full CI is needed
rocm
Related to AMD ROCm
#34688
opened Feb 17, 2026 by
Abdennacer-Badaoui
1 task done
[Update] Use FlashInfer fast_decode_plan directly instead of replication
nvidia
v1
#34687
opened Feb 17, 2026 by
askliar
[kv_offload+HMA][2/N]: Support sliding window lookup
kv-connector
v1
#34682
opened Feb 17, 2026 by
orozery
[kv_offload+HMA][1/N]: Worker-side support for multiple HMA groups
v1
#34680
opened Feb 17, 2026 by
orozery
[Docs]Fix documentation formatting in architecture overview
documentation
Improvements or additions to documentation
#34679
opened Feb 17, 2026 by
lichuang
5 tasks
[GGUF][Model] Add Qwen3-Coder-Next GGUF support
multi-modality
Related to multi-modality (#4194)
qwen
Related to Qwen models
#34678
opened Feb 17, 2026 by
rudybear
6 tasks done
[Bugfix][CPU] Fix basic unit tests failing in CPU platforms
bug
Something isn't working
nvidia
#34677
opened Feb 17, 2026 by
jasonyanwenl
3 of 5 tasks
Add VLLM_SKIP_MODEL_VALIDATION environment variable
frontend
#34676
opened Feb 17, 2026 by
dsingal0
5 tasks
[Bugfix][MOE] Fix incorrect routing selection for models without expert groups (e.g., MiniMax-M2.1)
bug
Something isn't working
nvidia
#34673
opened Feb 17, 2026 by
wwl2755
[ci] Add Ray compatibility check informational CI job
ci/build
#34672
opened Feb 17, 2026 by
jeffreywang-anyscale
Draft
2 of 7 tasks
Update max_num_tokens value when specdec is enabled
v1
#34671
opened Feb 17, 2026 by
shaharmor98
5 tasks
[Core] Default to "mp" rather than "uni" distributed executor backend
ready
ONLY add when PR is ready to merge/full CI is needed
v1
#34670
opened Feb 17, 2026 by
njhill
[Reasoning] [Draft][WIP] Support for speculative decoding with thinking budget
frontend
v1
#34668
opened Feb 17, 2026 by
rishitdholakia13
Draft
[Bugfix] Fix benchmark_fused_collective crash on CustomOp init
bug
Something isn't working
performance
Performance-related issues
#34665
opened Feb 17, 2026 by
mayank-ketkar-sf
3 tasks done
Separate TRTLLM and Flashinfer backends
documentation
Improvements or additions to documentation
nvidia
v1
#34663
opened Feb 17, 2026 by
pavanimajety
Draft
5 tasks
[Kernel][Perf] Fuse gather_block_tables + compute_slot_mappings into single kernel
performance
Performance-related issues
v1
#34660
opened Feb 17, 2026 by
mayank-ketkar-sf
3 of 5 tasks