Pull requests: vllm-project/vllm
[Misc] Forward request-level prompt extras for cross-encoder scoring
frontend
#46939
opened Jun 28, 2026 by
taneem-ibrahim
Contributor
[Benchmark] Report prefix cache hit rate in vllm bench serve
performance
Performance-related issues
#46938
opened Jun 28, 2026 by
yuyz-cyber
[Kernel][Test] Enable HND layout testing for Triton reshape_and_cache…
performance
Performance-related issues
#46936
opened Jun 28, 2026 by
Aditya-Nannapaneni
[WIP][Perf] AsyncTP fusion for dynamic per-group FP8 scaled_mm + comms
#46935
opened Jun 28, 2026 by
Monishver11
Contributor
•
Draft
[Bugfix][GB10] Fix negative CUDA graph memory estimate on unified-memory GPUs (#44740)
bug
Something isn't working
nvidia
v1
#46932
opened Jun 27, 2026 by
WindChimeRan
Contributor
3 of 4 tasks
[Hardware][AMD][CI] Tweak mirrored tests; improve CI base dependency change detection
ci/build
ready
ONLY add when PR is ready to merge/full CI is needed
rocm
Related to AMD ROCm
#46930
opened Jun 27, 2026 by
mawong-amd
Contributor
4 tasks
[Frontend] Add idle timeout for /v1/realtime audio sessions
frontend
#46926
opened Jun 27, 2026 by
GodlyDonuts
[Bugfix] Resolve $ref/$defs in tool schemas before type coercion
bug
Something isn't working
tool-calling
#46925
opened Jun 27, 2026 by
ben7am1n
docs: add OpenAI server production hardening checklist
documentation
Improvements or additions to documentation
#46922
opened Jun 27, 2026 by
alexchenyu
[Metrics] Skip unknown metric types in get_metrics_snapshot()
v1
#46920
opened Jun 27, 2026 by
mridullpandey
4 tasks done
fix(flashinfer): guard trtllm MoE behind x86_64 check
nvidia
#46917
opened Jun 27, 2026 by
matdou
5 of 8 tasks
[communication] [bugfix] fix quickreduce acc error in cudagraph mode
bug
Something isn't working
nvidia
#46913
opened Jun 27, 2026 by
haoyangli0109
Contributor
[Hybrid][PrefixCache] Pre-copy-free align prefix cache for model runner V1 and V2
v1
#46912
opened Jun 27, 2026 by
izhuhaoran
Contributor
•
Draft
[Perf] Fuse DFlash cache insert kernel
qwen
Related to Qwen models
#46911
opened Jun 27, 2026 by
gcanlin
Contributor
4 tasks
[Bugfix] Handle list slot mappings in attention context
bug
Something isn't working
#46908
opened Jun 27, 2026 by
zupengwang
[CPU][Bugfix] Build cpu_fused_moe on Apple Silicon
bug
Something isn't working
ci/build
cpu
Related to CPU backends
#46907
opened Jun 27, 2026 by
yuyz-cyber
3 of 4 tasks
[KVConnector] Decouple store retention from HBM retention
kv-connector
v1
#46906
opened Jun 27, 2026 by
lHrHenry233
Contributor
[ROCm][CI] Refresh ROCm base images when docker rocm_base changes
ci/build
rocm
Related to AMD ROCm
#46904
opened Jun 27, 2026 by
AndreasKaratzas
Member
•
Draft
Feat/oscar kv
documentation
Improvements or additions to documentation
nvidia
v1
#46903
opened Jun 27, 2026 by
pranavthakur0-0
Bump the minor-update group across 1 directory with 149 updates
ci/build
dependencies
Pull requests that update a dependency file
nvidia
rocm
Related to AMD ROCm
#46902
opened Jun 27, 2026 by
dependabot
Bot
[MoE] [MoE Refactor] Migrate int8 w4a8int8 oracle 37753
#46901
opened Jun 27, 2026 by
qyYue1389
Contributor
[Docs] Add Phi-4-mini-instruct to batch invariance tested models
documentation
Improvements or additions to documentation
#46900
opened Jun 27, 2026 by
CHIPMUNK-T0T
4 tasks done