Pull requests: vllm-project/vllm-ascend
[PromptLogprobs][V1] Support prompt logprobs to fix ceval accuracy in V1
dense-accuracy-test
enable dense accuracy test for PR
ready-for-test
start test by label for PR
#1483
opened Jun 27, 2025 by
MengqingCao
[CI] Pin transformers<4.53.0 and fix EPLB load_weights to make CI passed
#1482
opened Jun 27, 2025 by
MengqingCao
[BugFix]Fix bugs when initializing communication groups with dp on 300I Duo
#1478
opened Jun 27, 2025 by
Angazenn
support pangumoe w8a8c8 and docs
documentation
Improvements or additions to documentation
module:core
module:quantization
#1477
opened Jun 27, 2025 by
GDzhu01
[cherry-pick] Backport multistream MLA fixes and TP communication optimizations
module:ops
module:tests
#1474
opened Jun 27, 2025 by
sdmyzlp
[bugifx] fix chunked_prefill_mla output for MTP
module:ops
#1473
opened Jun 27, 2025 by
underfituu
[v0.9.1][perf][WIP] Replace the combination of npu_swiglu -> npu_dynamic_quant wit…
module:quantization
#1471
opened Jun 27, 2025 by
linfeng-yuan
[main]Refactoring w4a8 and w8a8 and supporting deepseek w4a8
module:quantization
module:tests
#1469
opened Jun 26, 2025 by
pichangping
【BUGFIXED】Fix the mtp inference error when the prompt is long or Chinese.
module:ops
#1468
opened Jun 26, 2025 by
Irving11-BKN
[WIP]support H2P communication optimization for PanguProMoe
#1463
opened Jun 26, 2025 by
Angazenn
[Structured Output] Remove redundant check for grammar_bitmask
#1459
opened Jun 26, 2025 by
shen-shanshan
[Doc] Update accuracy reports for main
documentation
Improvements or additions to documentation
#1439
opened Jun 25, 2025 by
vllm-ascend-ci
Add toy_proxy_server chat/start_profile/stop_profile api
#1437
opened Jun 25, 2025 by
machenglong2025
support w8a8c8
module:core
module:quantization
module:tests
#1436
opened Jun 25, 2025 by
GDzhu01
add chunk mc2 for prefill
module:core
module:quantization
#1434
opened Jun 25, 2025 by
NNUCJ
[V0.9.1] Optimize perf of Qwen3
module:ops
module:quantization
#1431
opened Jun 25, 2025 by
rjg-lyh
[Doc] Add multi-npu qwen3-MoE-32B Tutorials
documentation
Improvements or additions to documentation
#1419
opened Jun 25, 2025 by
leo-pony
Br fix multi stream moe
merge-conflicts
module:ops
module:tests
#1417
opened Jun 25, 2025 by
sdmyzlp