habana_main rebase #71
Commits on Jun 13, 2024
- 80aa7e9: [Hardware][Intel] Optimize CPU backend and add more performance tips (vllm-project#4971)
  Co-authored-by: Jianan Gu <jianan.gu@intel.com>
- a65634d
- 03dccc8
- 3987347
- 0ce7b95: [Doc] Update LLaVA docs (vllm-project#5437)
  Co-authored-by: Roger Wang <ywang@roblox.com>
- 85657b5: [Kernel] Factor out epilogues from cutlass kernels (vllm-project#5391)
  Co-authored-by: Michael Goin <michael@neuralmagic.com> Co-authored-by: youkaichao <youkaichao@gmail.com> Co-authored-by: zifeitong <zifei.tong@parasail.io> Co-authored-by: Robert Shaw <114415538+robertgshaw2-neuralmagic@users.noreply.github.com>
- 30299a4: [MISC] Remove FP8 warning (vllm-project#5472)
  Co-authored-by: Philipp Moritz <pcmoritz@gmail.com>
- a8fda4f
- 6b0511a
- 1696efe
- 33e3b37
- e38042d
- 50eed24
- cd9c0d6
Commits on Jun 14, 2024
- 55d6361
- 0f0d8bc
- 319ad7f: [CI/Build][Misc] Add CI that benchmarks vllm performance on those PRs with `perf-benchmarks` label (vllm-project#5073)
  Co-authored-by: simon-mo <simon.mo@hey.com>
- d47af2b
- 703475f
- d74674b
- 1598568: [ Misc ] Rs/compressed tensors cleanup (vllm-project#5432)
  Co-authored-by: mgoin <michael@neuralmagic.com> Co-authored-by: Dipika Sikka <dipikasikka1@gmail.com>
- 348616a
- 48f589e
- 77490c6
- d1c3d7d
- cdab68d
- 6e2527a
- e2afb03: [Bugfix] Enable loading FP8 checkpoints for gpt_bigcode models (vllm-project#5460)
  Signed-off-by: Thomas Parnell <tpa@zurich.ibm.com>
- 28c145e
- f5bb85b
Commits on Jun 15, 2024
- bd7efe9
- 1b8a0d7: [Core][Bugfix]: fix prefix caching for blockv2 (vllm-project#5364)
  Signed-off-by: Lei Wen <wenlei03@qiyi.com> Co-authored-by: Lei Wen <wenlei03@qiyi.com>
- 0e9164b
- 81fbb36
- e691918
- d919ecc
- 1c0afa1
- 3ce2c05
Commits on Jun 16, 2024
- f31c1f9
- 4a67690
- f07d513
Commits on Jun 17, 2024
- 845a3f2
- e2b85cf
- 9333fb8
- 9e74d9d: Correct alignment in the seq_len diagram. (vllm-project#5592)
  Co-authored-by: Liqian Chen <liqian.chen@deeplang.ai>
- 890d8d9
- 1f12122
- 728c4c8: [Hardware][Intel GPU] Add Intel GPU(XPU) inference backend (vllm-project#3814)
  Co-authored-by: Jiang Li <jiang1.li@intel.com> Co-authored-by: Abhilash Majumder <abhilash.majumder@intel.com> Co-authored-by: Abhilash Majumder <30946547+abhilash1910@users.noreply.github.com>
- ab66536
- 9e4e6fe: [CI] the readability of benchmarking and prepare for dashboard (vllm-project#5571)
  [CI] Improve the readability of performance benchmarking results and prepare for upcoming performance dashboard (vllm-project#5571)
- 1b44aaf
- e441bad
- a3e8a05
- 26e1188
Commits on Jun 18, 2024
- fa9e385: [Speculative Decoding 1/2 ] Add typical acceptance sampling as one of the sampling techniques in the verifier (vllm-project#5131)
- daef218
- 5002175: [Kernel] Add punica dimensions for Granite 13b (vllm-project#5559)
  Signed-off-by: Joe Runde <Joseph.Runde@ibm.com>
- 8eadcf0
- 32c86e4
- 114d727: [CI] Avoid naming different metrics with the same name in performance benchmark (vllm-project#5615)
- db5ec52: [bugfix][distributed] improve p2p capability test (vllm-project#5612)
  [bugfix][distributed] do not error if two processes do not agree on p2p capability (vllm-project#5612)
- f0cc0e6
- 4ad7b53
- 13db436: [ci] Deprecate original CI template (vllm-project#5624)
  Signed-off-by: kevin <kevin@anyscale.com>
- 7879f24: [Misc] Add OpenTelemetry support (vllm-project#4687)
  This PR adds basic support for OpenTelemetry distributed tracing. It includes changes to enable tracing functionality and improve monitoring capabilities. I've also added a markdown with print-screens to guide users how to use this feature. You can find it here
- 95db455: [Misc] Add channel-wise quantization support for w8a8 dynamic per token activation quantization (vllm-project#5542)
- 19091ef: [ci] Setup Release pipeline and build release wheels with cache (vllm-project#5610)
  Signed-off-by: kevin <kevin@anyscale.com>
- 07feecd
- 8a17338: [Bugfix] Fix for inconsistent behaviour related to sampling and repetition penalties (vllm-project#5639)
  Signed-off-by: Thomas Parnell <tpa@zurich.ibm.com>
- 2bd231a
- b23ce92
Commits on Jun 19, 2024
- 6820724
- 59a1eb5
- e5150f2: [Bugfix] Added test for sampling repetition penalty bug. (vllm-project#5659)
  Signed-off-by: Thomas Parnell <tpa@zurich.ibm.com>
- f758aed: [Bugfix][CI/Build][AMD][ROCm]Fixed the cmake build bug which generate garbage on certain devices (vllm-project#5641)
- 3eea748
- da971ec
- 7d46c8d
- d871453
- e9c2732
- 3ee5c4b: [ci] Add A100 queue into AWS CI template (vllm-project#5648)
  Signed-off-by: kevin <kevin@anyscale.com>
- afed90a: [Frontend][Bugfix] Fix preemption_mode -> preemption-mode for CLI arg in arg_utils.py (vllm-project#5688)
- d571ca0
- 7868750
- e83db9e: [Doc] Update docker references (vllm-project#5614)
  Signed-off-by: Rafael Vasquez <rafvasq21@gmail.com>
- 4a30d7e: [Misc] Add per channel support for static activation quantization; update w8a8 schemes to share base classes (vllm-project#5650)
- 949e49a: [ci] Limit num gpus if specified for A100 (vllm-project#5694)
  Signed-off-by: kevin <kevin@anyscale.com>
Commits on Jun 20, 2024
- 3730a1c
- 1b2eaac
- 111af1f: [Kernel] Update Cutlass int8 kernel configs for SM90 (vllm-project#5514)
  Co-authored-by: Varun Sundar Rabindranath <varun@neuralmagic.com>
- ad137cd
- a7dcc62: [Kernel] Update Cutlass int8 kernel configs for SM80 (vllm-project#5275)
  Co-authored-by: Varun Sundar Rabindranath <varun@neuralmagic.com>
- 3f3b6b2
- 8065a7e: [Frontend] Add FlexibleArgumentParser to support both underscore and dash in names (vllm-project#5718)
Commits on Jun 21, 2024
- 6c5b7af
- b12518d: [Model] MLPSpeculator speculative decoding support (vllm-project#4947)
  Signed-off-by: Thomas Parnell <tpa@zurich.ibm.com> Co-authored-by: Thomas Parnell <tpa@zurich.ibm.com> Co-authored-by: Nick Hill <nickhill@us.ibm.com> Co-authored-by: Davis Wertheimer <Davis.Wertheimer@ibm.com>
- 1f56742
- c35e4a3
- 67005a0: [Bugfix] Add fully sharded layer for QKVParallelLinearWithLora (vllm-project#5665)
  Co-authored-by: Antoni Baum <antoni.baum@protonmail.com>
- d9a252b: [Core][Distributed] add shm broadcast (vllm-project#5399)
  Co-authored-by: Cody Yu <hao.yu.cody@gmail.com>
- bd620b0
- 5b15bde
- f1e72cc
- 7187507
- f5dda63
Commits on Jun 22, 2024
- cf90ae0
- 9c62db0: [Model] Support Qwen-VL and Qwen-VL-Chat models with text-only inputs (vllm-project#5710)
  Co-authored-by: Roger Wang <ywang@roblox.com>
- ff9ddbc: [Misc] Remove vllm-project#4789 workaround left in vllm/entrypoints/openai/run_batch.py (vllm-project#5756)
- 0cbc1d2
- 8c00f9c
- 832ea88
Commits on Jun 23, 2024
- 6c916ac: [BugFix] [Kernel] Add Cutlass2x fallback kernels (vllm-project#5744)
  Co-authored-by: Varun Sundar Rabindranath <varun@neuralmagic.com>
- 5d4d905
Commits on Jun 24, 2024
- edd5fe5
- c246212
- a2899d5
- fc6d4b4
- 126c607
- e72dc6c
- 1744cc9
- ba991d5
Commits on Jun 25, 2024
- e9de9dd: [ci] Remove aws template (vllm-project#5757)
  Signed-off-by: kevin <kevin@anyscale.com>
- f23871e
- 2ce5d66: [Speculative Decoding] Support draft model on different tensor-parallel size than target model (vllm-project#5414)
- d12bff7
- 43ff60b
- efce3c4
- c1e7589
- 58bd037
- 1d6409b
- 7b99314
- 952b7c4
- 67882db
- cf04c81
- 2b850fe
- c18ebfd
- d9b34ba
- dd248f7
- bc34937
- dd793d1: [Hardware][AMD][CI/Build][Doc] Upgrade to ROCm 6.1, Dockerfile improvements, test fixes (vllm-project#5422)
- f178e56
Commits on Jun 26, 2024
- c2a8ac7: [CI/Build] Add E2E tests for MLPSpeculator (vllm-project#5791)
  Signed-off-by: Thomas Parnell <tpa@zurich.ibm.com>
- 8207972
- dda4811: [Core] Refactor Worker and ModelRunner to consolidate control plane communication (vllm-project#5408)
  Signed-off-by: Stephanie Wang <swang@cs.berkeley.edu> Signed-off-by: Stephanie <swang@anyscale.com> Co-authored-by: Stephanie <swang@anyscale.com>
- 3aa7b6c
- 515080a
- 6806998
- 3439c5a
- 6984c02
- 5bfd1bb: [Kernel] Adding bias epilogue support for `cutlass_scaled_mm` (vllm-project#5560)
  Co-authored-by: Chih-Chieh-Yang <7364402+cyang49@users.noreply.github.com> Co-authored-by: Lucas Wilkinson <lwilkinson@neuralmagic.com>
- c54269d
- cbc53b6
- f5c8628
- 38a1674
- 294104c
Commits on Jun 27, 2024
- b9e8425
- 2110557: [BugFix] Fix cuda graph for MLPSpeculator (vllm-project#5875)
  Co-authored-by: Abhinav Goyal <abhinav.goyal@flipkart.com>
- 6eabc6c
- d12af20: [VLM][Bugfix] Make sure that `multi_modal_kwargs` is broadcasted properly (vllm-project#5880)
  Signed-off-by: Xiaowei Jiang <xwjiang2010@gmail.com>
- 96354d6
- 2061f0b
- e36df83
- e9d32d0
- 1fd06cc
- 940f525
- 98cf2ed
- 3fd02bd
- 691e29e
- 365791f
- 736ed38
- 79c92c7
- 64e8d2a
- c3dde36
Commits on Jun 28, 2024
-
Configuration menu - View commit details
-
Copy full SHA for f136da1 - Browse repository at this point
Copy the full SHA f136da1View commit details -
[VLM][BugFix] Make sure that `multi_modal_kwargs` can broadcast properly with ring buffer. (vllm-project#5905)
Signed-off-by: Xiaowei Jiang <xwjiang2010@gmail.com> Co-authored-by: Roger Wang <ywang@roblox.com>
Configuration menu - View commit details
-
Copy full SHA for 74d55c0 - Browse repository at this point
Copy the full SHA 74d55c0View commit details -
Configuration menu - View commit details
-
Copy full SHA for 0d0e3a4 - Browse repository at this point
Copy the full SHA 0d0e3a4View commit details -
[Core] Registry for processing model inputs (vllm-project#5214)
Co-authored-by: ywang96 <ywang@roblox.com>
Configuration menu - View commit details
-
Copy full SHA for 5cbe8d1 - Browse repository at this point
Copy the full SHA 5cbe8d1View commit details -
Configuration menu - View commit details
-
Copy full SHA for 5932634 - Browse repository at this point
Copy the full SHA 5932634View commit details -
Configuration menu - View commit details
-
Copy full SHA for 57f09a4 - Browse repository at this point
Copy the full SHA 57f09a4View commit details -
[Bugfix] Better error message for MLPSpeculator when `num_speculative_tokens` is set too high (vllm-project#5894)
Signed-off-by: Thomas Parnell <tpa@zurich.ibm.com>
Configuration menu - View commit details
-
Copy full SHA for ec1ad00 - Browse repository at this point
Copy the full SHA ec1ad00View commit details -
Configuration menu - View commit details
-
Copy full SHA for 3b752a6 - Browse repository at this point
Copy the full SHA 3b752a6View commit details -
[Distributed] Make it clear that % should not be in tensor dict keys. (vllm-project#5927)
Signed-off-by: Xiaowei Jiang <xwjiang2010@gmail.com>
Configuration menu - View commit details
-
Copy full SHA for b90d8cd - Browse repository at this point
Copy the full SHA b90d8cdView commit details -
Configuration menu - View commit details
-
Copy full SHA for b2c6202 - Browse repository at this point
Copy the full SHA b2c6202View commit details -
Configuration menu - View commit details
-
Copy full SHA for 6a2d659 - Browse repository at this point
Copy the full SHA 6a2d659View commit details -
[ Misc ] Remove `fp8_shard_indexer` from Col/Row Parallel Linear (Simplify Weight Loading) (vllm-project#5928)
Co-authored-by: Robert Shaw <rshaw@neuralmagic>
Configuration menu - View commit details
-
Copy full SHA for b185230 - Browse repository at this point
Copy the full SHA b185230View commit details -
[ Bugfix ] Enabling Loading Models With Fused QKV/MLP on Disk with FP8 (vllm-project#5921)
Co-authored-by: Robert Shaw <rshaw@neuralmagic>
Configuration menu - View commit details
-
Copy full SHA for 2cd402e - Browse repository at this point
Copy the full SHA 2cd402eView commit details -
Support Deepseek-V2 (vllm-project#4650)
Co-authored-by: Philipp Moritz <pcmoritz@gmail.com>
Configuration menu - View commit details
-
Copy full SHA for be0b3af - Browse repository at this point
Copy the full SHA be0b3afView commit details -
Configuration menu - View commit details
-
Copy full SHA for 4bf35ed - Browse repository at this point
Copy the full SHA 4bf35edView commit details -
Configuration menu - View commit details
-
Copy full SHA for 5d2a1a9 - Browse repository at this point
Copy the full SHA 5d2a1a9View commit details -
[Bugfix] Fix Engine Failing After Invalid Request - AsyncEngineDeadError (vllm-project#5963)
Co-authored-by: Robert Shaw <rshaw@neuralmagic>
Configuration menu - View commit details
-
Copy full SHA for 6a62cb8 - Browse repository at this point
Copy the full SHA 6a62cb8View commit details -
[Kernel] Flashinfer for prefill & decode, with Cudagraph support for decode (vllm-project#4628)
Co-authored-by: LiuXiaoxuanPKU <llilyliupku@gmail.com>, bong-furiosa <bongwon.jang@furiosa.ai>
Configuration menu - View commit details
-
Copy full SHA for 7041de4 - Browse repository at this point
Copy the full SHA 7041de4View commit details
Commits on Jun 29, 2024
-
Configuration menu - View commit details
-
Copy full SHA for 54814fd - Browse repository at this point
Copy the full SHA 54814fdView commit details -
Configuration menu - View commit details
-
Copy full SHA for 7f83f40 - Browse repository at this point
Copy the full SHA 7f83f40View commit details -
Configuration menu - View commit details
-
Copy full SHA for c4bca74 - Browse repository at this point
Copy the full SHA c4bca74View commit details -
[Misc] Extend vLLM Metrics logging API (vllm-project#5925)
Co-authored-by: Antoni Baum <antoni.baum@protonmail.com>
Configuration menu - View commit details
-
Copy full SHA for 906a19c - Browse repository at this point
Copy the full SHA 906a19cView commit details -
[Kernel] Add punica dimensions for Granite 3b and 8b (vllm-project#5930)
Signed-off-by: Joe Runde <joe@joerun.de>
Configuration menu - View commit details
-
Copy full SHA for ba49944 - Browse repository at this point
Copy the full SHA ba49944View commit details -
Configuration menu - View commit details
-
Copy full SHA for 580353d - Browse repository at this point
Copy the full SHA 580353dView commit details -
[Misc] Update Phi-3-Vision Example (vllm-project#5981)
Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>
Configuration menu - View commit details
-
Copy full SHA for 329df38 - Browse repository at this point
Copy the full SHA 329df38View commit details -
Configuration menu - View commit details
-
Copy full SHA for 51e971d - Browse repository at this point
Copy the full SHA 51e971dView commit details -
Configuration menu - View commit details
-
Copy full SHA for 7c01f70 - Browse repository at this point
Copy the full SHA 7c01f70View commit details -
[Kernel] Raise an exception in MoE kernel if the batch size is larger then 65k (vllm-project#5939)
Configuration menu - View commit details
-
Copy full SHA for f7dac83 - Browse repository at this point
Copy the full SHA f7dac83View commit details -
[ CI/Build ] Added E2E Test For Compressed Tensors (vllm-project#5839)
Co-authored-by: Michael Goin <michael@neuralmagic.com> Co-authored-by: Robert Shaw <rshaw@neuralmagic>
Configuration menu - View commit details
-
Copy full SHA for 8dbfcd3 - Browse repository at this point
Copy the full SHA 8dbfcd3View commit details -
Configuration menu - View commit details
-
Copy full SHA for 99397da - Browse repository at this point
Copy the full SHA 99397daView commit details -
[ CI/Build ] LM Eval Harness Based CI Testing (vllm-project#5838)
Co-authored-by: Robert Shaw <rshaw@neuralmagic>
Configuration menu - View commit details
-
Copy full SHA for 75aa144 - Browse repository at this point
Copy the full SHA 75aa144View commit details -
[Bugfix][CI/Build][Hardware][AMD] Install matching torchvision to fix AMD tests (vllm-project#5949)
Configuration menu - View commit details
-
Copy full SHA for 9def106 - Browse repository at this point
Copy the full SHA 9def106View commit details
Commits on Jun 30, 2024
-
Configuration menu - View commit details
-
Copy full SHA for bcc6a09 - Browse repository at this point
Copy the full SHA bcc6a09View commit details -
Configuration menu - View commit details
-
Copy full SHA for cff6a1f - Browse repository at this point
Copy the full SHA cff6a1fView commit details -
Configuration menu - View commit details
-
Copy full SHA for 9d47f64 - Browse repository at this point
Copy the full SHA 9d47f64View commit details -
[ci][distributed] fix device count call
[ci][distributed] fix some cuda init that makes it necessary to use spawn (vllm-project#5991)
Configuration menu - View commit details
-
Copy full SHA for 2be6955 - Browse repository at this point
Copy the full SHA 2be6955View commit details -
[Frontend]: Support base64 embedding (vllm-project#5935)
Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>
Configuration menu - View commit details
-
Copy full SHA for c6c240a - Browse repository at this point
Copy the full SHA c6c240aView commit details -
[Lora] Use safetensor keys instead of adapter_config.json to find unexpected modules. (vllm-project#5909)
Co-authored-by: sang <sangcho@anyscale.com>
Configuration menu - View commit details
-
Copy full SHA for f5e73c9 - Browse repository at this point
Copy the full SHA f5e73c9View commit details -
[ CI ] Temporarily Disable Large LM-Eval Tests (vllm-project#6005)
Co-authored-by: rshaw@neuralmagic.com <rshaw@neuralmagic>
Configuration menu - View commit details
-
Copy full SHA for deacb7e - Browse repository at this point
Copy the full SHA deacb7eView commit details -
Configuration menu - View commit details
-
Copy full SHA for 7836fdc - Browse repository at this point
Copy the full SHA 7836fdcView commit details -
[ Misc ] Refactor w8a8 to use `process_weights_after_load` (Simplify Weight Loading) (vllm-project#5940)
Co-authored-by: Robert Shaw <rshaw@neuralmagic>
Configuration menu - View commit details
-
Copy full SHA for af9ad46 - Browse repository at this point
Copy the full SHA af9ad46View commit details
Commits on Jul 1, 2024
-
Configuration menu - View commit details
-
Copy full SHA for 614aa51 - Browse repository at this point
Copy the full SHA 614aa51View commit details -
[Speculative Decoding 2/2 ] Integrate typical acceptance sampler into Spec Decode Worker (vllm-project#5348)
Configuration menu - View commit details
-
Copy full SHA for 80ca1e6 - Browse repository at this point
Copy the full SHA 80ca1e6View commit details -
Configuration menu - View commit details
-
Copy full SHA for 7076c89 - Browse repository at this point
Copy the full SHA 7076c89View commit details -
Configuration menu - View commit details
-
Copy full SHA for a3ac366 - Browse repository at this point
Copy the full SHA a3ac366View commit details -
Configuration menu - View commit details
-
Copy full SHA for 85af27e - Browse repository at this point
Copy the full SHA 85af27eView commit details -
Configuration menu - View commit details
-
Copy full SHA for f856a85 - Browse repository at this point
Copy the full SHA f856a85View commit details -
Configuration menu - View commit details
-
Copy full SHA for b1f8b71 - Browse repository at this point
Copy the full SHA b1f8b71View commit details -
Configuration menu - View commit details
-
Copy full SHA for fb74454 - Browse repository at this point
Copy the full SHA fb74454View commit details -
Configuration menu - View commit details
-
Copy full SHA for 0e63941 - Browse repository at this point
Copy the full SHA 0e63941View commit details -
Configuration menu - View commit details
-
Copy full SHA for 463a8e6 - Browse repository at this point
Copy the full SHA 463a8e6View commit details -
Configuration menu - View commit details
-
Copy full SHA for 0141d57 - Browse repository at this point
Copy the full SHA 0141d57View commit details -
Configuration menu - View commit details
-
Copy full SHA for 52fa486 - Browse repository at this point
Copy the full SHA 52fa486View commit details -
Configuration menu - View commit details
-
Copy full SHA for a21fe62 - Browse repository at this point
Copy the full SHA a21fe62View commit details -
Configuration menu - View commit details
-
Copy full SHA for aaf5446 - Browse repository at this point
Copy the full SHA aaf5446View commit details -
Configuration menu - View commit details
-
Copy full SHA for 1ec95c4 - Browse repository at this point
Copy the full SHA 1ec95c4View commit details -
Configuration menu - View commit details
-
Copy full SHA for 2394c41 - Browse repository at this point
Copy the full SHA 2394c41View commit details -
Configuration menu - View commit details
-
Copy full SHA for 98fb698 - Browse repository at this point
Copy the full SHA 98fb698View commit details -
Configuration menu - View commit details
-
Copy full SHA for d76084c - Browse repository at this point
Copy the full SHA d76084cView commit details -
Configuration menu - View commit details
-
Copy full SHA for 4050d64 - Browse repository at this point
Copy the full SHA 4050d64View commit details -
Configuration menu - View commit details
-
Copy full SHA for bb60326 - Browse repository at this point
Copy the full SHA bb60326View commit details -
[doc][misc] further lower visibility of simple api server (vllm-project#6041)
Co-authored-by: Simon Mo <simon.mo@hey.com>
Configuration menu - View commit details
-
Copy full SHA for 8893130 - Browse repository at this point
Copy the full SHA 8893130View commit details -
[Bugfix] Use RayActorError for older versions of Ray in RayTokenizerGroupPool (vllm-project#6039)
Configuration menu - View commit details
-
Copy full SHA for dec6fc6 - Browse repository at this point
Copy the full SHA dec6fc6View commit details -
Configuration menu - View commit details
-
Copy full SHA for 12a5995 - Browse repository at this point
Copy the full SHA 12a5995View commit details -
Configuration menu - View commit details
-
Copy full SHA for 83bdcb6 - Browse repository at this point
Copy the full SHA 83bdcb6View commit details -
Configuration menu - View commit details
-
Copy full SHA for 8e0817c - Browse repository at this point
Copy the full SHA 8e0817cView commit details -
Configuration menu - View commit details
-
Copy full SHA for c4059ea - Browse repository at this point
Copy the full SHA c4059eaView commit details -
Configuration menu - View commit details
-
Copy full SHA for c87ebc3 - Browse repository at this point
Copy the full SHA c87ebc3View commit details -
Configuration menu - View commit details
-
Copy full SHA for e373853 - Browse repository at this point
Copy the full SHA e373853View commit details -
[Model] Changes to MLPSpeculator to support tie_weights and input_scale (vllm-project#5965)
Signed-off-by: Thomas Parnell <tpa@zurich.ibm.com> Co-authored-by: Joshua Rosenkranz <jmrosenk@us.ibm.com>
Configuration menu - View commit details
-
Copy full SHA for 5460070 - Browse repository at this point
Copy the full SHA 5460070View commit details
Commits on Jul 2, 2024
-
Configuration menu - View commit details
-
Copy full SHA for 3476ed0 - Browse repository at this point
Copy the full SHA 3476ed0View commit details -
Configuration menu - View commit details
-
Copy full SHA for 2c37540 - Browse repository at this point
Copy the full SHA 2c37540View commit details -
[VLM] Remove `image_input_type` from VLM config (vllm-project#5852)
Signed-off-by: Xiaowei Jiang <xwjiang2010@gmail.com> Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com> Co-authored-by: Roger Wang <ywang@roblox.com>
Configuration menu - View commit details
-
Copy full SHA for 98d6682 - Browse repository at this point
Copy the full SHA 98d6682View commit details -
Configuration menu - View commit details
-
Copy full SHA for c365082 - Browse repository at this point
Copy the full SHA c365082View commit details -
Configuration menu - View commit details
-
Copy full SHA for 31354e5 - Browse repository at this point
Copy the full SHA 31354e5View commit details -
Configuration menu - View commit details
-
Copy full SHA for aee6daf - Browse repository at this point
Copy the full SHA aee6dafView commit details -
Configuration menu - View commit details
-
Copy full SHA for d99d986 - Browse repository at this point
Copy the full SHA d99d986View commit details