Commits on Jun 23, 2024
- 5d52fa5 [CI/Build][Misc] Add CI that benchmarks vllm performance on those PRs with `perf-benchmarks` label (vllm-project#5073)
  Co-authored-by: simon-mo <simon.mo@hey.com>
- cab4a5d
- 923d05a
- 34467ee
- deee747 [ Misc ] Rs/compressed tensors cleanup (vllm-project#5432)
  Co-authored-by: mgoin <michael@neuralmagic.com>
  Co-authored-by: Dipika Sikka <dipikasikka1@gmail.com>
- 0ccb117
- 4464401
- 28d0d6d
- f0e02ac
- d0a3026
- 33edc9b
- 5fffeb8 [Bugfix] Enable loading FP8 checkpoints for gpt_bigcode models (vllm-project#5460)
  Signed-off-by: Thomas Parnell <tpa@zurich.ibm.com>
- 65419f4
- dfd2b2e
- d464106
- 80b908f [Core][Bugfix]: fix prefix caching for blockv2 (vllm-project#5364)
  Signed-off-by: Lei Wen <wenlei03@qiyi.com>
  Co-authored-by: Lei Wen <wenlei03@qiyi.com>
- 0393d45
- 32d5ecc
- 6f3169a
- beb3b21
- 31f38f3
- ec68cd1
- dc8789d
- 681de21
- 77a5f36
- 9c77244
- f968328
- b0abad9
- 4b84959 Correct alignment in the seq_len diagram. (vllm-project#5592)
  Co-authored-by: Liqian Chen <liqian.chen@deeplang.ai>
- 9cfb1d7
- 61f421b
- dceff94 [Hardware][Intel GPU] Add Intel GPU(XPU) inference backend (vllm-project#3814)
  Co-authored-by: Jiang Li <jiang1.li@intel.com>
  Co-authored-by: Abhilash Majumder <abhilash.majumder@intel.com>
  Co-authored-by: Abhilash Majumder <30946547+abhilash1910@users.noreply.github.com>
- e830048
- a212392 [CI] the readability of benchmarking and prepare for dashboard (vllm-project#5571)
  [CI] Improve the readability of performance benchmarking results and prepare for upcoming performance dashboard (vllm-project#5571)
- bc2be04
- 5eb3526
- 17fd0ba
- 7a58e54
- dbf0e91 [Speculative Decoding 1/2 ] Add typical acceptance sampling as one of the sampling techniques in the verifier (vllm-project#5131)
- 18c566f
- 69fa6ed [Kernel] Add punica dimensions for Granite 13b (vllm-project#5559)
  Signed-off-by: Joe Runde <Joseph.Runde@ibm.com>
- 1b39fc2
- f691b45
- 5abb0c8 [CI] Avoid naming different metrics with the same name in performance benchmark (vllm-project#5615)
- f355997 [bugfix][distributed] improve p2p capability test (vllm-project#5612)
  [bugfix][distributed] do not error if two processes do not agree on p2p capability (vllm-project#5612)
- 1343cd0
- 021cfdb
- 70baf49 [ci] Deprecate original CI template (vllm-project#5624)
  Signed-off-by: kevin <kevin@anyscale.com>
- be2f123 [Misc] Add OpenTelemetry support (vllm-project#4687)
  This PR adds basic support for OpenTelemetry distributed tracing. It includes changes to enable tracing functionality and improve monitoring capabilities. I've also added a markdown with print-screens to guide users how to use this feature. You can find it here
- 0008715 [Misc] Add channel-wise quantization support for w8a8 dynamic per token activation quantization (vllm-project#5542)
- 14a7620 [ci] Setup Release pipeline and build release wheels with cache (vllm-project#5610)
  Signed-off-by: kevin <kevin@anyscale.com>
- 50c2ca9
- 3d24777 [Bugfix] Fix for inconsistent behaviour related to sampling and repetition penalties (vllm-project#5639)
  Signed-off-by: Thomas Parnell <tpa@zurich.ibm.com>
- c5ef2f9
- 010f2e8
- a8b75a4
- a0d8ed2
- e4d2b6e [Bugfix] Added test for sampling repetition penalty bug. (vllm-project#5659)
  Signed-off-by: Thomas Parnell <tpa@zurich.ibm.com>
- cb46cfe [Bugfix][CI/Build][AMD][ROCm]Fixed the cmake build bug which generate garbage on certain devices (vllm-project#5641)
- 8f72d50
- b081ff9
- 784aa72
- a799171
- 436aaf9
- d33025c [ci] Add A100 queue into AWS CI template (vllm-project#5648)
  Signed-off-by: kevin <kevin@anyscale.com>
- cf5889f [Frontend][Bugfix] Fix preemption_mode -> preemption-mode for CLI arg in arg_utils.py (vllm-project#5688)
- 88396ae
- 8ff473a
- 4f4cea6 [Doc] Update docker references (vllm-project#5614)
  Signed-off-by: Rafael Vasquez <rafvasq21@gmail.com>
- 0e8e31e [Misc] Add per channel support for static activation quantization; update w8a8 schemes to share base classes (vllm-project#5650)
- 1ccd388 [ci] Limit num gpus if specified for A100 (vllm-project#5694)
  Signed-off-by: kevin <kevin@anyscale.com>
- 330aa1b
- df3ae01
- 7d85753 [Kernel] Update Cutlass int8 kernel configs for SM90 (vllm-project#5514)
  Co-authored-by: Varun Sundar Rabindranath <varun@neuralmagic.com>
- b6ec1d5
- db7892d [Kernel] Update Cutlass int8 kernel configs for SM80 (vllm-project#5275)
  Co-authored-by: Varun Sundar Rabindranath <varun@neuralmagic.com>
- 51dfab0
- c477239 [Frontend] Add FlexibleArgumentParser to support both underscore and dash in names (vllm-project#5718)
- 5ccb86c
- b05443a [Model] MLPSpeculator speculative decoding support (vllm-project#4947)
  Signed-off-by: Thomas Parnell <tpa@zurich.ibm.com>
  Co-authored-by: Thomas Parnell <tpa@zurich.ibm.com>
  Co-authored-by: Nick Hill <nickhill@us.ibm.com>
  Co-authored-by: Davis Wertheimer <Davis.Wertheimer@ibm.com>
- 1996acf
- 1699d33
- e4f1a4e [Bugfix] Add fully sharded layer for QKVParallelLinearWithLora (vllm-project#5665)
  Co-authored-by: Antoni Baum <antoni.baum@protonmail.com>
- 3e3c8d9 [Core][Distributed] add shm broadcast (vllm-project#5399)
  Co-authored-by: Cody Yu <hao.yu.cody@gmail.com>
- 01369a0
- 733cf30
- 07cd29d
- 2e2140f
- 0bec3f6
- 3595200
- 1a6c6dd [Model] Support Qwen-VL and Qwen-VL-Chat models with text-only inputs (vllm-project#5710)
  Co-authored-by: Roger Wang <ywang@roblox.com>
- a7dccd6 [Misc] Remove vllm-project#4789 workaround left in vllm/entrypoints/openai/run_batch.py (vllm-project#5756)
- 960a022
- dc211cd
- 860a1d6
- d7f0ece [BugFix] [Kernel] Add Cutlass2x fallback kernels (vllm-project#5744)
  Co-authored-by: Varun Sundar Rabindranath <varun@neuralmagic.com>
- e484da4
- 683f309
Commits on Jun 24, 2024
- 3b3a92c
- e0c0530
- 01d4f34
- 616fce8
- 3c5a7f5
- 71f60a8
- e960ebb
- 3297247
- dcdf4da
- 0dd1848
- de06faa Merge branch 'upstream-sync-2024-06-23' of https://github.com/neuralmagic/nm-vllm into upstream-sync-2024-06-23
Commits on Jun 25, 2024
- cdf52bf
- f2d2794
- 431054d
- 9d7b7b5
- 3a75e15
- c9d1b9e
- 973d9d0
- c44802e
- 9c15fe1
- 727077f