Add miscellaneous updates #8

Merged
merged 6 commits into from
Mar 13, 2023
WoosukKwon
Collaborator

This PR contains several miscellaneous updates to the system, with two notable changes:

  1. The size of the CPU KV cache is now calculated based on the swap_space size provided by the user (defaulting to 20 GiB).
  2. The default value for max_num_batched_tokens has been increased from 2048 to 2560.
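
As a rough illustration of the first change, the sketch below shows one way a CPU block count could be derived from a swap-space budget. It is not taken from this PR's diff: the function names, the per-block layout (one key and one value tensor per layer), and the example model dimensions are all assumptions made for illustration.

```python
GiB = 1 << 30  # bytes in one gibibyte


def cache_block_bytes(block_size: int, num_layers: int, num_heads: int,
                      head_size: int, dtype_bytes: int) -> int:
    """Bytes needed to hold one KV cache block across all layers.

    Assumed layout (hypothetical): each layer stores one key tensor and one
    value tensor of shape [block_size, num_heads, head_size].
    """
    elems_per_layer = 2 * block_size * num_heads * head_size  # key + value
    return num_layers * elems_per_layer * dtype_bytes


def num_cpu_blocks(swap_space_gib: int, block_bytes: int) -> int:
    """Number of whole cache blocks that fit in the swap-space budget."""
    return (swap_space_gib * GiB) // block_bytes


# Hypothetical model: 32 layers, 32 heads, head size 128, fp16 (2 bytes),
# 16 tokens per block, with the 20 GiB default swap space from this PR.
block_bytes = cache_block_bytes(block_size=16, num_layers=32, num_heads=32,
                                head_size=128, dtype_bytes=2)
print(num_cpu_blocks(20, block_bytes))
```

Integer division is used so that a partial block is never counted; a larger swap_space value scales the CPU block count linearly under these assumptions.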

WoosukKwon merged commit cfae35b into main on Mar 13, 2023
WoosukKwon deleted the minor branch on March 13, 2023 at 20:48
v1nc3nt27 pushed a commit to v1nc3nt27/vllm that referenced this pull request Sep 12, 2023
xiangyuT pushed a commit to xiangyuT/vllm that referenced this pull request Oct 24, 2023
hongxiayang pushed a commit to hongxiayang/vllm that referenced this pull request Feb 13, 2024
mzusman added a commit to mzusman/vllm that referenced this pull request Apr 16, 2024
* Return support for other models apart from jamba

* Support n>1

* A little cleanup

* Rename

* Apply whitespace suggestions from code review

* Add max batch size to the main func

* Fixed attention kv cache bug

* log where requests id are deleted from the dict to debug mode

* Fix typo

* Align with v0.3.3 vllm code

* Remove comments

* Take out model config from CUDAGraph object

* Fix

* Fix typo

* Make the kv cache selection cleaner

* Another typo

* Took the num layers calc outside

* Remove the -1

* Set as num layer / period

---------

Co-authored-by: Mor Zusman <morz@ai21.com>
Co-authored-by: tomeras91 <57313761+tomeras91@users.noreply.github.com>
sfc-gh-hazhang pushed a commit to sfc-gh-hazhang/vllm that referenced this pull request May 7, 2024
ykim362 pushed a commit to ykim362/vllm that referenced this pull request Jun 17, 2024
…128k

Support Phi3SuScaledRotaryEmbedding for 128k model
alixiaodi mentioned this pull request on Aug 2, 2024
zeroorhero pushed a commit to zeroorhero/vllm that referenced this pull request Sep 23, 2024
robertgshaw2-redhat referenced this pull request in robertgshaw2-redhat/vllm Apr 22, 2025
- fix spelling
- CUDA_VISIBLE_DEVICES should be set externally

Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com>
robertgshaw2-redhat referenced this pull request in robertgshaw2-redhat/vllm May 3, 2025
* [Update] LMcache connector v1 implementation

Signed-off-by: ApostaC <yihua98@uchicago.edu>

* [Add] examples for disaggregated prefill

Signed-off-by: ApostaC <yihua98@uchicago.edu>

* [add] extra information about evns

Signed-off-by: ApostaC <yihua98@uchicago.edu>

* Initial stubs for P/D scheduling changes

Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com>

* Updates

Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com>

* Rs branch (#3)

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* Rs branch (#5)

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* Remove Unneeded Arguments (#7)

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* stash

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* cleanup

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

---------

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* Improve disagg-example.sh (#8)

- fix spelling
- CUDA_VISIBLE_DEVICES should be set externally

Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* added connector

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* update

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* remove

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* seems to load properly

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* Revert "updated"

This reverts commit 97316d9.

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* stash

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* added

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* diffs for local dev on macos

Signed-off-by: Robert Shaw <rshaw@neuralmagic.com>

* updated

Signed-off-by: Robert Shaw <rshaw@neuralmagic.com>

* update

Signed-off-by: Robert Shaw <rshaw@neuralmagic.com>

* updaed

Signed-off-by: Robert Shaw <rshaw@neuralmagic.com>

* updated

Signed-off-by: Robert Shaw <rshaw@neuralmagic.com>

* updated

Signed-off-by: Robert Shaw <rshaw@neuralmagic.com>

* Checkpoint.

Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com>

* updated

Signed-off-by: Robert Shaw <rshaw@neuralmagic.com>

* Cleanup

Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com>

* WIP

Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com>

* updated

Signed-off-by: Robert Shaw <rshaw@neuralmagic.com>

* updated

Signed-off-by: Robert Shaw <rshaw@neuralmagic.com>

* updated on scheduler side

Signed-off-by: Robert Shaw <rshaw@neuralmagic.com>

* updated

Signed-off-by: Robert Shaw <rshaw@neuralmagic.com>

* updated

Signed-off-by: Robert Shaw <rshaw@neuralmagic.com>

* updated

Signed-off-by: Robert Shaw <rshaw@neuralmagic.com>

* updated

Signed-off-by: Robert Shaw <rshaw@neuralmagic.com>

* updated

Signed-off-by: Robert Shaw <rshaw@neuralmagic.com>

* updated

Signed-off-by: Robert Shaw <rshaw@neuralmagic.com>

* Hacking away

Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com>

* cleanup

Signed-off-by: Robert Shaw <rshaw@neuralmagic.com>

* ensure request removed from running list

Signed-off-by: Robert Shaw <rshaw@neuralmagic.com>

* Runs E2E. Garbage output. Crashes on 2nd request

Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com>

* update

Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com>

* updated

Signed-off-by: Robert Shaw <rshaw@neuralmagic.com>

* updated

Signed-off-by: Robert Shaw <rshaw@neuralmagic.com>

* rename files

Signed-off-by: Robert Shaw <rshaw@neuralmagic.com>

* updated

Signed-off-by: Robert Shaw <rshaw@neuralmagic.com>

* updated

Signed-off-by: Robert Shaw <rshaw@neuralmagic.com>

* updated

Signed-off-by: Robert Shaw <rshaw@neuralmagic.com>

* updated

Signed-off-by: Robert Shaw <rshaw@neuralmagic.com>

* updated

Signed-off-by: Robert Shaw <rshaw@neuralmagic.com>

* update

Signed-off-by: Robert Shaw <rshaw@neuralmagic.com>

* Second request no longer crashes

Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com>

* Remove gpu_model_runner hacks

Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com>

* Clean up Justfile

Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com>

* [Bugfix] Stale finished requests in EMPTY_MODEL_RUNNER_OUTPUT

Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com>

* update

Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com>

* justfile edits

Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com>

* Update

Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com>

* Fixes - lm_eval gsm8k has correctness

Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com>

* "just delete the assert"

Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com>

* fixup precommit issues

Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com>

* Fixes

Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com>

* updated (#12)

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* Add Accuracy Test (#13)

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

---------

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* Preemption Bugfixes (#15)

* stash fixed double free issue

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* fixed issue

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updatrd

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updatrd

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updatrd

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updatrd

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updatrd

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updatrd

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

---------

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated (#16)

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* Fix Bad Merge | Fix Memory Leak in Upstream (vllm-project#18)

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* fix merge

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

---------

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* clean up justfile, examples

Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com>

* more cleanup

Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com>

* more cleanup

Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com>

* more cleanup

Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com>

* more cleanup

Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com>

* More cleanup

Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com>

* more cleanup

Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com>

* more cleanup, precommit fixes

Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com>

* More cleanup

Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com>

* run_accuracy_test.sh UX

Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com>

* squash warnings

Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com>

* pre-commit

Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com>

* update

Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com>

* Add get_finished to base kv connector

Signed-off-by: mgoin <mgoin64@gmail.com>

* revert test.txt

Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com>

* cleanup

Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com>

* Cleanup

Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com>

* review comments

Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com>

---------

Signed-off-by: ApostaC <yihua98@uchicago.edu>
Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com>
Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>
Signed-off-by: Robert Shaw <rshaw@neuralmagic.com>
Signed-off-by: mgoin <mgoin64@gmail.com>
Co-authored-by: ApostaC <yihua98@uchicago.edu>
Co-authored-by: Robert Shaw <114415538+robertgshaw2-redhat@users.noreply.github.com>
Co-authored-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>
Co-authored-by: Robert Shaw <rshaw@neuralmagic.com>
Co-authored-by: mgoin <mgoin64@gmail.com>
Co-authored-by: mgoin <mgoin64@gmail.com>
robertgshaw2-redhat referenced this pull request in robertgshaw2-redhat/vllm May 4, 2025
* [Update] LMcache connector v1 implementation

Signed-off-by: ApostaC <yihua98@uchicago.edu>

* [Add] examples for disaggregated prefill

Signed-off-by: ApostaC <yihua98@uchicago.edu>

* [add] extra information about evns

Signed-off-by: ApostaC <yihua98@uchicago.edu>

* Initial stubs for P/D scheduling changes

Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com>

* Updates

Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com>

* Rs branch (#3)

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* Rs branch (#5)

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* Remove Unneeded Arguments (#7)

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* stash

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* cleanup

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

---------

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* Improve disagg-example.sh (#8)

- fix spelling
- CUDA_VISIBLE_DEVICES should be set externally

Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* added connector

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* update

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* remove

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* seems to load properly

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* Revert "updated"

This reverts commit 97316d9.

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* stash

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* added

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* diffs for local dev on macos

Signed-off-by: Robert Shaw <rshaw@neuralmagic.com>

* updated

Signed-off-by: Robert Shaw <rshaw@neuralmagic.com>

* update

Signed-off-by: Robert Shaw <rshaw@neuralmagic.com>

* updaed

Signed-off-by: Robert Shaw <rshaw@neuralmagic.com>

* updated

Signed-off-by: Robert Shaw <rshaw@neuralmagic.com>

* updated

Signed-off-by: Robert Shaw <rshaw@neuralmagic.com>

* Checkpoint.

Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com>

* updated

Signed-off-by: Robert Shaw <rshaw@neuralmagic.com>

* Cleanup

Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com>

* WIP

Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com>

* updated

Signed-off-by: Robert Shaw <rshaw@neuralmagic.com>

* updated

Signed-off-by: Robert Shaw <rshaw@neuralmagic.com>

* updated on scheduler side

Signed-off-by: Robert Shaw <rshaw@neuralmagic.com>

* updated

Signed-off-by: Robert Shaw <rshaw@neuralmagic.com>

* updated

Signed-off-by: Robert Shaw <rshaw@neuralmagic.com>

* updated

Signed-off-by: Robert Shaw <rshaw@neuralmagic.com>

* updated

Signed-off-by: Robert Shaw <rshaw@neuralmagic.com>

* updated

Signed-off-by: Robert Shaw <rshaw@neuralmagic.com>

* updated

Signed-off-by: Robert Shaw <rshaw@neuralmagic.com>

* Hacking away

Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com>

* cleanup

Signed-off-by: Robert Shaw <rshaw@neuralmagic.com>

* ensure request removed from running list

Signed-off-by: Robert Shaw <rshaw@neuralmagic.com>

* Runs E2E. Garbage output. Crashes on 2nd request

Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com>

* update

Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com>

* updated

Signed-off-by: Robert Shaw <rshaw@neuralmagic.com>

* updated

Signed-off-by: Robert Shaw <rshaw@neuralmagic.com>

* rename files

Signed-off-by: Robert Shaw <rshaw@neuralmagic.com>

* updated

Signed-off-by: Robert Shaw <rshaw@neuralmagic.com>

* updated

Signed-off-by: Robert Shaw <rshaw@neuralmagic.com>

* updated

Signed-off-by: Robert Shaw <rshaw@neuralmagic.com>

* updated

Signed-off-by: Robert Shaw <rshaw@neuralmagic.com>

* updated

Signed-off-by: Robert Shaw <rshaw@neuralmagic.com>

* update

Signed-off-by: Robert Shaw <rshaw@neuralmagic.com>

* Second request no longer crashes

Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com>

* Remove gpu_model_runner hacks

Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com>

* Clean up Justfile

Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com>

* [Bugfix] Stale finished requests in EMPTY_MODEL_RUNNER_OUTPUT

Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com>

* update

Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com>

* justfile edits

Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com>

* Update

Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com>

* Fixes - lm_eval gsm8k has correctness

Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com>

* "just delete the assert"

Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com>

* fixup precommit issues

Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com>

* Fixes

Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com>

* updated (#12)

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* Add Accuracy Test (#13)

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

---------

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* Preemption Bugfixes (#15)

* stash fixed double free issue

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* fixed issue

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updatrd

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updatrd

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updatrd

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updatrd

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updatrd

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updatrd

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

---------

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated (#16)

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* Fix Bad Merge | Fix Memory Leak in Upstream (vllm-project#18)

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* fix merge

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

---------

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* cleanup code

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* cleanup code

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* stash

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updatted

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* revert

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* more spurious changes

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* Update vllm/distributed/kv_transfer/kv_connector/v1/nixl_connector.py

Co-authored-by: Tyler Michael Smith <tysmith@redhat.com>

* Update vllm/distributed/kv_transfer/kv_connector/v1/nixl_connector.py

Co-authored-by: Tyler Michael Smith <tysmith@redhat.com>

---------

Signed-off-by: ApostaC <yihua98@uchicago.edu>
Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com>
Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>
Signed-off-by: Robert Shaw <rshaw@neuralmagic.com>
Co-authored-by: ApostaC <yihua98@uchicago.edu>
Co-authored-by: Tyler Michael Smith <tyler@neuralmagic.com>
Co-authored-by: Tyler Michael Smith <tysmith@redhat.com>
Co-authored-by: Robert Shaw <rshaw@neuralmagic.com>
robertgshaw2-redhat referenced this pull request in robertgshaw2-redhat/vllm May 6, 2025
* [Update] LMcache connector v1 implementation

Signed-off-by: ApostaC <yihua98@uchicago.edu>

* [Add] examples for disaggregated prefill

Signed-off-by: ApostaC <yihua98@uchicago.edu>

* [add] extra information about evns

Signed-off-by: ApostaC <yihua98@uchicago.edu>

* Initial stubs for P/D scheduling changes

Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com>

* Updates

Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com>

* Rs branch (#3)

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* Rs branch (#5)

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* Remove Unneeded Arguments (#7)

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* stash

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* cleanup

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

---------

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* Improve disagg-example.sh (#8)

- fix spelling
- CUDA_VISIBLE_DEVICES should be set externally

Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* added connector

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* update

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* remove

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* seems to load properly

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* Revert "updated"

This reverts commit 97316d9.

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* stash

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* added

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* diffs for local dev on macos

Signed-off-by: Robert Shaw <rshaw@neuralmagic.com>

* updated

Signed-off-by: Robert Shaw <rshaw@neuralmagic.com>

* update

Signed-off-by: Robert Shaw <rshaw@neuralmagic.com>

* updated

Signed-off-by: Robert Shaw <rshaw@neuralmagic.com>

* updated

Signed-off-by: Robert Shaw <rshaw@neuralmagic.com>

* updated

Signed-off-by: Robert Shaw <rshaw@neuralmagic.com>

* Checkpoint.

Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com>

* updated

Signed-off-by: Robert Shaw <rshaw@neuralmagic.com>

* Cleanup

Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com>

* WIP

Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com>

* updated

Signed-off-by: Robert Shaw <rshaw@neuralmagic.com>

* updated

Signed-off-by: Robert Shaw <rshaw@neuralmagic.com>

* updated on scheduler side

Signed-off-by: Robert Shaw <rshaw@neuralmagic.com>

* updated

Signed-off-by: Robert Shaw <rshaw@neuralmagic.com>

* updated

Signed-off-by: Robert Shaw <rshaw@neuralmagic.com>

* updated

Signed-off-by: Robert Shaw <rshaw@neuralmagic.com>

* updated

Signed-off-by: Robert Shaw <rshaw@neuralmagic.com>

* updated

Signed-off-by: Robert Shaw <rshaw@neuralmagic.com>

* updated

Signed-off-by: Robert Shaw <rshaw@neuralmagic.com>

* Hacking away

Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com>

* cleanup

Signed-off-by: Robert Shaw <rshaw@neuralmagic.com>

* ensure request removed from running list

Signed-off-by: Robert Shaw <rshaw@neuralmagic.com>

* Runs E2E. Garbage output. Crashes on 2nd request

Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com>

* update

Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com>

* updated

Signed-off-by: Robert Shaw <rshaw@neuralmagic.com>

* updated

Signed-off-by: Robert Shaw <rshaw@neuralmagic.com>

* rename files

Signed-off-by: Robert Shaw <rshaw@neuralmagic.com>

* updated

Signed-off-by: Robert Shaw <rshaw@neuralmagic.com>

* updated

Signed-off-by: Robert Shaw <rshaw@neuralmagic.com>

* updated

Signed-off-by: Robert Shaw <rshaw@neuralmagic.com>

* updated

Signed-off-by: Robert Shaw <rshaw@neuralmagic.com>

* updated

Signed-off-by: Robert Shaw <rshaw@neuralmagic.com>

* update

Signed-off-by: Robert Shaw <rshaw@neuralmagic.com>

* Second request no longer crashes

Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com>

* Remove gpu_model_runner hacks

Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com>

* Clean up Justfile

Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com>

* [Bugfix] Stale finished requests in EMPTY_MODEL_RUNNER_OUTPUT

Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com>

* update

Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com>

* justfile edits

Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com>

* Update

Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com>

* Fixes - lm_eval gsm8k has correctness

Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com>

* "just delete the assert"

Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com>

* fixup precommit issues

Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com>

* Fixes

Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com>

* updated (#12)

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* Add Accuracy Test (#13)

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

---------

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* Preemption Bugfixes (#15)

* stash fixed double free issue

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* fixed issue

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

---------

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated (#16)

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* Fix Bad Merge | Fix Memory Leak in Upstream (vllm-project#18)

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* fix merge

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

---------

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* cleanup code

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* cleanup code

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* stash

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* revert

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* more spurious changes

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* Support MLA in NIXL connector

Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com>

* WIP adding tests

Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com>

* wip

Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com>

* Fixes

Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com>

---------

Signed-off-by: ApostaC <yihua98@uchicago.edu>
Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com>
Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>
Signed-off-by: Robert Shaw <rshaw@neuralmagic.com>
Co-authored-by: ApostaC <yihua98@uchicago.edu>
Co-authored-by: Robert Shaw <114415538+robertgshaw2-redhat@users.noreply.github.com>
Co-authored-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>
Co-authored-by: Robert Shaw <rshaw@neuralmagic.com>
richardsliu pushed a commit to richardsliu/vllm that referenced this pull request May 6, 2025