[Bugfix / Core] Prefix Caching Guards (merged with main) #4846

Merged
merged 32 commits into from
May 27, 2024
Changes from 10 commits
Commits
56680e7
added guards for prefix-caching. added ability to disable sliding window
robertgshaw2-neuralmagic Apr 7, 2024
64aac2e
format.sh
robertgshaw2-neuralmagic Apr 7, 2024
28ae0cc
added tests
robertgshaw2-neuralmagic Apr 7, 2024
2a01ae6
Merge remote-tracking branch 'upstream/main' into prefix-caching-guards
robertgshaw2-neuralmagic Apr 28, 2024
cd0f666
merge
robertgshaw2-neuralmagic Apr 28, 2024
f30c3de
removed images
robertgshaw2-neuralmagic Apr 28, 2024
da5a982
fixed bad merge
robertgshaw2-neuralmagic Apr 28, 2024
8502b6a
./format
robertgshaw2-neuralmagic Apr 28, 2024
1bef541
validated that prefix caching working on turing with recent update
robertgshaw2-neuralmagic Apr 28, 2024
6620e53
Merge branch 'main' into prefix-caching-guards
zhuohan123 May 16, 2024
8efc774
Merge branch 'main' into prefix-caching-guards-new
robertgshaw2-neuralmagic May 23, 2024
033c2c5
updated to remove sliding window usage in models
robertgshaw2-neuralmagic May 23, 2024
0638960
removed spurious changes
robertgshaw2-neuralmagic May 24, 2024
63c0097
revert change to make PR easier to read
robertgshaw2-neuralmagic May 24, 2024
4a3630c
more cleanup
robertgshaw2-neuralmagic May 24, 2024
3f73426
more cleanup
robertgshaw2-neuralmagic May 24, 2024
6f754c3
stash
robertgshaw2-neuralmagic May 24, 2024
a497b7b
cleanup prints and comments to match current for easier review
robertgshaw2-neuralmagic May 24, 2024
9fd64fe
more cleanup for PR readibility
robertgshaw2-neuralmagic May 24, 2024
1126c5a
more cleanup for PR readibility
robertgshaw2-neuralmagic May 24, 2024
8a53180
more cleanup for PR readibility
robertgshaw2-neuralmagic May 24, 2024
034bbde
removed from mixtral, need to fix qwen
robertgshaw2-neuralmagic May 24, 2024
b56352b
updated models to remove sliding window. Big update to qwen to preven…
robertgshaw2-neuralmagic May 24, 2024
8225c3f
format
robertgshaw2-neuralmagic May 24, 2024
13797c1
added test and fixed requirements dev
robertgshaw2-neuralmagic May 24, 2024
37efe98
added test
robertgshaw2-neuralmagic May 24, 2024
7b186c2
format
robertgshaw2-neuralmagic May 27, 2024
93bce37
updated comment
robertgshaw2-neuralmagic May 27, 2024
7c8a9d0
updated test
robertgshaw2-neuralmagic May 27, 2024
4285763
format
robertgshaw2-neuralmagic May 27, 2024
84253fd
fixed logging
robertgshaw2-neuralmagic May 27, 2024
7a61f51
Update test_disable_sliding_window.py
robertgshaw2-neuralmagic May 27, 2024
26 changes: 25 additions & 1 deletion tests/test_config.py
@@ -1,5 +1,29 @@
import pytest

from vllm.config import ModelConfig

MODEL_IDS_EXPECTED = [
    ("Qwen/Qwen1.5-7B", 32768),
    ("mistralai/Mistral-7B-v0.1", 4096),
    ("mistralai/Mistral-7B-Instruct-v0.2", 32768),
]


@pytest.mark.parametrize("model_id_expected", MODEL_IDS_EXPECTED)
def test_disable_sliding_window(model_id_expected):
    model_id, expected = model_id_expected
    model_config = ModelConfig(
        model_id,
        model_id,
        tokenizer_mode="auto",
        trust_remote_code=False,
        seed=0,
        dtype="float16",
        revision=None,
        disable_sliding_window=True,
    )
    assert model_config.max_model_len == expected


def test_get_sliding_window():
    TEST_SLIDING_WINDOW = 4096
@@ -36,4 +60,4 @@ def test_get_sliding_window():
    assert mistral_model_config.get_sliding_window() is None

    mistral_model_config.hf_config.sliding_window = TEST_SLIDING_WINDOW
    assert mistral_model_config.get_sliding_window() == TEST_SLIDING_WINDOW
    assert mistral_model_config.get_sliding_window() == TEST_SLIDING_WINDOW
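
The expected values in MODEL_IDS_EXPECTED are consistent with one simple rule: when the sliding window is disabled, the maximum model length is capped at the sliding window size if the model defines one. The sketch below illustrates that rule only. `capped_max_model_len` is a hypothetical helper, not vLLM's implementation, and the (max_position_embeddings, sliding_window) pairs are assumed values for the three checkpoints rather than values read from their configs here.

```python
from typing import Optional


def capped_max_model_len(
    max_position_embeddings: int,
    sliding_window: Optional[int],
    disable_sliding_window: bool,
) -> int:
    """Hypothetical helper (not part of vLLM): derive a max model length.

    If the sliding window is disabled and the model defines a window,
    the context is capped at the window size, since attention beyond it
    would otherwise rely on the sliding-window mechanism.
    """
    if disable_sliding_window and sliding_window is not None:
        return min(max_position_embeddings, sliding_window)
    return max_position_embeddings


# Assumed (max_position_embeddings, sliding_window) per checkpoint.
ASSUMED_CONFIGS = {
    "Qwen/Qwen1.5-7B": (32768, 32768),
    "mistralai/Mistral-7B-v0.1": (32768, 4096),
    "mistralai/Mistral-7B-Instruct-v0.2": (32768, None),
}

results = {
    name: capped_max_model_len(max_pos, window, disable_sliding_window=True)
    for name, (max_pos, window) in ASSUMED_CONFIGS.items()
}
```

Under these assumptions, `results` reproduces the three expected lengths used by test_disable_sliding_window.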