Commit 0a3940f

[MISC] Add prefix cache reset to LMCache CPU offload example
Signed-off-by: zhou.qianjun <zhou.qianjun@zte.com.cn>
1 parent f381cf2 commit 0a3940f

1 file changed: +3 -0 lines changed

examples/others/lmcache/cpu_offload_lmcache.py

Lines changed: 3 additions & 0 deletions
@@ -144,6 +144,9 @@ def main():
     print_output(llm, first_prompt, sampling_params, "first")
 
     time.sleep(1)
+    # Clear vLLM's internal prefix cache to force the second request
+    # to fetch cached KVs from LMCache
+    llm.reset_prefix_cache()
 
     # print the second output
     print_output(llm, second_prompt, sampling_params, "second")
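
The reasoning behind the added call can be illustrated with a toy two-tier cache. This is only a sketch under stated assumptions: `TwoTierKVCache`, `reset_local`, and `run` are hypothetical names invented for illustration and are not part of vLLM or LMCache. The `local` dictionary stands in for the engine's internal prefix cache and the `external` dictionary stands in for an offload store such as LMCache.

```python
class TwoTierKVCache:
    """Toy model (hypothetical): a local prefix cache in front of an external store."""

    def __init__(self):
        self.local = {}     # stands in for the engine's internal prefix cache
        self.external = {}  # stands in for an offload store such as LMCache
        self.log = []       # records where each lookup was served from

    def get(self, prefix):
        # A local hit is served first, so the external store is never consulted.
        if prefix in self.local:
            self.log.append("local")
            return self.local[prefix]
        # On a local miss, fall back to the external store and repopulate locally.
        if prefix in self.external:
            self.log.append("external")
            value = self.external[prefix]
            self.local[prefix] = value
            return value
        # Miss in both tiers: compute the value and store it in both.
        self.log.append("computed")
        value = f"kv({prefix})"
        self.local[prefix] = value
        self.external[prefix] = value
        return value

    def reset_local(self):
        # Drops only the local tier; the external store keeps its entries.
        self.local.clear()


def run(reset_between_requests):
    """Issue two lookups for the same prefix, optionally resetting in between."""
    cache = TwoTierKVCache()
    cache.get("shared prefix")
    if reset_between_requests:
        cache.reset_local()
    cache.get("shared prefix")
    return cache.log


print(run(reset_between_requests=False))  # ['computed', 'local']
print(run(reset_between_requests=True))   # ['computed', 'external']
```

In this toy model, the second lookup reaches the external tier only when the local tier is cleared first, which is the effect the commit's comment describes for the second request in the example script.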
