Commit 03f63aa

KingsleyZhang123 and jinghui authored and committed
[v1][P/D] Fix a edge case in kv cache schedule (vllm-project#19182)
Co-authored-by: jinghui <jinghui@fb.com>
Signed-off-by: minpeter <kali2005611@gmail.com>
1 parent bb7f9c8 commit 03f63aa

1 file changed: +2 -0 lines changed

vllm/v1/core/sched/scheduler.py

Lines changed: 2 additions & 0 deletions
@@ -1009,6 +1009,8 @@ def _update_waiting_for_remote_kv(self, request: Request) -> bool:
         # Now that the blocks are ready, actually cache them.
         block_ids = self.kv_cache_manager.get_block_ids(request.request_id)[0]
         num_computed_tokens = len(block_ids) * self.block_size
+        # Handle the case where num request tokens is less than one block.
+        num_computed_tokens = min(num_computed_tokens, request.num_tokens)
         if num_computed_tokens == request.num_tokens:
             num_computed_tokens -= 1
         self.kv_cache_manager.cache_blocks(
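
To make the edge case concrete, here is a minimal standalone sketch of the arithmetic in this hunk. It is not the scheduler code: both function names are hypothetical, and plain integers stand in for `len(block_ids)`, `self.block_size`, and `request.num_tokens`. It only illustrates how the token count derived from whole blocks can exceed the request length, so that the `==` check that follows never fires unless the value is clamped first.

```python
def computed_tokens_without_clamp(num_blocks: int, block_size: int,
                                  num_tokens: int) -> int:
    """Hypothetical helper: the arithmetic as it was before this commit."""
    num_computed_tokens = num_blocks * block_size
    # The equality check misses when the block capacity exceeds num_tokens.
    if num_computed_tokens == num_tokens:
        num_computed_tokens -= 1
    return num_computed_tokens


def computed_tokens_with_clamp(num_blocks: int, block_size: int,
                               num_tokens: int) -> int:
    """Hypothetical helper: the arithmetic with the two added lines."""
    num_computed_tokens = num_blocks * block_size
    # Added by the commit: never report more tokens than the request has.
    num_computed_tokens = min(num_computed_tokens, num_tokens)
    if num_computed_tokens == num_tokens:
        num_computed_tokens -= 1
    return num_computed_tokens


if __name__ == "__main__":
    # A 5-token request that occupies a single 16-token block.
    print(computed_tokens_without_clamp(1, 16, 5))  # 16, more than the request has
    print(computed_tokens_with_clamp(1, 16, 5))     # 4
    # A request that exactly fills two blocks behaves the same either way.
    print(computed_tokens_without_clamp(2, 16, 32))  # 31
    print(computed_tokens_with_clamp(2, 16, 32))     # 31
```

The clamp changes the result only when the block capacity exceeds the request length; when the blocks cover the request exactly or only part of it, both versions agree.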
