Skip to content

Commit 95f3845

Browse files
Yard1Alvant
authored andcommitted
[Bugfix] Add explicit end_forward calls to flashinfer (vllm-project#6044)
Signed-off-by: Alvant <alvasian@yandex.ru>
1 parent 79d7d66 commit 95f3845

File tree

1 file changed

+2
-0
lines changed

1 file changed

+2
-0
lines changed

vllm/attention/backends/flashinfer.py

Lines changed: 2 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -126,6 +126,7 @@ def begin_forward(self):
126126
self.paged_kv_indptr = self.paged_kv_indptr.to(self.device)
127127
self.paged_kv_last_page_len = self.paged_kv_last_page_len.to(
128128
self.device)
129+
self.prefill_wrapper.end_forward()
129130
self.prefill_wrapper.begin_forward(
130131
self.query_start_loc, self.paged_kv_indptr,
131132
self.paged_kv_indices, self.paged_kv_last_page_len,
@@ -142,6 +143,7 @@ def begin_forward(self):
142143
self.device)
143144

144145
assert self.decode_wrapper is not None
146+
self.decode_wrapper.end_forward()
145147
self.decode_wrapper.begin_forward(
146148
self.paged_kv_indptr,
147149
self.paged_kv_indices,

0 commit comments

Comments
 (0)