[KVCache] Initialize one extra page than specified (apache#16849)
This PR updates PagedKVCache to initialize one more page than
specified via the constructor. Applications usually depend on the
number of free pages (returned from `GetNumAvailablePages`) to
decide the KV cache operation policy. Without this extra page, the
KV cache reports no available pages even when the last allocated
pages are not full, which may give applications the false impression
that the KV cache is completely full and cause further issues.
MasterJH5574 authored Apr 7, 2024
1 parent a156181 commit a7be540
Showing 1 changed file with 2 additions and 2 deletions.
4 changes: 2 additions & 2 deletions src/runtime/relax_vm/paged_kv_cache.cc
@@ -1790,7 +1790,7 @@ TVM_REGISTER_GLOBAL("vm.builtin.paged_attention_kv_cache_create")
int64_t prefill_chunk_size = cache_config[2];
int64_t page_size = cache_config[3];
bool support_sliding_window = cache_config[4];
- int64_t num_total_pages = (total_token_capacity + page_size - 1) / page_size;
+ int64_t num_total_pages = (total_token_capacity + page_size - 1) / page_size + 1;
if (support_sliding_window) {
// When sliding window is enabled, each sequence may use two more pages at most.
num_total_pages += reserved_num_seqs * 2;
@@ -1827,7 +1827,7 @@ TVM_REGISTER_GLOBAL("vm.builtin.paged_attention_kv_cache_create_reduced")
int64_t prefill_chunk_size = cache_config[2];
int64_t page_size = cache_config[3];
bool support_sliding_window = cache_config[4];
- int64_t num_total_pages = (total_token_capacity + page_size - 1) / page_size;
+ int64_t num_total_pages = (total_token_capacity + page_size - 1) / page_size + 1;
if (support_sliding_window) {
// When sliding window is enabled, each sequence may use two more pages at most.
num_total_pages += reserved_num_seqs * 2;
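
To make the rationale in the commit message concrete, here is a minimal sketch of the page arithmetic. It is a simplified model, not the PagedKVCache implementation: the helpers `NumTotalPages`, `PagesInUse`, and `NumFreePages` are hypothetical names introduced for illustration, and the model assumes a single sequence and no sliding window.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not part of TVM): the page count as computed in the
// diff, with the extra page made optional so both behaviors can be compared.
int64_t NumTotalPages(int64_t total_token_capacity, int64_t page_size, bool extra_page) {
  assert(page_size > 0);
  int64_t pages = (total_token_capacity + page_size - 1) / page_size;  // ceiling division
  return extra_page ? pages + 1 : pages;
}

// Hypothetical model: a single sequence holding `num_tokens` tokens occupies
// ceil(num_tokens / page_size) pages; the last one may be only partially filled.
int64_t PagesInUse(int64_t num_tokens, int64_t page_size) {
  assert(page_size > 0);
  return (num_tokens + page_size - 1) / page_size;
}

// Free pages as a caller would see them in this simplified model.
int64_t NumFreePages(int64_t total_pages, int64_t num_tokens, int64_t page_size) {
  return total_pages - PagesInUse(num_tokens, page_size);
}
```

With a token capacity of 64 and a page size of 16, a sequence of 50 tokens occupies 4 pages, the last of which holds only 2 tokens. Under the old formula the model has 4 pages in total and reports 0 free pages, even though 14 token slots remain. With the extra page it has 5 pages and reports 1 free page, so a policy keyed on the free-page count does not conclude that the cache is full.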