Description
Your current environment
No response
Model Input Dumps
No response
🐛 Describe the bug
In the current implementation of MambaCacheManager._assign_seq_id_to_cache_index, if cur_id is not amongst the finished requests, it will try to pop a free_cache_index.
- However, there appears to be an edge case where _assign_seq_id_to_cache_index aggressively pops free indices before _release_finished_requests has a chance to return them.
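To make the suspected ordering problem concrete, here is a minimal sketch of a slot-based cache. It is a toy model, not the vLLM implementation: ToyCacheManager, assign, release, and run_step are hypothetical names, and the only thing it is meant to show is that popping slots for new requests before finished requests have returned theirs can empty the free list even when total demand fits within capacity.

```python
class ToyCacheManager:
    """Hypothetical stand-in for a slot-based cache manager (not vLLM code)."""

    def __init__(self, max_batch_size: int) -> None:
        self.free_cache_indices = list(range(max_batch_size))
        self.req_to_index: dict[str, int] = {}

    def assign(self, req_id: str) -> int:
        # Reuse the slot a request already owns; otherwise take a free one.
        if req_id not in self.req_to_index:
            # Raises IndexError when no free slot is left.
            self.req_to_index[req_id] = self.free_cache_indices.pop()
        return self.req_to_index[req_id]

    def release(self, finished_req_ids: list[str]) -> None:
        # Return the slots of finished requests to the free list.
        for req_id in finished_req_ids:
            index = self.req_to_index.pop(req_id, None)
            if index is not None:
                self.free_cache_indices.append(index)


def run_step(manager: ToyCacheManager, new_ids: list[str],
             finished_ids: list[str], release_first: bool) -> list[int]:
    # The two orderings differ only in when finished slots are returned.
    if release_first:
        manager.release(finished_ids)
    slots = [manager.assign(req_id) for req_id in new_ids]
    if not release_first:
        manager.release(finished_ids)
    return slots


def demo() -> tuple[bool, ...]:
    """Report, for each ordering, whether the second step succeeds."""
    results = []
    for release_first in (True, False):
        manager = ToyCacheManager(max_batch_size=2)
        run_step(manager, ["a", "b"], [], release_first)
        try:
            # "a" and "b" have finished; "c" and "d" arrive in the same step.
            run_step(manager, ["c", "d"], ["a", "b"], release_first)
            results.append(True)
        except IndexError:
            results.append(False)
    return tuple(results)
```

In this toy model demo() returns (True, False): releasing first succeeds, while assigning first fails with the same "pop from empty list" error even though no more than two requests are ever live at once.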
We have some private experiments involving Mamba that reuse the above MambaCacheManager implementation, and we have observed errors like the one below:
File "/net/storage149/mnt/md0/nmg/miniconda3/envs/vllm-mamba/lib/python3.10/site-packages/vllm/model_executor/models/jamba.py", line 441, in forward
) = self.mamba_cache.current_run_tensors(input_ids, attn_metadata,
File "/net/storage149/mnt/md0/nmg/miniconda3/envs/vllm-mamba/lib/python3.10/site-packages/vllm/model_executor/models/mamba_cache.py", line 54, in current_run_tensors
state_indices = self._prepare_current_run_mamba_cache(
File "/net/storage149/mnt/md0/nmg/miniconda3/envs/vllm-mamba/lib/python3.10/site-packages/vllm/model_executor/models/mamba_cache.py", line 144, in _prepare_current_run_mamba_cache
return [
File "/net/storage149/mnt/md0/nmg/miniconda3/envs/vllm-mamba/lib/python3.10/site-packages/vllm/model_executor/models/mamba_cache.py", line 145, in <listcomp>
self._assign_seq_id_to_cache_index(req_id, seq_id,
File "/net/storage149/mnt/md0/nmg/miniconda3/envs/vllm-mamba/lib/python3.10/site-packages/vllm/model_executor/models/mamba_cache.py", line 119, in _assign_seq_id_to_cache_index
destination_index = self.free_cache_indices.pop()
IndexError: pop from empty list
which is consistent with the edge case described above.
We have made sure that MambaCacheManager is initialized with max_batch_size equal to scheduler_config.max_num_seqs, which we have set to 10 times our batch_size. We use around 8 scheduler steps.
Question: how can we be sure that the cache occupancy will never exceed max_batch_size?
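While debugging, one option is to check occupancy explicitly before any slot is popped, so that over-subscription produces a descriptive error instead of a bare IndexError. The following is only a sketch under assumptions: slots_needed and check_capacity are hypothetical helpers, not part of vLLM, and they assume the caller can supply the set of request IDs that already own a slot.

```python
def slots_needed(assigned: set[str], incoming: list[str]) -> int:
    """Count incoming request IDs that do not already own a cache slot."""
    return len(set(incoming) - assigned)


def check_capacity(free_slots: int, assigned: set[str],
                   incoming: list[str]) -> None:
    """Raise a descriptive error if this step would exhaust the free list."""
    needed = slots_needed(assigned, incoming)
    if needed > free_slots:
        raise RuntimeError(
            f"cache over-subscribed: need {needed} new slot(s), "
            f"only {free_slots} free, {len(assigned)} already assigned"
        )
```

Calling a check like this at the start of each step would report how many slots were held and how many were requested at the moment of failure, which helps distinguish a genuine capacity shortfall from slots that were released too late.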
CC: @nelsonspbr