Commit

[MISC] Keep chunked prefill enabled by default with long context when prefix caching is enabled (vllm-project#8342)
comaniac authored Sep 10, 2024
1 parent e7e7445 commit 310cdfd
Showing 1 changed file with 0 additions and 1 deletion.
1 change: 0 additions & 1 deletion vllm/engine/arg_utils.py
@@ -878,7 +878,6 @@ def create_engine_config(self) -> EngineConfig:
             if (is_gpu and not use_sliding_window and not use_spec_decode
                     and not self.enable_lora
                     and not self.enable_prompt_adapter
-                    and not self.enable_prefix_caching
                     and not has_seqlen_agnostic_layers):
                 self.enable_chunked_prefill = True
                 logger.warning(
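
To illustrate the effect of the one-line deletion, here is a minimal standalone sketch of the condition visible in this hunk. It is not the vLLM implementation: `PrefillInputs`, `chunked_prefill_default`, and the `check_prefix_caching` switch are hypothetical names introduced for illustration, and any enclosing checks outside the hunk (such as the long-context test mentioned in the commit title) are omitted.

```python
from dataclasses import dataclass


@dataclass
class PrefillInputs:
    """Hypothetical container for the flags read by the condition in the hunk."""
    is_gpu: bool = True
    use_sliding_window: bool = False
    use_spec_decode: bool = False
    enable_lora: bool = False
    enable_prompt_adapter: bool = False
    enable_prefix_caching: bool = False
    has_seqlen_agnostic_layers: bool = False


def chunked_prefill_default(f: PrefillInputs, check_prefix_caching: bool) -> bool:
    """Return True when the hunk's condition would turn chunked prefill on.

    check_prefix_caching=True mimics the parent commit, where prefix caching
    blocked the default; False mimics this commit, where that clause is gone.
    """
    ok = (f.is_gpu and not f.use_sliding_window and not f.use_spec_decode
          and not f.enable_lora
          and not f.enable_prompt_adapter
          and not f.has_seqlen_agnostic_layers)
    if check_prefix_caching:
        # Clause removed by this commit.
        ok = ok and not f.enable_prefix_caching
    return ok


if __name__ == "__main__":
    flags = PrefillInputs(enable_prefix_caching=True)
    print("parent commit:", chunked_prefill_default(flags, check_prefix_caching=True))
    print("this commit:  ", chunked_prefill_default(flags, check_prefix_caching=False))
```

With prefix caching enabled and every other flag at its default, the sketch evaluates to disabled under the parent commit's condition and enabled under this commit's condition. The other exclusions (sliding window, speculative decoding, LoRA, prompt adapters, sequence-length-agnostic layers) are unchanged.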
