vllm serve --load-format
kv_cache_dtype=fp8
fp8
Attention.kv_scale
k_scale
v_scale
vllm.__commit__