[Feature]: Support kv-cache-dtype fp8 without nvcc #4666

@samos123

🚀 The feature, motivation and pitch

Currently, using kv-cache-dtype fp8 throws the following error if nvcc is not installed:

WARNING 05-08 01:34:59 utils.py:313] Not found nvcc in /usr/local/cuda. Skip cuda version check!
Traceback (most recent call last):                                                                              
  File "/usr/lib/python3.10/runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,                                                                  
  File "/usr/lib/python3.10/runpy.py", line 86, in _run_code
    exec(code, run_globals)                                                                                     
  File "/usr/local/lib/python3.10/dist-packages/vllm/entrypoints/openai/api_server.py", line 159, in <module>
    engine = AsyncLLMEngine.from_engine_args(                                                                   
  File "/usr/local/lib/python3.10/dist-packages/vllm/engine/async_llm_engine.py", line 341, in from_engine_args
    engine_config = engine_args.create_engine_config()
  File "/usr/local/lib/python3.10/dist-packages/vllm/engine/arg_utils.py", line 471, in create_engine_config
    cache_config = CacheConfig(self.block_size,
  File "/usr/local/lib/python3.10/dist-packages/vllm/config.py", line 310, in __init__
    self._verify_cache_dtype()
  File "/usr/local/lib/python3.10/dist-packages/vllm/config.py", line 333, in _verify_cache_dtype
    if nvcc_cuda_version < Version("11.8"):
TypeError: '<' not supported between instances of 'NoneType' and 'Version'

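The crash happens because nvcc_cuda_version is None when nvcc is not found, and _verify_cache_dtype then compares it with Version("11.8"). We should be able to check the CUDA version without depending on nvcc, so let's remove this dependency.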
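One possible direction, as a rough sketch rather than a proposed patch: read the CUDA version that PyTorch was built against (torch.version.cuda) instead of asking nvcc, and fail with a clear message when no version can be determined. The helper names below (parse_cuda_version, detect_cuda_version, verify_fp8_kv_cache) are hypothetical and not part of vLLM, and plain tuple comparison stands in for the Version class used in config.py. Whether an unknown version should be an error or only a warning is an open design choice.

```python
from typing import Optional, Tuple

# Threshold taken from the comparison shown in the traceback above.
MIN_CUDA_FOR_FP8 = (11, 8)


def parse_cuda_version(text: Optional[str]) -> Optional[Tuple[int, ...]]:
    """Turn a string such as "12.1" into (12, 1); return None if unusable."""
    if not text:
        return None
    try:
        return tuple(int(part) for part in text.strip().split("."))
    except ValueError:
        return None


def detect_cuda_version() -> Optional[Tuple[int, ...]]:
    """Hypothetical helper: get the CUDA version from PyTorch, not nvcc."""
    try:
        import torch  # only needed for the real lookup
    except ImportError:
        return None
    # torch.version.cuda is a string like "12.1", or None on non-CUDA builds.
    return parse_cuda_version(torch.version.cuda)


def verify_fp8_kv_cache(cuda_version: Optional[Tuple[int, ...]]) -> None:
    """Raise a readable error instead of comparing None with a version."""
    if cuda_version is None:
        raise RuntimeError(
            "Could not determine the CUDA version; "
            "cannot verify fp8 kv-cache support."
        )
    if cuda_version < MIN_CUDA_FOR_FP8:
        raise ValueError("fp8 kv-cache requires CUDA 11.8 or newer.")
```

With this shape, the check would be called as verify_fp8_kv_cache(detect_cuda_version()), and a missing nvcc no longer leads to the TypeError above.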

Alternatives

No response

Additional context

No response

Labels: feature request, stale