Merged
3 changes: 2 additions & 1 deletion vllm/config.py
```diff
@@ -325,7 +325,8 @@ def _verify_cache_dtype(self) -> None:
         elif self.cache_dtype == "fp8":
             if not is_hip():
                 nvcc_cuda_version = get_nvcc_cuda_version()
-                if nvcc_cuda_version < Version("11.8"):
+                if nvcc_cuda_version is not None \
+                        and nvcc_cuda_version < Version("11.8"):
```
Comment on lines 327 to +329

@chiragjn (Contributor) commented on May 1, 2024:
Thanks for this!

Just curious, why is this check on nvcc rather than on the libcudart version in the first place? E.g. torch.version.cuda or some other way?

I was wondering: if I have CUDA runtime 11.8 without nvcc installed, this condition evaluates to False and no error would be raised.

For the Docker image it would not matter, because it ships CUDA 12.x.
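To make that concern concrete, here is a minimal sketch of the short-circuit (the None value simulates a host with a CUDA runtime but no nvcc installed; the names mirror the diff above):

```python
from packaging.version import Version

# Simulate get_nvcc_cuda_version() on a host without nvcc installed:
# the helper returns None because `nvcc --version` cannot be run.
nvcc_cuda_version = None

if nvcc_cuda_version is not None \
        and nvcc_cuda_version < Version("11.8"):
    # Never reached without nvcc, even if the installed CUDA runtime
    # is older than 11.8, so the fp8 configuration passes validation.
    raise ValueError("FP8 is not supported when cuda version is lower than 11.8.")
print("no error raised; the fp8 check was silently skipped")
```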

Contributor:
It is better to check the CUDA version via libcudart, as the vllm-openai images ship only the CUDA runtime.
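A minimal sketch of that alternative, assuming PyTorch is importable: torch.version.cuda reports the CUDA runtime version PyTorch was built against as a string like "12.1", or None on CPU/ROCm builds, so it requires no nvcc. The function name _verify_fp8_cuda_support is hypothetical, not the code merged in this PR:

```python
from packaging.version import Version

import torch


def _verify_fp8_cuda_support() -> None:
    # torch.version.cuda is e.g. "12.1" on CUDA builds and None on
    # CPU/ROCm builds, so this works in runtime-only images without nvcc.
    cuda_version = torch.version.cuda
    if cuda_version is not None and Version(cuda_version) < Version("11.8"):
        raise ValueError(
            "FP8 is not supported when cuda version is "
            "lower than 11.8.")
```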

Contributor:
+1. I fixed a bug by removing the nvcc dependency: #4666

```diff
                     raise ValueError(
                         "FP8 is not supported when cuda version is "
                         "lower than 11.8.")
```