fix: replace numeric_limits::infinity() with literal to fix CUDA 12.8 NVRTC compilation #327

Open
kuishou68 wants to merge 1 commit into deepseek-ai:main from kuishou68:fix-cuda128-nvrtc-constexpr
@kuishou68
Problem

On CUDA 12.8, NVRTC fails to compile smxx_clean_logits.cuh with:

error: expression must have a constant value
constexpr float neg_inf = -cute::numeric_limits<float>::infinity();

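cute::numeric_limits<T>::infinity() (cuda::std::numeric_limits under the hood) cannot be evaluated as a constant expression in the CUDA 12.8 NVRTC environment, so it cannot initialize a constexpr variable there.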

Solution

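Replace it with the raw float literal -1e38f. The literal is a true compile-time constant. It is finite rather than a real infinity, but it behaves like negative infinity for softmax masking: exp(-1e38f - max) underflows to exactly 0 in float32, so masked positions receive zero weight whenever the row contains at least one unmasked logit.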
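
As a rough host-side illustration of why a large finite literal can stand in for negative infinity in this role, the sketch below compares the two in a plain C++ softmax. It is not the kernel code from this PR: the names kMaskValue, softmax, and masks_agree are hypothetical, and the softmax is a generic numerically stable version assumed for the example.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical stand-in for the masking constant: a plain literal that is a
// constant expression without calling any library function.
constexpr float kMaskValue = -1e38f;

// The literal is finite: it sits above the most negative float (~ -3.4e38).
static_assert(kMaskValue > std::numeric_limits<float>::lowest(),
              "mask literal must remain a finite float");

// Generic numerically stable softmax over one row (illustrative only).
std::vector<float> softmax(const std::vector<float>& logits) {
    assert(!logits.empty());
    float row_max = logits[0];
    for (float v : logits) row_max = std::fmax(row_max, v);

    std::vector<float> out(logits.size());
    float sum = 0.0f;
    for (std::size_t i = 0; i < logits.size(); ++i) {
        out[i] = std::exp(logits[i] - row_max);  // underflows to 0 for masked entries
        sum += out[i];
    }
    for (float& v : out) v /= sum;
    return out;
}

// True when masking one position with the finite literal gives the same
// weights as masking it with a real negative infinity.
bool masks_agree(std::vector<float> logits, std::size_t masked_index) {
    std::vector<float> with_literal = logits;
    std::vector<float> with_inf = logits;
    with_literal[masked_index] = kMaskValue;
    with_inf[masked_index] = -std::numeric_limits<float>::infinity();
    return softmax(with_literal) == softmax(with_inf);
}
```

Calling masks_agree({1.0f, 2.0f, 0.5f}, 2) compares the two variants on a row with one masked entry. The two only diverge on a fully masked row: in this sketch the finite literal yields a uniform distribution, while a true infinity yields NaN.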

Testing

  • Fixes JIT compilation on CUDA 12.8 (H100)
  • Preserves numerical correctness for attention score masking
  • Compatible with both float32 and bfloat16 logits types
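
The bfloat16 point can be sanity-checked on the host, since bfloat16 keeps float32's 8-bit exponent and therefore has a similar range (maximum about 3.4e38). The sketch below assumes round-to-nearest-even conversion and uses a hypothetical helper, round_to_bf16, written for this example; it does not reproduce how the kernel itself converts values, and it skips NaN handling.

```cpp
#include <cmath>
#include <cstdint>
#include <cstring>

// Round a float32 to bfloat16 precision (round-to-nearest-even) and return
// the result widened back to float32. Hypothetical helper, NaN not handled.
float round_to_bf16(float x) {
    std::uint32_t bits;
    std::memcpy(&bits, &x, sizeof(bits));

    // Add half of the discarded range, plus the lowest kept bit to break ties
    // toward an even mantissa, then drop the low 16 bits.
    std::uint32_t rounding_bias = 0x7FFFu + ((bits >> 16) & 1u);
    bits += rounding_bias;
    bits &= 0xFFFF0000u;

    float out;
    std::memcpy(&out, &bits, sizeof(out));
    return out;
}

// True when the masking literal stays finite after bfloat16 rounding.
bool mask_literal_fits_bf16() {
    return std::isfinite(round_to_bf16(-1e38f));
}
```

The same literal would not survive a conversion to IEEE half precision (float16), whose largest finite value is 65504, so this check applies to bfloat16 only.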

Closes #295

… NVRTC compilation

- cute::numeric_limits<T>::infinity() is not constexpr in CUDA 12.8 NVRTC
- Use the -1e38f literal as a negative-infinity workaround (finite, but exp() of it underflows to 0, so it behaves like -inf for softmax masking)
- Fixes JIT compilation failure in smxx_clean_logits kernel on CUDA 12.8

Closes deepseek-ai#295
Development

Successfully merging this pull request may close these issues.

[Bug]: NVRTC JIT compilation fails on CUDA 12.8 (smxx_clean_logits.cuh: expression must have a constant value)