Pre-built onnxruntime-gpu wheel with native CUDA kernels for NVIDIA Blackwell GPUs (RTX 5090, 5080, 5070 Ti, 5070).
The official PyPI onnxruntime-gpu package does not include sm_120 kernels, so CUDAExecutionProvider is unavailable on Blackwell cards and all operations fall back to CPU.
Grab the .whl from the Releases page and install it with pip:

    pip install onnxruntime_gpu-1.24.1-cp312-cp312-win_amd64.whl

Or directly from the release:

    pip install https://github.com/Natfii/onnxruntime-gpu-blackwell/releases/download/v1.24.1/onnxruntime_gpu-1.24.1-cp312-cp312-win_amd64.whl

Then verify that the CUDA provider is available:

    import onnxruntime as ort
    print(ort.__version__)  # 1.24.1
    print(ort.get_available_providers())  # ['CUDAExecutionProvider', 'CPUExecutionProvider']

| Component | Value |
| --- | --- |
| onnxruntime | 1.24.1 |
| CUDA | 13.1 |
| cuDNN | 9.19.0.56 |
| CUDA arch | sm_120 (Blackwell) |
| Python | 3.12 (CPython) |
| Platform | Windows x86_64 |
| Compiler | MSVC 14.44 (VS 2022 17.x) |
| Generator | Ninja |

Built from the official onnxruntime source with CMAKE_CUDA_ARCHITECTURES=120.

Runtime requirements:
- NVIDIA GPU driver 591+ (Blackwell support)
- CUDA Toolkit 13.1 runtime DLLs on PATH
- cuDNN 9.x for CUDA 13
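
The requirements above can be spot-checked before importing onnxruntime. The following is a minimal sketch, not part of this project: it searches the directories on PATH for the CUDA runtime and cuDNN DLLs. The DLL names `cudart64_13.dll` and `cudnn64_9.dll` are assumptions based on NVIDIA's usual Windows naming, and the helper names are hypothetical; adjust them to match the files in your installation.

```python
import os
from pathlib import Path

# Assumed DLL names for CUDA 13.x and cuDNN 9.x on Windows; verify against your install.
ASSUMED_DLLS = ("cudart64_13.dll", "cudnn64_9.dll")


def find_on_path(filename, path_value=None):
    """Return the first directory in a PATH-style string that contains filename, else None."""
    if path_value is None:
        path_value = os.environ.get("PATH", "")
    for entry in path_value.split(os.pathsep):
        if entry and (Path(entry) / filename).is_file():
            return Path(entry)
    return None


def report_missing(dll_names=ASSUMED_DLLS, path_value=None):
    """Return the DLL names that could not be found on PATH."""
    return [name for name in dll_names if find_on_path(name, path_value) is None]


if __name__ == "__main__":
    missing = report_missing()
    print("Missing from PATH:", missing if missing else "none")
```

An empty result only shows that files with those names are reachable through PATH; it does not confirm that their versions match the ones this wheel was built against.
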
As of February 2026, the official onnxruntime-gpu pip package ships kernels up to sm_89/sm_90 (Ada Lovelace / Hopper). Blackwell (sm_120) is not yet supported in the prebuilt wheels, so CUDAExecutionProvider is not available and all inference falls back to CPU.
This wheel enables CUDAExecutionProvider on Blackwell by including natively compiled sm_120 kernels.
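
Because a silent CPU fallback is the failure mode described here, it can help to request the CUDA provider explicitly and check what the session actually attached. The sketch below is illustrative only: `choose_providers` and `create_session` are hypothetical helper names, and the model path is whatever ONNX file you are loading.

```python
def choose_providers(available):
    """Put CUDAExecutionProvider first when present and always keep the CPU fallback."""
    preferred = ["CUDAExecutionProvider", "CPUExecutionProvider"]
    return [p for p in preferred if p in available or p == "CPUExecutionProvider"]


def create_session(model_path):
    """Create a session and raise if the CUDA provider was not actually attached."""
    import onnxruntime as ort  # imported lazily so choose_providers works without it

    providers = choose_providers(ort.get_available_providers())
    session = ort.InferenceSession(model_path, providers=providers)
    # get_providers() lists the providers the session ended up using.
    if "CUDAExecutionProvider" not in session.get_providers():
        raise RuntimeError("CUDAExecutionProvider not active; inference would run on CPU")
    return session
```

Raising instead of continuing on CPU is a design choice for this sketch; a warning may be more appropriate if CPU inference is acceptable for your workload.
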
Even with sm_120 kernels, some ONNX models (e.g. Kokoro TTS) will still log warnings like:

    OP Conv(...) running in Fallback mode. May be extremely slow.

This is a cuDNN algorithm selection issue, not a CUDA architecture issue. cuDNN 9.x does not yet have optimized Conv algorithms for certain kernel shapes on Blackwell. The Conv ops still run on the GPU, not the CPU; they just use a slower, generic cuDNN code path. In practice the performance impact is minor for small models.
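
Whether the fallback path matters for a particular model is easiest to settle by measuring. The helper below is a rough sketch, not a benchmark harness: `median_latency_ms` is a hypothetical name, and the warm-up and iteration counts are arbitrary defaults. Warm-up calls are excluded because the first runs of a session often include one-time setup costs.

```python
import statistics
import time


def median_latency_ms(run, warmup=3, iters=20):
    """Call run() warmup times untimed, then return the median of iters timed calls in ms."""
    for _ in range(warmup):
        run()
    samples = []
    for _ in range(iters):
        start = time.perf_counter()
        run()
        samples.append((time.perf_counter() - start) * 1000.0)
    return statistics.median(samples)


# Hypothetical usage with an existing session and input feed:
#   latency = median_latency_ms(lambda: session.run(None, feed))
```

Comparing the result for a CUDA session against a CPU-only session on the same model gives a quick indication of how much the generic Conv path costs in practice.
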
To suppress the warning spam, set the ONNX Runtime log severity to ERROR:

    import onnxruntime as ort

    sess_opts = ort.SessionOptions()
    sess_opts.log_severity_level = 3  # 3 = ERROR; hides WARNING-level messages

ONNX Runtime is licensed under the MIT License.