Releases: dougeeai/llama-cpp-python-wheels

llama-cpp-python 0.3.16 + CUDA 12.1 sm86 Ampere - Python 3.11 - Windows x64

03 Nov 03:39
2c432b1

Pre-built llama-cpp-python wheel for Windows with CUDA 12.1 support

Skip the build process entirely. This wheel is compiled and ready to install.

Requirements

  • OS: Windows 10/11 64-bit
  • Python: 3.11.x (exact version required)
  • CUDA: 12.1 (Toolkit not needed, just driver)
  • Driver: NVIDIA driver 527.41 or newer on Windows (525.60.13 is the equivalent Linux minimum)
  • GPU: NVIDIA RTX 3060, 3060 Ti, 3070, 3070 Ti, 3080, 3080 Ti, 3090, 3090 Ti, RTX A2000, RTX A4000, RTX A4500, RTX A5000, RTX A5500, RTX A6000
  • Architecture: Ampere (sm_86); see the compute-capability check after this list if you are unsure
  • VRAM: 8GB+ recommended
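
Not sure whether your card is sm_86? A quick check like the sketch below reads the compute capability via nvidia-smi (this assumes an nvidia-smi recent enough to expose the compute_cap query field): 8.6 means Ampere and matches this wheel, 8.9 means Ada Lovelace and you want the sm89 builds further down this page.

import subprocess

# Query the GPU's compute capability. Assumes nvidia-smi is on PATH and new
# enough to support the "compute_cap" field (older drivers may lack it).
cap = subprocess.check_output(
    ["nvidia-smi", "--query-gpu=compute_cap", "--format=csv,noheader"],
    text=True,
).strip()

# 8.6 -> Ampere (this sm86 wheel); 8.9 -> Ada Lovelace (use the sm89 wheels instead).
print("Compute capability:", cap)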

Installation

pip install llama_cpp_python-0.3.16+cuda12.1.sm86.ampere-cp311-cp311-win_amd64.whl
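
After installing, a minimal sanity check looks like this (a sketch, assuming the low-level llama_supports_gpu_offload binding from the llama.cpp C API is exposed in this version):

import sys

import llama_cpp

# This cp311 wheel only installs under Python 3.11.x.
assert sys.version_info[:2] == (3, 11), "this wheel targets Python 3.11"

# Should report 0.3.16 for this release.
print("llama-cpp-python version:", llama_cpp.__version__)

# True when the bundled llama.cpp was compiled with CUDA offload support.
print("GPU offload available:", llama_cpp.llama_supports_gpu_offload())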

What This Solves

  • No Visual Studio required
  • No CUDA Toolkit installation needed
  • No compilation errors
  • No "No CUDA toolset found" issues
  • Works immediately with GGUF models (see the loading sketch below)
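
A minimal loading sketch, assuming you already have a GGUF model on disk; the model path below is a placeholder, not a file shipped with this release:

from llama_cpp import Llama

llm = Llama(
    model_path="models/your-model.Q4_K_M.gguf",  # placeholder: point at any local GGUF file
    n_gpu_layers=-1,   # offload every layer to the GPU
    n_ctx=4096,        # context window; lower this if you run out of VRAM
    verbose=True,      # startup log should list the CUDA device it found
)

out = llm("Q: What is llama.cpp? A:", max_tokens=64)
print(out["choices"][0]["text"])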

Tested Configuration

  • Built on: Windows 11 with RTX 3090 Ti
  • Tested on: RTX 3090 Ti, RTX 3060 Ti
  • Build date: November 1, 2025
  • llama-cpp-python version: 0.3.16
  • Build flags: CMAKE_CUDA_ARCHITECTURES=86

Keywords

llama-cpp-python, CUDA 12.1, Python 3.11, Windows, RTX 3090, RTX 3080, RTX 3070, RTX 3060, RTX A6000, RTX A5000, RTX A4500, RTX A4000, RTX A2000, Ampere, sm86, GGUF, llama.cpp, no compilation, prebuilt wheel, Windows 11, Windows 10

llama-cpp-python 0.3.16 + CUDA 12.1 sm86 Ampere - Python 3.10 - Windows x64

03 Nov 03:40
2c432b1

Pre-built llama-cpp-python wheel for Windows with CUDA 12.1 support

Skip the build process entirely. This wheel is compiled and ready to install.

Requirements

  • OS: Windows 10/11 64-bit
  • Python: 3.10.x (exact version required)
  • CUDA: 12.1 (Toolkit not needed, just driver)
  • Driver: NVIDIA driver 527.41 or newer on Windows (525.60.13 is the equivalent Linux minimum)
  • GPU: NVIDIA RTX 3060, 3060 Ti, 3070, 3070 Ti, 3080, 3080 Ti, 3090, 3090 Ti, RTX A2000, RTX A4000, RTX A4500, RTX A5000, RTX A5500, RTX A6000
  • Architecture: Ampere (sm_86)
  • VRAM: 8GB+ recommended

Installation

pip install llama_cpp_python-0.3.16+cuda12.1.sm86.ampere-cp310-cp310-win_amd64.whl

What This Solves

  • No Visual Studio required
  • No CUDA Toolkit installation needed
  • No compilation errors
  • No "No CUDA toolset found" issues
  • Works immediately with GGUF models

Tested Configuration

  • Built on: Windows 11 with RTX 3090 Ti
  • Tested on: RTX 3090 Ti, RTX 3060 Ti
  • Build date: November 1, 2025
  • llama-cpp-python version: 0.3.16
  • Build flags: CMAKE_CUDA_ARCHITECTURES=86

Keywords

llama-cpp-python, CUDA 12.1, Python 3.10, Windows, RTX 3090, RTX 3080, RTX 3070, RTX 3060, RTX A6000, RTX A5000, RTX A4500, RTX A4000, RTX A2000, Ampere, sm86, GGUF, llama.cpp, no compilation, prebuilt wheel, Windows 11, Windows 10

llama-cpp-python 0.3.16 + CUDA 11.8 sm89 Ada Lovelace - Python 3.13 - Windows x64

03 Nov 03:59
2c432b1

Pre-built llama-cpp-python wheel for Windows with CUDA 11.8 support

Skip the build process entirely. This wheel is compiled and ready to install.

Requirements

  • OS: Windows 10/11 64-bit
  • Python: 3.13.x (exact version required)
  • CUDA: 11.8 (Toolkit not needed, just driver)
  • Driver: NVIDIA driver 452.39 or newer on Windows (450.80.02 is the equivalent Linux minimum)
  • GPU: NVIDIA RTX 4060, 4060 Ti, 4070, 4070 Ti, 4070 Ti Super, 4080, 4080 Super, 4090, RTX A6000 Ada, RTX 6000 Ada, RTX 5000 Ada, RTX 4500 Ada, RTX 4000 Ada, RTX 4000 SFF Ada, L40, L40S
  • Architecture: Ada Lovelace (sm_89)
  • VRAM: 8GB+ recommended

Installation

pip install llama_cpp_python-0.3.16+cuda11.8.sm89.ada-cp313-cp313-win_amd64.whl

What This Solves

  • No Visual Studio required
  • No CUDA Toolkit installation needed
  • No compilation errors
  • No "No CUDA toolset found" issues
  • Works immediately with GGUF models

Tested Configuration

  • Built on: Windows 11
  • Tested on: RTX A6000 Ada, RTX 5000 Ada
  • Build date: November 2, 2025
  • llama-cpp-python version: 0.3.16
  • Build flags: CMAKE_CUDA_ARCHITECTURES=89 (see the rebuild sketch below)
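
For reference, a rebuild-from-source sketch with the same architecture flag, useful only if you need a target other than sm_89; this assumes the usual CMAKE_ARGS route with GGML_CUDA=on as the CUDA switch for this version and, unlike the wheel, it requires Visual Studio and the CUDA Toolkit:

import os
import subprocess
import sys

# Hypothetical rebuild using the same arch flag as this release; swap 89 for your
# GPU's compute capability (e.g. 86 for Ampere). Requires local build tools.
env = dict(os.environ)
env["CMAKE_ARGS"] = "-DGGML_CUDA=on -DCMAKE_CUDA_ARCHITECTURES=89"

subprocess.check_call(
    [sys.executable, "-m", "pip", "install", "--no-cache-dir",
     "--force-reinstall", "llama-cpp-python==0.3.16"],
    env=env,
)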

Keywords

llama-cpp-python, CUDA 11.8, Python 3.13, Windows, RTX 4090, RTX 4080, RTX 4070, RTX 4060, RTX A6000 Ada, RTX 6000 Ada, RTX 5000 Ada, L40, L40S, Ada Lovelace, sm89, GGUF, llama.cpp, no compilation, prebuilt wheel, Windows 11, Windows 10

llama-cpp-python 0.3.16 + CUDA 11.8 sm89 Ada Lovelace - Python 3.12 - Windows x64

03 Nov 04:01
2c432b1

Pre-built llama-cpp-python wheel for Windows with CUDA 11.8 support

Skip the build process entirely. This wheel is compiled and ready to install.

Requirements

  • OS: Windows 10/11 64-bit
  • Python: 3.12.x (exact version required)
  • CUDA: 11.8 (Toolkit not needed, just driver)
  • Driver: NVIDIA driver 452.39 or newer on Windows (450.80.02 is the equivalent Linux minimum)
  • GPU: NVIDIA RTX 4060, 4060 Ti, 4070, 4070 Ti, 4070 Ti Super, 4080, 4080 Super, 4090, RTX A6000 Ada, RTX 6000 Ada, RTX 5000 Ada, RTX 4500 Ada, RTX 4000 Ada, RTX 4000 SFF Ada, L40, L40S
  • Architecture: Ada Lovelace (sm_89)
  • VRAM: 8GB+ recommended

Installation

pip install llama_cpp_python-0.3.16+cuda11.8.sm89.ada-cp312-cp312-win_amd64.whl

What This Solves

  • No Visual Studio required
  • No CUDA Toolkit installation needed
  • No compilation errors
  • No "No CUDA toolset found" issues
  • Works immediately with GGUF models

Tested Configuration

  • Built on: Windows 11
  • Tested on: RTX A6000 Ada, RTX 5000 Ada
  • Build date: November 2, 2025
  • llama-cpp-python version: 0.3.16
  • Build flags: CMAKE_CUDA_ARCHITECTURES=89

Keywords

llama-cpp-python, CUDA 11.8, Python 3.12, Windows, RTX 4090, RTX 4080, RTX 4070, RTX 4060, RTX A6000 Ada, RTX 6000 Ada, RTX 5000 Ada, L40, L40S, Ada Lovelace, sm89, GGUF, llama.cpp, no compilation, prebuilt wheel, Windows 11, Windows 10

llama-cpp-python 0.3.16 + CUDA 11.8 sm89 Ada Lovelace - Python 3.11 - Windows x64

03 Nov 04:01
2c432b1

Pre-built llama-cpp-python wheel for Windows with CUDA 11.8 support

Skip the build process entirely. This wheel is compiled and ready to install.

Requirements

  • OS: Windows 10/11 64-bit
  • Python: 3.11.x (exact version required)
  • CUDA: 11.8 (Toolkit not needed, just driver)
  • Driver: NVIDIA driver 452.39 or newer on Windows (450.80.02 is the equivalent Linux minimum)
  • GPU: NVIDIA RTX 4060, 4060 Ti, 4070, 4070 Ti, 4070 Ti Super, 4080, 4080 Super, 4090, RTX A6000 Ada, RTX 6000 Ada, RTX 5000 Ada, RTX 4500 Ada, RTX 4000 Ada, RTX 4000 SFF Ada, L40, L40S
  • Architecture: Ada Lovelace (sm_89)
  • VRAM: 8GB+ recommended

Installation

pip install llama_cpp_python-0.3.16+cuda11.8.sm89.ada-cp311-cp311-win_amd64.whl

What This Solves

  • No Visual Studio required
  • No CUDA Toolkit installation needed
  • No compilation errors
  • No "No CUDA toolset found" issues
  • Works immediately with GGUF models

Tested Configuration

  • Built on: Windows 11
  • Tested on: RTX A6000 Ada, RTX 5000 Ada
  • Build date: November 2, 2025
  • llama-cpp-python version: 0.3.16
  • Build flags: CMAKE_CUDA_ARCHITECTURES=89

Keywords

llama-cpp-python, CUDA 11.8, Python 3.11, Windows, RTX 4090, RTX 4080, RTX 4070, RTX 4060, RTX A6000 Ada, RTX 6000 Ada, RTX 5000 Ada, L40, L40S, Ada Lovelace, sm89, GGUF, llama.cpp, no compilation, prebuilt wheel, Windows 11, Windows 10

llama-cpp-python 0.3.16 + CUDA 11.8 sm89 Ada Lovelace - Python 3.10 - Windows x64

03 Nov 04:02
2c432b1

Pre-built llama-cpp-python wheel for Windows with CUDA 11.8 support

Skip the build process entirely. This wheel is compiled and ready to install.

Requirements

  • OS: Windows 10/11 64-bit
  • Python: 3.10.x (exact version required)
  • CUDA: 11.8 (Toolkit not needed, just driver)
  • Driver: NVIDIA driver 452.39 or newer on Windows (450.80.02 is the equivalent Linux minimum)
  • GPU: NVIDIA RTX 4060, 4060 Ti, 4070, 4070 Ti, 4070 Ti Super, 4080, 4080 Super, 4090, RTX A6000 Ada, RTX 6000 Ada, RTX 5000 Ada, RTX 4500 Ada, RTX 4000 Ada, RTX 4000 SFF Ada, L40, L40S
  • Architecture: Ada Lovelace (sm_89)
  • VRAM: 8GB+ recommended

Installation

pip install llama_cpp_python-0.3.16+cuda11.8.sm89.ada-cp310-cp310-win_amd64.whl

What This Solves

  • No Visual Studio required
  • No CUDA Toolkit installation needed
  • No compilation errors
  • No "No CUDA toolset found" issues
  • Works immediately with GGUF models

Tested Configuration

  • Built on: Windows 11
  • Tested on: RTX A6000 Ada, RTX 5000 Ada
  • Build date: November 2, 2025
  • llama-cpp-python version: 0.3.16
  • Build flags: CMAKE_CUDA_ARCHITECTURES=89

Keywords

llama-cpp-python, CUDA 11.8, Python 3.10, Windows, RTX 4090, RTX 4080, RTX 4070, RTX 4060, RTX A6000 Ada, RTX 6000 Ada, RTX 5000 Ada, L40, L40S, Ada Lovelace, sm89, GGUF, llama.cpp, no compilation, prebuilt wheel, Windows 11, Windows 10

llama-cpp-python 0.3.16 + CUDA 11.8 sm86 Ampere - Python 3.13 - Windows x64

03 Nov 03:42
2c432b1

Pre-built llama-cpp-python wheel for Windows with CUDA 11.8 support

Skip the build process entirely. This wheel is compiled and ready to install.

Requirements

  • OS: Windows 10/11 64-bit
  • Python: 3.13.x (exact version required)
  • CUDA: 11.8 (Toolkit not needed, just driver)
  • Driver: NVIDIA driver 452.39 or newer on Windows (450.80.02 is the equivalent Linux minimum)
  • GPU: NVIDIA RTX 3060, 3060 Ti, 3070, 3070 Ti, 3080, 3080 Ti, 3090, 3090 Ti, RTX A2000, RTX A4000, RTX A4500, RTX A5000, RTX A5500, RTX A6000
  • Architecture: Ampere (sm_86)
  • VRAM: 8GB+ recommended

Installation

pip install llama_cpp_python-0.3.16+cuda11.8.sm86.ampere-cp313-cp313-win_amd64.whl

What This Solves

  • No Visual Studio required
  • No CUDA Toolkit installation needed
  • No compilation errors
  • No "No CUDA toolset found" issues
  • Works immediately with GGUF models

Tested Configuration

  • Built on: Windows 11 with RTX 3090 Ti
  • Tested on: RTX 3090 Ti, RTX 3060 Ti
  • Build date: November 2, 2025
  • llama-cpp-python version: 0.3.16
  • Build flags: CMAKE_CUDA_ARCHITECTURES=86

Keywords

llama-cpp-python, CUDA 11.8, Python 3.13, Windows, RTX 3090, RTX 3080, RTX 3070, RTX 3060, RTX A6000, RTX A5000, RTX A4500, RTX A4000, RTX A2000, Ampere, sm86, GGUF, llama.cpp, no compilation, prebuilt wheel, Windows 11, Windows 10

llama-cpp-python 0.3.16 + CUDA 11.8 sm86 Ampere - Python 3.12 - Windows x64

03 Nov 03:42
2c432b1

Pre-built llama-cpp-python wheel for Windows with CUDA 11.8 support

Skip the build process entirely. This wheel is compiled and ready to install.

Requirements

  • OS: Windows 10/11 64-bit
  • Python: 3.12.x (exact version required)
  • CUDA: 11.8 (Toolkit not needed, just driver)
  • Driver: NVIDIA driver 452.39 or newer on Windows (450.80.02 is the equivalent Linux minimum)
  • GPU: NVIDIA RTX 3060, 3060 Ti, 3070, 3070 Ti, 3080, 3080 Ti, 3090, 3090 Ti, RTX A2000, RTX A4000, RTX A4500, RTX A5000, RTX A5500, RTX A6000
  • Architecture: Ampere (sm_86)
  • VRAM: 8GB+ recommended

Installation

pip install llama_cpp_python-0.3.16+cuda11.8.sm86.ampere-cp312-cp312-win_amd64.whl

What This Solves

  • No Visual Studio required
  • No CUDA Toolkit installation needed
  • No compilation errors
  • No "No CUDA toolset found" issues
  • Works immediately with GGUF models

Tested Configuration

  • Built on: Windows 11 with RTX 3090 Ti
  • Tested on: RTX 3090 Ti, RTX 3060 Ti
  • Build date: November 2, 2025
  • llama-cpp-python version: 0.3.16
  • Build flags: CMAKE_CUDA_ARCHITECTURES=86

Keywords

llama-cpp-python, CUDA 11.8, Python 3.12, Windows, RTX 3090, RTX 3080, RTX 3070, RTX 3060, RTX A6000, RTX A5000, RTX A4500, RTX A4000, RTX A2000, Ampere, sm86, GGUF, llama.cpp, no compilation, prebuilt wheel, Windows 11, Windows 10

llama-cpp-python 0.3.16 + CUDA 11.8 sm86 Ampere - Python 3.11 - Windows x64

03 Nov 03:43
2c432b1

Pre-built llama-cpp-python wheel for Windows with CUDA 11.8 support

Skip the build process entirely. This wheel is compiled and ready to install.

Requirements

  • OS: Windows 10/11 64-bit
  • Python: 3.11.x (exact version required)
  • CUDA: 11.8 (Toolkit not needed, just driver)
  • Driver: NVIDIA driver 452.39 or newer on Windows (450.80.02 is the equivalent Linux minimum)
  • GPU: NVIDIA RTX 3060, 3060 Ti, 3070, 3070 Ti, 3080, 3080 Ti, 3090, 3090 Ti, RTX A2000, RTX A4000, RTX A4500, RTX A5000, RTX A5500, RTX A6000
  • Architecture: Ampere (sm_86)
  • VRAM: 8GB+ recommended

Installation

pip install llama_cpp_python-0.3.16+cuda11.8.sm86.ampere-cp311-cp311-win_amd64.whl

What This Solves

  • No Visual Studio required
  • No CUDA Toolkit installation needed
  • No compilation errors
  • No "No CUDA toolset found" issues
  • Works immediately with GGUF models

Tested Configuration

  • Built on: Windows 11 with RTX 3090 Ti
  • Tested on: RTX 3090 Ti, RTX 3060 Ti
  • Build date: November 2, 2025
  • llama-cpp-python version: 0.3.16
  • Build flags: CMAKE_CUDA_ARCHITECTURES=86

Keywords

llama-cpp-python, CUDA 11.8, Python 3.11, Windows, RTX 3090, RTX 3080, RTX 3070, RTX 3060, RTX A6000, RTX A5000, RTX A4500, RTX A4000, RTX A2000, Ampere, sm86, GGUF, llama.cpp, no compilation, prebuilt wheel, Windows 11, Windows 10

llama-cpp-python 0.3.16 + CUDA 11.8 sm86 Ampere - Python 3.10 - Windows x64

03 Nov 03:44
2c432b1

Pre-built llama-cpp-python wheel for Windows with CUDA 11.8 support

Skip the build process entirely. This wheel is compiled and ready to install.

Requirements

  • OS: Windows 10/11 64-bit
  • Python: 3.10.x (exact version required)
  • CUDA: 11.8 (Toolkit not needed, just driver)
  • Driver: NVIDIA driver 452.39 or newer on Windows (450.80.02 is the equivalent Linux minimum)
  • GPU: NVIDIA RTX 3060, 3060 Ti, 3070, 3070 Ti, 3080, 3080 Ti, 3090, 3090 Ti, RTX A2000, RTX A4000, RTX A4500, RTX A5000, RTX A5500, RTX A6000
  • Architecture: Ampere (sm_86)
  • VRAM: 8GB+ recommended

Installation

pip install llama_cpp_python-0.3.16+cuda11.8.sm86.ampere-cp310-cp310-win_amd64.whl

What This Solves

  • No Visual Studio required
  • No CUDA Toolkit installation needed
  • No compilation errors
  • No "No CUDA toolset found" issues
  • Works immediately with GGUF models

Tested Configuration

  • Built on: Windows 11 with RTX 3090 Ti
  • Tested on: RTX 3090 Ti, RTX 3060 Ti
  • Build date: November 2, 2025
  • llama-cpp-python version: 0.3.16
  • Build flags: CMAKE_CUDA_ARCHITECTURES=86

Keywords

llama-cpp-python, CUDA 11.8, Python 3.10, Windows, RTX 3090, RTX 3080, RTX 3070, RTX 3060, RTX A6000, RTX A5000, RTX A4500, RTX A4000, RTX A2000, Ampere, sm86, GGUF, llama.cpp, no compilation, prebuilt wheel, Windows 11, Windows 10