Automated, reproducible build scripts for Windows 10/11 that get you from a clean machine to a working llama.cpp toolchain with minimal friction.
These scripts can install prerequisites (Git, CMake, Visual Studio + C++ workload, CUDA) and then configure + compile llama.cpp using CMake (Ninja, Release).
License: GPLv3
Upstream project: ggml-org/llama.cpp
Clone URL: https://github.com/ggml-org/llama.cpp.git (you don't need to clone the repo yourself; the installer does this automatically for you)
- llama_install_prerequisites_and_build.bat: end-to-end “happy path” for a fresh Windows install. Installs prerequisites if missing (Winget, Git, CMake, Visual Studio Community 2022 + Desktop C++ workload, CUDA Toolkit), then clones and builds llama.cpp.
- llama_build_only.bat: fast path when your machine is already set up. Clones the latest llama.cpp from ggml-org, configures CMake, and builds it.
- install_cuda.bat (helper): called automatically by the main script; you normally do not run this yourself. Detects CUDA via nvcc; if missing, installs a default CUDA version (currently 12.8).
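A minimal sketch of that detect-then-install pattern, assuming the commonly used winget package ID Nvidia.CUDA (the actual helper may pin the CUDA version differently):

```bat
rem Sketch only: install the CUDA Toolkit when nvcc is not already on PATH
rem (package ID and flags are assumptions, not copied from install_cuda.bat)
where nvcc >nul 2>&1
if errorlevel 1 (
    winget install --id Nvidia.CUDA -e --accept-package-agreements --accept-source-agreements
)
```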
Tested with RTX 3060, RTX 3090, and RTX 5090 on Windows 10/11.
Run (as Administrator)
llama_install_prerequisites_and_build.bat
- The script checks/installs prerequisites with winget.
- If Visual Studio or CUDA are missing, it installs them (you may be prompted and/or a reboot may be required).
- After prerequisites are in place, it configures CMake (Ninja, Release) and builds llama.cpp.
- Build flags (CUDA architectures) are automatically adapted to the detected GPU(s); see the detection sketch below this list.
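One plausible way to perform that detection (an illustrative approach, not necessarily the script's exact logic) is to query the GPU compute capability with nvidia-smi and map it to a CMake architecture value:

```bat
rem Sketch only: 8.6 (RTX 3060/3090) maps to 86, 12.0 (RTX 5090) maps to 120
for /f "delims=" %%c in ('nvidia-smi --query-gpu=compute_cap --format=csv,noheader') do (
    echo Detected compute capability: %%c
)
```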
Already set up? If you know Git, CMake, the Visual Studio C++ workload, and (optionally) CUDA are installed, run:
llama_build_only.bat
Note: At the end of a run, the script prints the path to a detailed log file (look for a line such as
[LOG] Details: "...").
- Winget (via App Installer on Windows 10/11)
- Git
- CMake
- Visual Studio Community 2022 + Desktop development with C++ workload
- CUDA Toolkit (default 12.8; only installed if nvcc is not found)
The main script attempts to install only what’s missing. On Windows 10, ensure App Installer is up to date (for winget).
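The check-then-install pattern looks roughly like this (a sketch using the standard winget IDs; the scripts' exact flags may differ):

```bat
rem Sketch only: install a prerequisite only when it is not already on PATH
where git >nul 2>&1 || winget install --id Git.Git -e --accept-package-agreements --accept-source-agreements
where cmake >nul 2>&1 || winget install --id Kitware.CMake -e --accept-package-agreements --accept-source-agreements
```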
The scripts configure and build llama.cpp using CMake + Ninja in Release mode. Typical flags include enabling the CUDA backend. A representative configuration (subject to upstream changes and your GPU) looks like:
cmake -S . -B build -G Ninja ^
  -DGGML_CUDA=ON ^
  -DGGML_CUDA_GRAPHS=ON ^
  -DGGML_CUDA_FA_ALL_QUANTS=ON ^
  -DLLAMA_CURL=ON ^
  -DCMAKE_CUDA_ARCHITECTURES=86;120 ^
  -DCMAKE_BUILD_TYPE=Release

Output binaries (depending on the current llama.cpp layout) typically land in:
build\bin\
Expect tools such as llama-cli.exe, llama-server.exe, and llama-quantize.exe (exact names may change upstream).
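After the configure step above, the build itself is a standard CMake invocation (shown here for manual reproduction):

```bat
rem Build the Release binaries into build\bin\
cmake --build build
```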
- To rebuild with the latest upstream changes, simply re-run llama_build_only.bat.
- If you need a clean configure, delete the build\ directory and re-run the script.
- If you want to force a fresh clone, delete the llama.cpp\ folder and re-run. (Manual commands for both cleanup steps are sketched below.)
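The equivalent manual commands, run from the folder containing the scripts, would be:

```bat
rem Force a clean CMake configure
rmdir /s /q build

rem Force a fresh clone of llama.cpp on the next run
rmdir /s /q llama.cpp
```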
- It does not download model files. You’ll need compatible GGUF models placed where your runtime expects them.
- It does not configure runtime servers or service wrappers beyond building the official tools.
- Currently Windows-only (Linux planned later).
- Winget IDs may change upstream.
- Windows 10 (21H2+) or Windows 11
- Administrator privileges (to install prerequisites)
- Internet access
- Disk space: Visual Studio + CUDA can require 10–20 GB+. The llama.cpp build itself is small by comparison.
Winget says “No package found matching input criteria”
- Update sources: winget source update
- Ensure App Installer is current (Microsoft Store) and winget --version works in a new terminal.
Ninja/CMake not found
- Re-run the main script (it installs them if missing).
- Or install manually:
  winget install Kitware.CMake
  winget install Ninja-build.Ninja
Visual Studio / C++ workload missing
- Re-run the main script; it uses vswhere + winget to install VS Community 2022 with the required workload.
CUDA / nvcc not found
- The main script calls install_cuda.bat automatically if nvcc is missing.
- If you installed CUDA during the session, open a new terminal (or reboot) so PATH changes take effect.
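A quick way to confirm the toolkit is visible in a fresh terminal:

```bat
rem Verify that the CUDA compiler is on PATH and report its version
where nvcc
nvcc --version
```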
Build fails after installing toolchains
- Close and reopen your terminal (PATH updates).
- Delete build\ and re-configure.
- Ensure GPU drivers are up to date (NVIDIA).
- Check the printed log file path for exact errors.
- These scripts make system-level changes (installing developer tools and CUDA). Review the code and run at your own discretion and risk.
- A reboot may be required for Visual Studio/CUDA. If prompted, accept the reboot and re-run (the main script prepares an automatic resume via RunOnce where possible).
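RunOnce is a standard Windows registry mechanism; an illustrative entry (the value name and script path below are made up for this example, not what the installer actually writes) would look like:

```bat
rem Illustrative only: schedule a one-shot resume of the build script after the next reboot
reg add "HKLM\SOFTWARE\Microsoft\Windows\CurrentVersion\RunOnce" ^
    /v ResumeLlamaBuild /t REG_SZ ^
    /d "C:\llama-scripts\llama_install_prerequisites_and_build.bat" /f
```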
PRs and issues are welcome! Please:
- Keep scripts idempotent and readable.
- Prefer winget IDs and quiet/non-interactive installs where reasonable.
- Test on Windows 10 and Windows 11.
- llama.cpp by the GGML organization and open-source contributors (originally started by Georgi Gerganov).
- Thanks to everyone building tools that make local AI accessible.
This project is licensed under GPLv3. The upstream llama.cpp project is licensed separately; refer to its repository for terms.
- This project is an independent community initiative.
- It is not affiliated with, endorsed by, or officially associated with the original llama.cpp project or its maintainers.
- All references to llama.cpp are purely descriptive and for compatibility purposes only.
- This repository aims to simplify usage for local developers, while respecting the original work and licensing.
Created and maintained with 💙 by py-sandy