llama.cpp Windows Builder (Community Scripts)

Automated, reproducible build scripts for Windows 10/11 that get you from a clean machine to a working llama.cpp toolchain with minimal friction.
These scripts can install prerequisites (Git, CMake, Visual Studio + C++ workload, CUDA) and then configure and compile llama.cpp with CMake (Ninja, Release).

License: GPLv3
Upstream project: ggml-org/llama.cpp
Clone URL: https://github.com/ggml-org/llama.cpp.git (you don't need to clone the repo yourself; the installer does this for you automatically)


✨ What’s in this repo?

  • llama_install_prerequisites_and_build.bat
    End-to-end “happy path” for a fresh Windows install.
    Installs prerequisites if missing (Winget, Git, CMake, Visual Studio Community 2022 + Desktop C++ workload, CUDA Toolkit), then clones and builds llama.cpp.

  • llama_build_only.bat
    Fast path when your machine is already set up.
    Clones the latest llama.cpp from ggml-org, configures CMake, and builds it.

  • install_cuda.bat (helper)
    Called automatically by the main script. You normally do not run this yourself.
    Detects CUDA via nvcc; if missing, installs a default CUDA version (currently 12.8). A simplified sketch of this check appears after this list.

  • Tested with RTX 3060, RTX 3090, RTX 5090 on Windows 10/11.
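
For reference, the CUDA check in install_cuda.bat boils down to the following minimal sketch (illustrative, not the script verbatim; Nvidia.CUDA is the published winget ID, but verify it against your winget source):

rem Sketch: skip installation when nvcc is already on PATH.
where nvcc >nul 2>&1
if %errorlevel%==0 (
  echo [CUDA] nvcc found, skipping installation.
) else (
  echo [CUDA] nvcc not found, installing CUDA Toolkit 12.8 ...
  winget install --id Nvidia.CUDA -e --accept-source-agreements --accept-package-agreements
)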


🚀 Quick Start

  1. Run (as Administrator)

    llama_install_prerequisites_and_build.bat
    • The script checks/installs prerequisites with winget.
    • If Visual Studio or CUDA is missing, the script installs it (you may be prompted, and a reboot may be required).
    • After prerequisites are in place, it configures CMake (Ninja, Release) and builds llama.cpp.
    • Build flags (CUDA architectures) are adapted automatically to the detected GPU(s); see the detection sketch at the end of this section.
  2. Already set up? If you know that Git, CMake, the Visual Studio C++ workload, and (optionally) CUDA are installed:

    llama_build_only.bat

Note: At the end of a run, the script prints the path to a detailed log file (look for a line such as [LOG] Details: "...").
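
The GPU detection from step 1 can be approximated as follows (a minimal sketch, assuming a recent NVIDIA driver whose nvidia-smi supports the compute_cap query; the real script's logic may differ). An RTX 3060 reports 8.6, which becomes the CMake architecture value 86:

setlocal enabledelayedexpansion
set "ARCHS="
rem Query each GPU's compute capability (e.g. "8.6") and strip the dot ("86").
for /f "usebackq delims=" %%c in (`nvidia-smi --query-gpu=compute_cap --format=csv,noheader`) do (
  set "CAP=%%c"
  set "CAP=!CAP:.=!"
  if defined ARCHS (set "ARCHS=!ARCHS!;!CAP!") else (set "ARCHS=!CAP!")
)
echo Using -DCMAKE_CUDA_ARCHITECTURES="!ARCHS!"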


🧰 What gets installed (when needed)

  • Winget (via App Installer on Windows 10/11)
  • Git
  • CMake
  • Visual Studio Community 2022 + Desktop development with C++ workload
  • CUDA Toolkit (default 12.8; only installed if nvcc is not found)

The main script attempts to install only what’s missing. On Windows 10, ensure App Installer is up to date (for winget).
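
The underlying pattern is "check first, install only on a miss", roughly (illustrative, not the script verbatim; Git.Git and Kitware.CMake are the published winget IDs):

where git >nul 2>&1   || winget install --id Git.Git -e
where cmake >nul 2>&1 || winget install --id Kitware.CMake -e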


🏗️ Build Configuration

The scripts configure and build llama.cpp using CMake + Ninja in Release mode. Typical flags include enabling the CUDA backend. A representative configuration (subject to upstream changes and your GPU) looks like:

cmake -S . -B build -G Ninja ^
  -DGGML_CUDA=ON ^
  -DGGML_CUDA_GRAPHS=ON ^
  -DGGML_CUDA_FA_ALL_QUANTS=ON ^
  -DLLAMA_CURL=ON ^
  -DCMAKE_CUDA_ARCHITECTURES="86;120" ^
  -DCMAKE_BUILD_TYPE=Release
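
Configuration is followed by the actual compile step; with Ninja the build runs in parallel by default, and Release is already fixed by CMAKE_BUILD_TYPE:

cmake --build build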

Output binaries (depending on the current llama.cpp layout) typically land in:

build\bin\

Expect tools like llama-cli.exe, llama-server.exe, llama-quantize.exe, etc. (exact names may change upstream).
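
A quick smoke test once you have a GGUF model (the model path below is a placeholder; see the note on models further down):

build\bin\llama-cli.exe -m C:\models\your-model.gguf -p "Hello" -n 64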


🔁 Updating / Rebuilding

  • To rebuild with the latest upstream changes, simply re-run llama_build_only.bat.
  • If you need a clean configure: delete the build\ directory and re-run the script.
  • If you want to force a fresh clone: delete the llama.cpp\ folder and re-run (see the commands below).
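
For example, from the folder containing the scripts (paths assume the default layout the scripts create):

rem Force a clean configure:
rmdir /s /q build
rem Force a fresh clone on the next run:
rmdir /s /q llama.cpp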

📦 What this repo does not do

  • It does not download model files. You’ll need compatible GGUF models placed where your runtime expects them.
  • It does not configure runtime servers or service wrappers beyond building the official tools.

Known limitations

  • Currently Windows-only (Linux support is planned).
  • Winget IDs may change upstream.

✅ Requirements

  • Windows 10 (21H2+) or Windows 11
  • Administrator privileges (to install prerequisites)
  • Internet access
  • Disk space: Visual Studio + CUDA can require 10–20 GB+. The llama.cpp build itself is small by comparison.

🧯 Troubleshooting

Winget says “No package found matching input criteria”

  • Update sources: winget source update
  • Ensure App Installer is current (Microsoft Store) and winget --version works in a new terminal.

Ninja/CMake not found

  • Re-run the main script (it installs them if missing).

  • Or install manually:

    winget install Kitware.CMake
    winget install Ninja-build.Ninja

Visual Studio / C++ workload missing

  • Re-run the main script; it uses vswhere + winget to install VS Community 2022 with the required workload.
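
You can verify the workload yourself with vswhere, which ships with the Visual Studio Installer at a fixed location; a successful check prints an installation path:

"%ProgramFiles(x86)%\Microsoft Visual Studio\Installer\vswhere.exe" ^
  -latest -requires Microsoft.VisualStudio.Component.VC.Tools.x86.x64 ^
  -property installationPath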

CUDA / nvcc not found

  • The main script calls install_cuda.bat automatically if nvcc is missing.
  • If you installed CUDA during the session, open a new terminal (or reboot) so PATH changes take effect.
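
Alternatively, extend PATH for the current session only (assuming the default CUDA 12.8 install location):

set "PATH=%ProgramFiles%\NVIDIA GPU Computing Toolkit\CUDA\v12.8\bin;%PATH%"
nvcc --version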

Build fails after installing toolchains

  • Close and reopen your terminal (PATH updates).
  • Delete build\ and re-configure.
  • Ensure GPU drivers are up to date (NVIDIA).
  • Check the printed log file path for exact errors.

🔐 Safety & Notes

  • These scripts make system-level changes (installing developer tools and CUDA). Review the code and run at your own discretion and risk.
  • A reboot may be required for Visual Studio/CUDA. If prompted, accept the reboot and re-run (the main script prepares an automatic resume via RunOnce where possible).
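
The resume mechanism is the standard RunOnce registry key; in essence something like the following (a simplified sketch; the value name ResumeLlamaBuild is illustrative):

rem Re-run this script once after the next logon (%~f0 is the script's full path).
reg add "HKCU\Software\Microsoft\Windows\CurrentVersion\RunOnce" ^
  /v ResumeLlamaBuild /t REG_SZ /d "\"%~f0\"" /f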

🤝 Contributing

PRs and issues are welcome! Please:

  • Keep scripts idempotent and readable.
  • Prefer winget IDs and quiet/non-interactive installs where reasonable.
  • Test on Windows 10 and Windows 11.

🙏 Acknowledgements

  • llama.cpp by the GGML organization and open-source contributors (originally started by Georgi Gerganov).
  • Thanks to everyone building tools that make local AI accessible.

📜 License

This project is licensed under GPLv3. The upstream llama.cpp project is licensed separately; refer to its repository for terms.


📄 Legal / Disclaimer

  • This project is an independent community initiative.

  • It is not affiliated with, endorsed by, or officially associated with the original llama.cpp project or its maintainers.

  • All references to llama.cpp are purely descriptive and for compatibility purposes only.

  • This repository aims to simplify usage for local developers, while respecting the original work and licensing.

Created and maintained with 💙 by py-sandy
